Deployment#
Handle#
get_model_serving#
Connection.get_model_serving()
Get a reference to model serving to perform operations on. Model serving operates on top of a model registry, defaulting to the project's default model registry.
Example
import hopsworks
project = hopsworks.login()
ms = project.get_model_serving()
Returns
ModelServing: A model serving handle object to perform operations on.
Creation#
create_deployment#
ModelServing.create_deployment(predictor, name=None)
Create a Deployment metadata object.
Example
# login into Hopsworks using hopsworks.login()
# get Hopsworks Model Registry handle
mr = project.get_model_registry()
# retrieve the trained model you want to deploy
my_model = mr.get_model("my_model", version=1)
# get Hopsworks Model Serving handle
ms = project.get_model_serving()
my_predictor = ms.create_predictor(my_model)
my_deployment = ms.create_deployment(my_predictor)
my_deployment.save()
Using the model object
# login into Hopsworks using hopsworks.login()
# get Hopsworks Model Registry handle
mr = project.get_model_registry()
# retrieve the trained model you want to deploy
my_model = mr.get_model("my_model", version=1)
my_deployment = my_model.deploy()
my_deployment.get_state().describe()
Using the Model Serving handle
# login into Hopsworks using hopsworks.login()
# get Hopsworks Model Registry handle
mr = project.get_model_registry()
# retrieve the trained model you want to deploy
my_model = mr.get_model("my_model", version=1)
# get Hopsworks Model Serving handle
ms = project.get_model_serving()
my_predictor = ms.create_predictor(my_model)
my_deployment = my_predictor.deploy()
my_deployment.get_state().describe()
Lazy
This method is lazy and does not persist any metadata or deploy any model. To create a deployment, call the save() method.
Arguments
- predictor (hsml.predictor.Predictor): Predictor to be used in the deployment.
- name (Optional[str]): Name of the deployment.
Returns
Deployment: The deployment metadata object.
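Because create_deployment is lazy, the create and save steps are easy to wrap together. The following is a minimal sketch, not part of the library: the helper name create_and_save_deployment is hypothetical, and it assumes ms is a Model Serving handle and model is a model retrieved from the Model Registry, as in the examples above.

```python
def create_and_save_deployment(ms, model, name=None):
    """Build a deployment lazily, then persist it with save().

    Hypothetical helper: `ms` is assumed to be a Model Serving handle and
    `model` a model object retrieved from the Model Registry.
    """
    predictor = ms.create_predictor(model)
    deployment = ms.create_deployment(predictor, name=name)
    # Nothing is persisted until save() is called on the metadata object.
    deployment.save()
    return deployment
```

With a live project, this would be called as create_and_save_deployment(ms, my_model, name="my_deployment").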
deploy#
Model.deploy(
name=None,
description=None,
artifact_version="CREATE",
serving_tool=None,
script_file=None,
resources=None,
inference_logger=None,
inference_batcher=None,
transformer=None,
)
Deploy the model.
Example
import hopsworks
project = hopsworks.login()
# get Hopsworks Model Registry handle
mr = project.get_model_registry()
# retrieve the trained model you want to deploy
my_model = mr.get_model("my_model", version=1)
my_deployment = my_model.deploy()
Arguments
- name (Optional[str]): Name of the deployment.
- description (Optional[str]): Description of the deployment.
- artifact_version (Optional[str]): Version number of the model artifact to deploy, CREATE to create a new model artifact, or MODEL-ONLY to reuse the shared artifact containing only the model files.
- serving_tool (Optional[str]): Serving tool used to deploy the model server.
- script_file (Optional[str]): Path to a custom predictor script implementing the Predict class.
- resources (Optional[Union[hsml.resources.PredictorResources, dict]]): Resources to be allocated for the predictor.
- inference_logger (Optional[Union[hsml.inference_logger.InferenceLogger, dict]]): Inference logger configuration.
- inference_batcher (Optional[Union[hsml.inference_batcher.InferenceBatcher, dict]]): Inference batcher configuration.
- transformer (Optional[Union[hsml.transformer.Transformer, dict]]): Transformer to be deployed together with the predictor.
Returns
Deployment: The deployment metadata object of a new or existing deployment.
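The keyword arguments can be combined in a small wrapper. The sketch below is illustrative only: the helper name deploy_model_only is hypothetical, and it assumes model is a model object retrieved from the Model Registry. It passes artifact_version="MODEL-ONLY", the value described above for reusing the shared artifact that contains only the model files.

```python
def deploy_model_only(model, name, description=None):
    """Deploy a model while reusing the shared model-only artifact.

    Hypothetical helper: `model` is assumed to expose deploy() with the
    keyword arguments documented for Model.deploy.
    """
    return model.deploy(
        name=name,
        description=description,
        # "MODEL-ONLY" reuses the shared artifact instead of creating a new one.
        artifact_version="MODEL-ONLY",
    )
```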
deploy#
Predictor.deploy()
Create a deployment for this predictor and persist it in Model Serving.
Example
import hopsworks
project = hopsworks.login()
# get Hopsworks Model Registry handle
mr = project.get_model_registry()
# retrieve the trained model you want to deploy
my_model = mr.get_model("my_model", version=1)
# get Hopsworks Model Serving handle
ms = project.get_model_serving()
my_predictor = ms.create_predictor(my_model)
my_deployment = my_predictor.deploy()
print(my_deployment.get_state())
Returns
Deployment: The deployment metadata object of a new or existing deployment.
Retrieval#
get_deployment#
ModelServing.get_deployment(name)
Get a deployment by name from Model Serving.
Example
# login and get Hopsworks Model Serving handle using .login() and .get_model_serving()
# get a deployment by name
my_deployment = ms.get_deployment('deployment_name')
Getting a deployment from Model Serving means getting its metadata handle so you can subsequently operate on it (e.g., start or stop).
Arguments
- name (str): Name of the deployment to get.
Returns
Deployment: The deployment metadata object.
Raises
RestAPIError: If unable to retrieve deployment from model serving.
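Since a failed lookup raises an error, callers that only want to check for a deployment may prefer a fallback value. This is a rough sketch under stated assumptions: the helper get_deployment_or_none and its errors parameter are hypothetical, and the default catches any Exception only so that the snippet stays self-contained. In practice you would likely pass the RestAPIError class from your installed client library.

```python
def get_deployment_or_none(ms, name, errors=(Exception,)):
    """Return the deployment handle, or None if the lookup raises.

    Hypothetical helper: `errors` is the tuple of exception types to treat
    as "could not retrieve", for example (RestAPIError,).
    """
    try:
        return ms.get_deployment(name)
    except errors:
        return None
```

Swallowing every error hides real failures such as network problems, so narrowing errors to the specific exception type is the safer choice.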
get_deployment_by_id#
ModelServing.get_deployment_by_id(id)
Get a deployment by id from Model Serving. Getting a deployment from Model Serving means getting its metadata handle so you can subsequently operate on it (e.g., start or stop).
Example
# login and get Hopsworks Model Serving handle using .login() and .get_model_serving()
# get a deployment by id
my_deployment = ms.get_deployment_by_id(1)
Arguments
- id (int): Id of the deployment to get.
Returns
Deployment: The deployment metadata object.
Raises
RestAPIError: If unable to retrieve deployment from model serving.
get_deployments#
ModelServing.get_deployments(model=None, status=None)
Get all deployments from model serving.
Example
# login into Hopsworks using hopsworks.login()
# get Hopsworks Model Registry handle
mr = project.get_model_registry()
# get Hopsworks Model Serving handle
ms = project.get_model_serving()
# retrieve the trained model you want to deploy
my_model = mr.get_model("my_model", version=1)
list_deployments = ms.get_deployments(my_model)
for deployment in list_deployments:
    print(deployment.get_state())
Arguments
- model (Optional[hsml.model.Model]): Filter by model served in the deployments.
- status (Optional[str]): Filter by status of the deployments.
Returns
List[Deployment]: A list of deployments.
Raises
RestAPIError: If unable to retrieve deployments from model serving.
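The returned list can be filtered further on the client side. As a small sketch (the helper name running_deployment_names is hypothetical, and ms is assumed to be a Model Serving handle), the following collects the names of deployments that report themselves as running:

```python
def running_deployment_names(ms, model=None):
    """Return the sorted names of deployments that are currently running.

    Hypothetical helper: `ms` is assumed to be a Model Serving handle;
    `model` optionally restricts the listing to one model.
    """
    deployments = ms.get_deployments(model=model)
    # is_running() uses its defaults, so idle and updating count as running.
    return sorted(d.name for d in deployments if d.is_running())
```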
Properties#
artifact_path#
Path of the model artifact deployed by the predictor.
artifact_version#
Artifact version deployed by the predictor.
created_at#
Creation date of the predictor.
creator#
Creator of the predictor.
description#
Description of the deployment.
id#
Id of the deployment.
inference_batcher#
Configuration of the inference batcher attached to this predictor.
inference_logger#
Configuration of the inference logger attached to this predictor.
model_name#
Name of the model deployed by the predictor.
model_path#
Model path deployed by the predictor.
model_server#
Model server run by the predictor.
model_version#
Model version deployed by the predictor.
name#
Name of the deployment.
predictor#
Predictor used in the deployment.
requested_instances#
Total number of requested instances in the deployment.
resources#
Resource configuration for the predictor.
script_file#
Script file used by the predictor.
serving_tool#
Serving tool used to run the model server.
transformer#
Transformer configured in the predictor.
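These properties are plain attributes on the deployment object, so they can be gathered into a dictionary for logging or display. A minimal sketch, assuming deployment is a Deployment handle; the helper name summarize_deployment and the choice of fields are illustrative only:

```python
def summarize_deployment(deployment):
    """Collect a few deployment properties into a plain dictionary.

    Hypothetical helper: the field names are taken from the property list
    above; `deployment` is assumed to be a Deployment handle.
    """
    fields = (
        "id",
        "name",
        "model_name",
        "model_version",
        "serving_tool",
        "requested_instances",
    )
    return {field: getattr(deployment, field) for field in fields}
```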
Methods#
delete#
Deployment.delete(force=False)
Delete the deployment.
Arguments
- force: Force the deletion of the deployment. If the deployment is running, it will be stopped and deleted automatically.
Warning
A call to this method does not ask for a second confirmation.
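One way to avoid relying on force is to stop the deployment explicitly before deleting it. The sketch below is an assumption about a reasonable calling pattern, not documented library behavior; the helper name remove_deployment is hypothetical.

```python
def remove_deployment(deployment, force=False):
    """Delete a deployment, stopping it first unless force is requested.

    Hypothetical helper: `deployment` is assumed to be a Deployment handle.
    There is no confirmation prompt, so call this deliberately.
    """
    if not force and not deployment.is_stopped():
        # Stop explicitly so that delete() is called on a stopped deployment.
        deployment.stop()
    deployment.delete(force=force)
```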
describe#
Deployment.describe()
Print a description of the deployment.
download_artifact#
Deployment.download_artifact()
Download the model artifact served by the deployment.
get_logs#
Deployment.get_logs(component="predictor", tail=10)
Prints the deployment logs of the predictor or transformer.
Arguments
- component: Deployment component to get the logs from (e.g., predictor or transformer)
- tail: Number of most recent lines to retrieve from the logs.
get_state#
Deployment.get_state()
Get the current state of the deployment.
Returns
PredictorState: The state of the deployment.
get_url#
Deployment.get_url()
Get the URL of the deployment in Hopsworks.
is_running#
Deployment.is_running(or_idle=True, or_updating=True)
Check whether the deployment is ready to handle inference requests.
Arguments
- or_idle: Whether the idle state is considered as running (default is True).
- or_updating: Whether the updating state is considered as running (default is True).
Returns
bool: Whether the deployment is ready or not.
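When a stricter readiness check is needed, both flags can be set to False and the call repeated until it succeeds. The following is a sketch under assumptions: the helper wait_until_ready, its timeout, and its polling interval are hypothetical and not part of the library.

```python
import time


def wait_until_ready(deployment, timeout=120.0, poll_interval=5.0):
    """Poll is_running() until it returns True or the timeout elapses.

    Hypothetical helper: idle and updating states are NOT counted as
    running here. Returns True if ready, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        if deployment.is_running(or_idle=False, or_updating=False):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)
```

The check always runs at least once, so a timeout of 0 performs a single non-blocking readiness test.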
is_stopped#
Deployment.is_stopped(or_created=True)
Check whether the deployment is stopped.
Arguments
- or_created: Whether the created state is considered as stopped (default is True).
Returns
bool: Whether the deployment is stopped or not.
predict#
Deployment.predict(data=None, inputs=None)
Send inference requests to the deployment. Either data or inputs must be set. If both are set, inputs is ignored.
Example
# login into Hopsworks using hopsworks.login()
# get Hopsworks Model Serving handle
ms = project.get_model_serving()
# retrieve deployment by name
my_deployment = ms.get_deployment("my_deployment")
# (optional) retrieve model input example
my_model = project.get_model_registry().get_model(my_deployment.model_name, my_deployment.model_version)
# make predictions using model inputs (single or batch)
predictions = my_deployment.predict(inputs=my_model.input_example)
# or using more sophisticated inference request payloads
data = { "instances": [ my_model.input_example ], "key2": "value2" }
predictions = my_deployment.predict(data)
Arguments
- data (Optional[dict]): Payload dictionary for the inference request including the model input(s).
- inputs (Optional[list]): Model inputs used in the inference requests.
Returns
dict: Inference response.
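The data form of the request can be assembled programmatically. Here is a minimal sketch, assuming the payload layout from the example above (an "instances" list plus optional extra keys); the helper name predict_instances and its extra parameter are hypothetical.

```python
def predict_instances(deployment, instances, extra=None):
    """Send a batch of model inputs using the `data` payload form.

    Hypothetical helper: the payload follows the {"instances": [...]}
    layout shown in the example; `extra` adds further top-level keys.
    """
    payload = {"instances": list(instances)}
    if extra:
        payload.update(extra)
    # Only `data` is passed, so there is no ambiguity with `inputs`.
    return deployment.predict(data=payload)
```

The structure of the returned dictionary depends on the model server, so this sketch returns it unchanged.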
save#
Deployment.save(await_update=60)
Persist this deployment including the predictor and metadata to Model Serving.
Arguments
- await_update (Optional[int]): If the deployment is running, awaiting time (seconds) for the running instances to be updated. If the running instances are not updated within this timespan, the call to this method returns while the update continues in the background.
start#
Deployment.start(await_running=60)
Start the deployment.
Arguments
- await_running (Optional[int]): Awaiting time (seconds) for the deployment to start. If the deployment has not started within this timespan, the call to this method returns while it deploys in the background.
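Because start() can return before the deployment is up, it is worth checking the state afterwards. A short sketch, assuming deployment is a Deployment handle; the helper name start_and_check is hypothetical.

```python
def start_and_check(deployment, await_running=120):
    """Start a deployment and report whether it is running afterwards.

    Hypothetical helper: returns False when the start is still in
    progress in the background after `await_running` seconds.
    """
    deployment.start(await_running=await_running)
    return deployment.is_running()
```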
stop#
Deployment.stop(await_stopped=60)
Stop the deployment.
Arguments
- await_stopped (Optional[int]): Awaiting time (seconds) for the deployment to stop. If the deployment has not stopped within this timespan, the call to this method returns while it stops in the background.
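Stopping and starting can be chained into a restart. This is only a sketch of one possible pattern, not a library feature: the helper name restart_deployment is hypothetical, and it gives up if the deployment has not stopped within the waiting time.

```python
def restart_deployment(deployment, await_stopped=60, await_running=60):
    """Stop and then start a deployment; return True if it ends up running.

    Hypothetical helper: if the deployment is still stopping after
    `await_stopped` seconds, no start is attempted and False is returned.
    """
    deployment.stop(await_stopped=await_stopped)
    if not deployment.is_stopped():
        return False
    deployment.start(await_running=await_running)
    return deployment.is_running()
```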
to_dict#
Deployment.to_dict()