
Deployment#

Handle#

[source]

get_model_serving#

Connection.get_model_serving()

Get a reference to model serving to perform operations on. Model serving operates on top of a model registry, defaulting to the project's default model registry.

Returns

ModelServing. A model serving handle object to perform operations on.
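For example, a minimal sketch of obtaining the handle, assuming an existing Hopsworks cluster; the host, project name, and API key below are placeholders:

import hsml

# Connect to the Hopsworks cluster (placeholder credentials).
connection = hsml.connection(
    host="my.hopsworks.cluster",  # hypothetical host
    project="my_project",         # hypothetical project name
    api_key_value="<API_KEY>",
)

# Handle for model serving on the project's default model registry.
ms = connection.get_model_serving()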


Creation#

[source]

create_deployment#

ModelServing.create_deployment(predictor, name=None)

Create a Deployment metadata object.

Lazy

This method is lazy and does not persist any metadata or deploy any model. To create a deployment, call the save() method.

Arguments

  • predictor hsml.predictor.Predictor: Predictor to be used in the deployment.
  • name Optional[str]: Name of the deployment.

Returns

Deployment. The deployment metadata object.
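A minimal sketch, assuming ms is the model serving handle from the example above and predictor is a previously created predictor object:

# Build the deployment metadata object; nothing is persisted yet.
deployment = ms.create_deployment(predictor, name="mydeployment")

# Persist the deployment (and its predictor) to Model Serving.
deployment.save()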


[source]

deploy#

Model.deploy(
    name=None,
    artifact_version="CREATE",
    model_server=None,
    serving_tool=None,
    script_file=None,
    resources=None,
    inference_logger=None,
    inference_batcher=None,
    transformer=None,
)

Deploy the model.

Arguments

  • name Optional[str]: Name of the deployment.
  • artifact_version Optional[str]: Version number of the model artifact to deploy, CREATE to create a new model artifact or MODEL-ONLY to reuse the shared artifact containing only the model files.
  • model_server Optional[str]: Model server run by the predictor.
  • serving_tool Optional[str]: Serving tool used to deploy the model server.
  • script_file Optional[str]: Path to a custom predictor script implementing the Predict class.
  • resources Optional[Union[hsml.resources.PredictorResources, dict]]: Resources to be allocated for the predictor.
  • inference_logger Optional[Union[hsml.inference_logger.InferenceLogger, dict]]: Inference logger configuration.
  • inference_batcher Optional[Union[hsml.inference_batcher.InferenceBatcher, dict]]: Inference batcher configuration.
  • transformer Optional[Union[hsml.transformer.Transformer, dict]]: Transformer to be deployed together with the predictor.

Returns

Deployment. The deployment metadata object.
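For example, a sketch that fetches a registered model and deploys it, assuming the connection handle from the first example; the model name, version, and resources dict are illustrative placeholders, and the dict shape is assumed to mirror hsml.resources.PredictorResources:

# Get the model registry handle and a registered model (hypothetical name).
mr = connection.get_model_registry()
model = mr.get_model("fraud_model", version=1)

# Deploy the model; the resources dict below is an assumed example shape.
deployment = model.deploy(
    name="frauddeployment",
    resources={"num_instances": 1, "requests": {"cores": 0.5, "memory": 512}},
)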


[source]

deploy#

Predictor.deploy()

Create a deployment for this predictor and persist it in Model Serving.

Returns

Deployment. The deployment metadata object.
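A sketch of the predictor-centric flow; create_predictor is assumed to exist on the model serving handle and is not documented in this section:

# Create a predictor for a registered model (assumed factory method).
predictor = ms.create_predictor(model)

# Create the deployment and persist it to Model Serving in one call.
deployment = predictor.deploy()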


Retrieval#

[source]

get_deployment#

ModelServing.get_deployment(name)

Get a deployment by name from Model Serving. Getting a deployment from Model Serving means getting its metadata handle so you can subsequently operate on it (e.g., start or stop).

Arguments

  • name str: Name of the deployment to get.

Returns

Deployment: The deployment metadata object.

Raises

  • RestAPIError: If unable to retrieve deployment from model serving.
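For example (the deployment name is a placeholder):

# Look up an existing deployment by name.
deployment = ms.get_deployment("frauddeployment")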

[source]

get_deployment_by_id#

ModelServing.get_deployment_by_id(id)

Get a deployment by id from Model Serving. Getting a deployment from Model Serving means getting its metadata handle so you can subsequently operate on it (e.g., start or stop).

Arguments

  • id int: Id of the deployment to get.

Returns

Deployment: The deployment metadata object.

Raises

  • RestAPIError: If unable to retrieve deployment from model serving.
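For example (the id is a placeholder):

# Look up an existing deployment by its numeric id.
deployment = ms.get_deployment_by_id(1)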

[source]

get_deployments#

ModelServing.get_deployments(model=None, status=None)

Get all deployments from model serving.

Arguments

  • model Optional[hsml.model.Model]: Filter by the model served in the deployments.
  • status Optional[str]: Filter by the status of the deployments.

Returns

List[Deployment]: A list of deployments.

Raises

  • RestAPIError: If unable to retrieve deployments from model serving.
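A sketch that filters by model and status, assuming model is a registered model object from the registry; the status string "Running" is an assumption about the state names used by Model Serving:

# All deployments serving a given model that are currently running.
for deployment in ms.get_deployments(model=model, status="Running"):
    print(deployment.name, deployment.id)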

Properties#

[source]

artifact_path#

Path of the model artifact deployed by the predictor.


[source]

artifact_version#

Artifact version deployed by the predictor.


[source]

created_at#

Creation date of the predictor.


[source]

creator#

Creator of the predictor.


[source]

id#

Id of the deployment.


[source]

inference_batcher#

Configuration of the inference batcher attached to this predictor.


[source]

inference_logger#

Configuration of the inference logger attached to this predictor.


[source]

model_name#

Name of the model deployed by the predictor.


[source]

model_path#

Model path deployed by the predictor.


[source]

model_server#

Model server run by the predictor.


[source]

model_version#

Model version deployed by the predictor.


[source]

name#

Name of the deployment.


[source]

predictor#

Predictor used in the deployment.


[source]

requested_instances#

Total number of requested instances in the deployment.


[source]

resources#

Resource configuration for the predictor.


[source]

script_file#

Script file used by the predictor.


[source]

serving_tool#

Serving tool used to run the model server.


[source]

transformer#

Transformer configured in the predictor.


Methods#

[source]

delete#

Deployment.delete(force=False)

Delete the deployment.

Arguments

  • force bool: Force the deletion of the deployment. If the deployment is running, it will be stopped and deleted automatically.

Warning

A call to this method does not ask for a second confirmation.
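For example:

# Stop the deployment if it is running, then delete it; no confirmation is asked.
deployment.delete(force=True)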

[source]

describe#

Deployment.describe()

Print a description of the deployment.


[source]

download_artifact#

Deployment.download_artifact()

Download the model artifact served by the deployment.
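For example; the return value is assumed to be the local path of the downloaded artifact (it is not documented in this section):

# Download the served model artifact to the local filesystem.
local_path = deployment.download_artifact()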


[source]

get_logs#

Deployment.get_logs(component="predictor", tail=10)

Print the deployment logs of the predictor or transformer.

Arguments

  • component: Deployment component to get the logs from (e.g., predictor or transformer).
  • tail: Number of most recent lines to retrieve from the logs.
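For example:

# Print the 50 most recent log lines of the transformer component.
deployment.get_logs(component="transformer", tail=50)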

[source]

get_state#

Deployment.get_state()

Get the current state of the deployment.

Returns

PredictorState. The state of the deployment.
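For example; the status attribute on PredictorState is an assumption, as it is not documented in this section:

# Inspect the current state of the deployment.
state = deployment.get_state()
print(state.status)  # e.g., "Running" or "Stopped" (assumed attribute)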


[source]

get_url#

Deployment.get_url()

Get the URL of the deployment in Hopsworks.
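For example, assuming the method returns the URL as a string:

# Link to the deployment page in the Hopsworks UI.
print(deployment.get_url())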


[source]

predict#

Deployment.predict(data)

Send inference requests to the deployment.

Arguments

  • data dict: Payload of the inference request.

Returns

dict. Inference response.
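A sketch of an inference request; the "instances" payload shape is a common model-server convention assumed here, and may differ for your model:

# Send a single feature vector for prediction (illustrative values).
data = {"instances": [[1.0, 2.0, 3.0, 4.0]]}
response = deployment.predict(data)
print(response)  # typically {"predictions": [...]}, depending on the model server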


[source]

save#

Deployment.save(await_update=60)

Persist this deployment including the predictor and metadata to Model Serving.

Arguments

  • await_update Optional[int]: If the deployment is running, waiting time (seconds) for the running instances to be updated. If the running instances are not updated within this timespan, the call to this method returns while the update continues in the background.
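For example:

# Persist configuration changes, waiting up to 120 seconds for the
# running instances to pick them up.
deployment.save(await_update=120)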

[source]

start#

Deployment.start(await_running=60)

Start the deployment.

Arguments

  • await_running Optional[int]: Waiting time (seconds) for the deployment to start. If the deployment has not started within this timespan, the call to this method returns while it deploys in the background.
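For example:

# Start the deployment and wait up to 120 seconds for it to come up.
deployment.start(await_running=120)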

[source]

stop#

Deployment.stop(await_stopped=60)

Stop the deployment.

Arguments

  • await_stopped Optional[int]: Waiting time (seconds) for the deployment to stop. If the deployment has not stopped within this timespan, the call to this method returns while it stops in the background.
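For example:

# Stop the deployment and wait up to 120 seconds for it to shut down.
deployment.stop(await_stopped=120)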

[source]

to_dict#

Deployment.to_dict()

Return the deployment metadata as a dict.