Model Serving

Retrieval

get_model_serving

Connection.get_model_serving()

Get a reference to model serving to perform operations on. Model serving operates on top of a model registry, defaulting to the project's default model registry.

Example

import hopsworks

project = hopsworks.login()

ms = project.get_model_serving()

Returns

ModelServing. A model serving handle object to perform operations on.


Properties

project_id

Id of the project in which Model Serving is located.


project_name

Name of the project in which Model Serving is located.


project_path

Path of the project in which Model Serving is located.


Methods

create_deployment

ModelServing.create_deployment(predictor, name=None, environment=None)

Create a Deployment metadata object.

Example

import hopsworks

# log in to Hopsworks
project = hopsworks.login()

# get Hopsworks Model Registry handle
mr = project.get_model_registry()

# retrieve the trained model you want to deploy
my_model = mr.get_model("my_model", version=1)

# get Hopsworks Model Serving handle
ms = project.get_model_serving()

my_predictor = ms.create_predictor(my_model)

my_deployment = ms.create_deployment(my_predictor)
my_deployment.save()

Using the model object

import hopsworks

# log in to Hopsworks
project = hopsworks.login()

# get Hopsworks Model Registry handle
mr = project.get_model_registry()

# retrieve the trained model you want to deploy
my_model = mr.get_model("my_model", version=1)

my_deployment = my_model.deploy()

my_deployment.get_state().describe()

Using the Model Serving handle

import hopsworks

# log in to Hopsworks
project = hopsworks.login()

# get Hopsworks Model Registry handle
mr = project.get_model_registry()

# retrieve the trained model you want to deploy
my_model = mr.get_model("my_model", version=1)

# get Hopsworks Model Serving handle
ms = project.get_model_serving()

my_predictor = ms.create_predictor(my_model)

my_deployment = my_predictor.deploy()

my_deployment.get_state().describe()

Lazy

This method is lazy and does not persist any metadata or deploy any model. To create a deployment, call the save() method.
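The lazy behaviour described above can be sketched as a small helper. This is a minimal sketch, not part of the library: the helper name `create_and_save_deployment` is hypothetical, `ms` is assumed to be a Model Serving handle, and `model` is assumed to be a model retrieved from the Model Registry.

```python
def create_and_save_deployment(ms, model, name=None):
    """Build a deployment for `model` and persist it.

    Hypothetical helper: `ms` is assumed to be a ModelServing handle and
    `model` a model object retrieved from the Model Registry.
    """
    # Both calls below only build metadata objects; nothing is persisted yet.
    predictor = ms.create_predictor(model)
    deployment = ms.create_deployment(predictor, name=name)

    # save() is the call that actually persists the deployment.
    deployment.save()
    return deployment
```

Until `save()` is called, the object returned by `create_deployment` exists only on the client side.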

Arguments

  • predictor hsml.predictor.Predictor: Predictor to be used in the deployment.
  • name str | None: Name of the deployment.
  • environment str | None: Inference environment to use.

Returns

Deployment. The deployment metadata object.


create_predictor

ModelServing.create_predictor(
    model,
    name=None,
    artifact_version="CREATE",
    serving_tool=None,
    script_file=None,
    config_file=None,
    resources=None,
    inference_logger=None,
    inference_batcher=None,
    transformer=None,
    api_protocol="REST",
)

Create a Predictor metadata object.

Example

import hopsworks

# log in to Hopsworks
project = hopsworks.login()

# get Hopsworks Model Registry handle
mr = project.get_model_registry()

# retrieve the trained model you want to deploy
my_model = mr.get_model("my_model", version=1)

# get Hopsworks Model Serving handle
ms = project.get_model_serving()

my_predictor = ms.create_predictor(my_model)

my_deployment = my_predictor.deploy()

Lazy

This method is lazy and does not persist any metadata or deploy any model on its own. To create a deployment using this predictor, call the deploy() method.

Arguments

  • model hsml.model.Model: Model to be deployed.
  • name str | None: Name of the predictor.
  • artifact_version str | None: Version number of the model artifact to deploy, CREATE to create a new model artifact or MODEL-ONLY to reuse the shared artifact containing only the model files.
  • serving_tool str | None: Serving tool used to deploy the model server.
  • script_file str | None: Path to a custom predictor script implementing the Predict class.
  • config_file str | None: Server configuration file to be passed to the model deployment.
  • resources hsml.resources.PredictorResources | dict | None: Resources to be allocated for the predictor.
  • inference_logger hsml.inference_logger.InferenceLogger | dict | str | None: Inference logger configuration.
  • inference_batcher hsml.inference_batcher.InferenceBatcher | dict | None: Inference batcher configuration.
  • transformer hsml.transformer.Transformer | dict | None: Transformer to be deployed together with the predictor.
  • api_protocol str | None: API protocol to be enabled in the deployment (i.e., 'REST' or 'GRPC'). Defaults to 'REST'.
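As an illustration of the optional arguments listed above, the following sketch overrides the default API protocol. The helper name `create_grpc_predictor` is hypothetical; `ms` is assumed to be a Model Serving handle and `model` a model retrieved from the Model Registry.

```python
def create_grpc_predictor(ms, model, name=None, script_file=None):
    """Create a predictor that serves `model` over gRPC instead of REST.

    Hypothetical helper wrapping ModelServing.create_predictor; only
    keyword arguments documented for that method are passed through.
    """
    return ms.create_predictor(
        model,
        name=name,
        script_file=script_file,  # optional path to a custom predictor script
        api_protocol="GRPC",      # overrides the default "REST"
    )
```

Arguments that are not passed keep the defaults shown in the method signature.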

Returns

Predictor. The predictor metadata object.


create_transformer

ModelServing.create_transformer(script_file=None, resources=None)

Create a Transformer metadata object.

Example

import os

import hopsworks

# log in to Hopsworks
project = hopsworks.login()

# get Dataset API instance
dataset_api = project.get_dataset_api()

# get Hopsworks Model Serving handle
ms = project.get_model_serving()

# contents of the my_transformer.py Python script (save this class in its own file)
class Transformer(object):
    def __init__(self):
        ''' Initialization code goes here '''
        pass

    def preprocess(self, inputs):
        ''' Transform the request inputs here. The object returned by this method will be used as model input to make predictions. '''
        return inputs

    def postprocess(self, outputs):
        ''' Transform the predictions computed by the model before returning a response '''
        return outputs

# upload the script to the project and build its full path
uploaded_file_path = dataset_api.upload("my_transformer.py", "Resources", overwrite=True)
transformer_script_path = os.path.join("/Projects", project.name, uploaded_file_path)

my_transformer = ms.create_transformer(script_file=transformer_script_path)

# or

from hsml.transformer import Transformer

my_transformer = Transformer(script_file=transformer_script_path)

Create a deployment with the transformer

# my_model is the model retrieved from the Model Registry,
# e.g., my_model = mr.get_model("my_model", version=1)
my_predictor = ms.create_predictor(my_model, transformer=my_transformer)
my_deployment = my_predictor.deploy()

# or
my_deployment = ms.create_deployment(my_predictor)
my_deployment.save()

Lazy

This method is lazy and does not persist any metadata or deploy any transformer. To create a deployment using this transformer, set it in the predictor.transformer property.
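Setting the transformer through the predictor property, as the note above describes, might look like the following. This is a minimal sketch with a hypothetical helper name, `deploy_with_transformer`; it assumes `predictor` was created with `create_predictor` and `transformer` with `create_transformer`.

```python
def deploy_with_transformer(predictor, transformer):
    """Attach `transformer` to an existing predictor and deploy it.

    Hypothetical helper: both arguments are assumed to be the lazy
    metadata objects returned by the Model Serving handle.
    """
    # Attach the transformer through the predictor's `transformer` property.
    predictor.transformer = transformer

    # deploy() on the predictor creates the deployment.
    return predictor.deploy()
```

This is an alternative to passing `transformer=` when the predictor is first created.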

Arguments

  • script_file str | None: Path to a custom transformer script implementing the Transformer class.
  • resources hsml.resources.PredictorResources | dict | None: Resources to be allocated for the transformer.

Returns

Transformer. The transformer metadata object.


get_deployment

ModelServing.get_deployment(name=None)

Get a deployment by name from Model Serving.

Example

# login and get Hopsworks Model Serving handle using .login() and .get_model_serving()

# get a deployment by name
my_deployment = ms.get_deployment('deployment_name')

Getting a deployment from Model Serving means getting its metadata handle so you can subsequently operate on it (e.g., start or stop).

Arguments

  • name str: Name of the deployment to get.

Returns

Deployment: The deployment metadata object.

Raises

  • RestAPIError: If unable to retrieve deployment from model serving.
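Once a deployment handle has been retrieved, it can be operated on. The sketch below restarts a deployment by name. It assumes the Deployment object exposes `stop()` and `start()` methods: the text above mentions starting and stopping, but these method names are an assumption here, and the helper name `restart_deployment` is hypothetical.

```python
def restart_deployment(ms, name):
    """Stop and then start the deployment called `name`.

    Hypothetical helper: `ms` is assumed to be a ModelServing handle, and
    the returned Deployment is assumed to provide stop() and start().
    """
    # Retrieve the metadata handle for the deployment.
    deployment = ms.get_deployment(name)

    # Operate on the deployment through its handle.
    deployment.stop()
    deployment.start()
    return deployment
```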

get_deployment_by_id

ModelServing.get_deployment_by_id(id)

Get a deployment by id from Model Serving. Getting a deployment from Model Serving means getting its metadata handle so you can subsequently operate on it (e.g., start or stop).

Example

# login and get Hopsworks Model Serving handle using .login() and .get_model_serving()

# get a deployment by id
my_deployment = ms.get_deployment_by_id(1)

Arguments

  • id int: Id of the deployment to get.

Returns

Deployment: The deployment metadata object.

Raises

  • RestAPIError: If unable to retrieve deployment from model serving.

get_deployments

ModelServing.get_deployments(model=None, status=None)

Get all deployments from model serving.

Example

import hopsworks

# log in to Hopsworks
project = hopsworks.login()

# get Hopsworks Model Registry handle
mr = project.get_model_registry()

# get Hopsworks Model Serving handle
ms = project.get_model_serving()

# retrieve the model whose deployments you want to list
my_model = mr.get_model("my_model", version=1)

list_deployments = ms.get_deployments(my_model)

for deployment in list_deployments:
    print(deployment.get_state())

Arguments

  • model hsml.model.Model: Filter by model served in the deployments
  • status str: Filter by status of the deployments

Returns

List[Deployment]: A list of deployments.

Raises

  • RestAPIError: If unable to retrieve deployments from model serving.
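The two filters can be combined or omitted. The following is a minimal sketch that collects the state of every matching deployment; the helper name `deployment_states` is hypothetical, and `ms` is assumed to be a Model Serving handle.

```python
def deployment_states(ms, model=None, status=None):
    """Return the state of every deployment matching the optional filters.

    Hypothetical helper: leaving `model` and `status` as None applies no
    filter, mirroring the defaults of ModelServing.get_deployments.
    """
    deployments = ms.get_deployments(model=model, status=status)
    return [deployment.get_state() for deployment in deployments]
```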

get_inference_endpoints

ModelServing.get_inference_endpoints()

Get all inference endpoints available in the current project.

Returns

List[InferenceEndpoint]: Inference endpoints for model inference