Model Serving#
Retrieval#
get_model_serving#
Connection.get_model_serving()
Get a reference to model serving to perform operations on. Model serving operates on top of a model registry, defaulting to the project's default model registry.
Example
import hopsworks
project = hopsworks.login()
ms = project.get_model_serving()
Returns
ModelServing
. A model serving handle object to perform operations on.
Properties#
project_id#
Id of the project in which Model Serving is located.
project_name#
Name of the project in which Model Serving is located.
project_path#
Path of the project the registry is connected to.
Methods#
create_deployment#
ModelServing.create_deployment(predictor, name=None, environment=None)
Create a Deployment metadata object.
Example
# login into Hopsworks using hopsworks.login()
# get Hopsworks Model Registry handle
mr = project.get_model_registry()
# retrieve the trained model you want to deploy
my_model = mr.get_model("my_model", version=1)
# get Hopsworks Model Serving handle
ms = project.get_model_serving()
my_predictor = ms.create_predictor(my_model)
my_deployment = ms.create_deployment(my_predictor)
my_deployment.save()
Using the model object
# login into Hopsworks using hopsworks.login()
# get Hopsworks Model Registry handle
mr = project.get_model_registry()
# retrieve the trained model you want to deploy
my_model = mr.get_model("my_model", version=1)
my_deployment = my_model.deploy()
my_deployment.get_state().describe()
Using the Model Serving handle
# login into Hopsworks using hopsworks.login()
# get Hopsworks Model Registry handle
mr = project.get_model_registry()
# retrieve the trained model you want to deploy
my_model = mr.get_model("my_model", version=1)
# get Hopsworks Model Serving handle
ms = project.get_model_serving()
my_predictor = ms.create_predictor(my_model)
my_deployment = my_predictor.deploy()
my_deployment.get_state().describe()
Lazy
This method is lazy and does not persist any metadata or deploy any model. To create a deployment, call the save()
method.
Arguments
- predictor
hsml.predictor.Predictor
: predictor to be used in the deployment - name
str | None
: name of the deployment - environment
str | None
: The inference environment to use
Returns
Deployment
. The model metadata object.
create_predictor#
ModelServing.create_predictor(
model,
name=None,
artifact_version="CREATE",
serving_tool=None,
script_file=None,
config_file=None,
resources=None,
inference_logger=None,
inference_batcher=None,
transformer=None,
api_protocol="REST",
)
Create a Predictor metadata object.
Example
# login into Hopsworks using hopsworks.login()
# get Hopsworks Model Registry handle
mr = project.get_model_registry()
# retrieve the trained model you want to deploy
my_model = mr.get_model("my_model", version=1)
# get Hopsworks Model Serving handle
ms = project.get_model_serving()
my_predictor = ms.create_predictor(my_model)
my_deployment = my_predictor.deploy()
Lazy
This method is lazy and does not persist any metadata or deploy any model on its own. To create a deployment using this predictor, call the deploy()
method.
Arguments
- model
hsml.model.Model
: Model to be deployed. - name
str | None
: Name of the predictor. - artifact_version
str | None
: Version number of the model artifact to deploy,CREATE
to create a new model artifact orMODEL-ONLY
to reuse the shared artifact containing only the model files. - serving_tool
str | None
: Serving tool used to deploy the model server. - script_file
str | None
: Path to a custom predictor script implementing the Predict class. - config_file
str | None
: Server configuration file to be passed to the model deployment. - resources
hsml.resources.PredictorResources | dict | None
: Resources to be allocated for the predictor. - inference_logger
hsml.inference_logger.InferenceLogger | dict | str | None
: Inference logger configuration. - inference_batcher
hsml.inference_batcher.InferenceBatcher | dict | None
: Inference batcher configuration. - transformer
hsml.transformer.Transformer | dict | None
: Transformer to be deployed together with the predictor. - api_protocol
str | None
: API protocol to be enabled in the deployment (i.e., 'REST' or 'GRPC'). Defaults to 'REST'.
Returns
Predictor
. The predictor metadata object.
create_transformer#
ModelServing.create_transformer(script_file=None, resources=None)
Create a Transformer metadata object.
Example
# login into Hopsworks using hopsworks.login()
# get Dataset API instance
dataset_api = project.get_dataset_api()
# get Hopsworks Model Serving handle
ms = project.get_model_serving()
# create my_transformer.py Python script
class Transformer(object):
def __init__(self):
''' Initialization code goes here '''
pass
def preprocess(self, inputs):
''' Transform the requests inputs here. The object returned by this method will be used as model input to make predictions. '''
return inputs
def postprocess(self, outputs):
''' Transform the predictions computed by the model before returning a response '''
return outputs
uploaded_file_path = dataset_api.upload("my_transformer.py", "Resources", overwrite=True)
transformer_script_path = os.path.join("/Projects", project.name, uploaded_file_path)
my_transformer = ms.create_transformer(script_file=uploaded_file_path)
# or
from hsml.transformer import Transformer
my_transformer = Transformer(script_file)
Create a deployment with the transformer
my_predictor = ms.create_predictor(transformer=my_transformer)
my_deployment = my_predictor.deploy()
# or
my_deployment = ms.create_deployment(my_predictor, transformer=my_transformer)
my_deployment.save()
Lazy
This method is lazy and does not persist any metadata or deploy any transformer. To create a deployment using this transformer, set it in the predictor.transformer
property.
Arguments
- script_file
str | None
: Path to a custom predictor script implementing the Transformer class. - resources
hsml.resources.PredictorResources | dict | None
: Resources to be allocated for the transformer.
Returns
Transformer
. The model metadata object.
get_deployment#
ModelServing.get_deployment(name=None)
Get a deployment by name from Model Serving.
Example
# login and get Hopsworks Model Serving handle using .login() and .get_model_serving()
# get a deployment by name
my_deployment = ms.get_deployment('deployment_name')
Getting a deployment from Model Serving means getting its metadata handle so you can subsequently operate on it (e.g., start or stop).
Arguments
- name
str
: Name of the deployment to get.
Returns
Deployment
: The deployment metadata object.
Raises
RestAPIError
: If unable to retrieve deployment from model serving.
get_deployment_by_id#
ModelServing.get_deployment_by_id(id)
Get a deployment by id from Model Serving. Getting a deployment from Model Serving means getting its metadata handle so you can subsequently operate on it (e.g., start or stop).
Example
# login and get Hopsworks Model Serving handle using .login() and .get_model_serving()
# get a deployment by id
my_deployment = ms.get_deployment_by_id(1)
Arguments
- id
int
: Id of the deployment to get.
Returns
Deployment
: The deployment metadata object.
Raises
RestAPIError
: If unable to retrieve deployment from model serving.
get_deployments#
ModelServing.get_deployments(model=None, status=None)
Get all deployments from model serving.
Example
# login into Hopsworks using hopsworks.login()
# get Hopsworks Model Registry handle
mr = project.get_model_registry()
# get Hopsworks Model Serving handle
ms = project.get_model_serving()
# retrieve the trained model you want to deploy
my_model = mr.get_model("my_model", version=1)
list_deployments = ms.get_deployment(my_model)
for deployment in list_deployments:
print(deployment.get_state())
Arguments
- model
hsml.model.Model
: Filter by model served in the deployments - status
str
: Filter by status of the deployments
Returns
List[Deployment]
: A list of deployments.
Raises
RestAPIError
: If unable to retrieve deployments from model serving.
get_inference_endpoints#
ModelServing.get_inference_endpoints()
Get all inference endpoints available in the current project.
Returns
List[InferenceEndpoint]
: Inference endpoints for model inference