hsml.model_serving #
ModelServing #
project_id property #
Id of the project in which Model Serving is located.
project_name property #
Name of the project in which Model Serving is located.
project_path property #
Path of the project in which Model Serving is located.
create_deployment #
create_deployment(
predictor: Predictor,
name: str | None = None,
environment: str | None = None,
) -> Deployment
Create a Deployment metadata object.
Example
# log in to Hopsworks using hopsworks.login()
# get Hopsworks Model Registry handle
mr = project.get_model_registry()
# retrieve the trained model you want to deploy
my_model = mr.get_model("my_model", version=1)
# get Hopsworks Model Serving handle
ms = project.get_model_serving()
my_predictor = ms.create_predictor(my_model)
my_deployment = ms.create_deployment(my_predictor)
my_deployment.save()
Using the model object
# log in to Hopsworks using hopsworks.login()
# get Hopsworks Model Registry handle
mr = project.get_model_registry()
# retrieve the trained model you want to deploy
my_model = mr.get_model("my_model", version=1)
my_deployment = my_model.deploy()
my_deployment.get_state().describe()
Using the Model Serving handle
# log in to Hopsworks using hopsworks.login()
# get Hopsworks Model Registry handle
mr = project.get_model_registry()
# retrieve the trained model you want to deploy
my_model = mr.get_model("my_model", version=1)
# get Hopsworks Model Serving handle
ms = project.get_model_serving()
my_predictor = ms.create_predictor(my_model)
my_deployment = my_predictor.deploy()
my_deployment.get_state().describe()
Lazy
This method is lazy and does not persist any metadata or deploy any model. To create a deployment, call the save() method.
| PARAMETER | DESCRIPTION |
|---|---|
| predictor | Predictor to be used in the deployment. TYPE: Predictor |
| name | Name of the deployment. TYPE: str \| None |
| environment | (Deprecated) The project Python environment to use. This argument will be ignored, use the argument TYPE: str \| None |

| RETURNS | DESCRIPTION |
|---|---|
| Deployment | The deployment metadata object. |
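As a minimal sketch of the lazy behaviour described above, the helper below builds a deployment under an explicit name and persists it only when save() is called. The helper name deploy_with_name and its arguments are hypothetical; the calls it makes are the ones documented in this section, and project is assumed to be the handle returned by hopsworks.login().

```python
def deploy_with_name(project, model_name, deployment_name, version=1):
    """Create and persist a deployment under an explicit name.

    `project` is assumed to be the handle returned by hopsworks.login().
    """
    mr = project.get_model_registry()
    ms = project.get_model_serving()

    # retrieve the trained model to deploy
    model = mr.get_model(model_name, version=version)

    # both calls below are lazy: nothing is persisted yet
    predictor = ms.create_predictor(model)
    deployment = ms.create_deployment(predictor, name=deployment_name)

    # save() is the call that actually creates the deployment
    deployment.save()
    return deployment
```

Until save() runs, the returned object is only a local metadata description, so it can be inspected or discarded without side effects.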
create_endpoint #
create_endpoint(
name: str,
script_file: str,
description: str | None = None,
resources: PredictorResources | dict | None = None,
inference_logger: InferenceLogger
| dict
| str
| None = None,
inference_batcher: InferenceBatcher
| dict
| None = None,
api_protocol: str | None = IE.API_PROTOCOL_REST,
environment: str | None = None,
scaling_configuration: PredictorScalingConfig
| dict
| None = None,
env_vars: dict | None = None,
) -> Predictor
Create an Endpoint metadata object.
Example
# log in to Hopsworks using hopsworks.login()
# get Hopsworks Model Serving handle
ms = project.get_model_serving()
my_endpoint = ms.create_endpoint(name="feature_server", script_file="feature_server.py")
my_deployment = my_endpoint.deploy()
Lazy
This method is lazy and does not persist any metadata or deploy any endpoint on its own. To create a deployment using this endpoint, call the deploy() method.
| PARAMETER | DESCRIPTION |
|---|---|
| name | Name of the endpoint. TYPE: str |
| script_file | Path to a custom script file implementing an HTTP server. TYPE: str |
| description | Description of the endpoint. TYPE: str \| None |
| resources | Resources to be allocated for the predictor. TYPE: PredictorResources \| dict \| None |
| inference_logger | Inference logger configuration. TYPE: InferenceLogger \| dict \| str \| None |
| inference_batcher | Inference batcher configuration. TYPE: InferenceBatcher \| dict \| None |
| api_protocol | API protocol to be enabled in the deployment (i.e., 'REST' or 'GRPC'). TYPE: str \| None |
| environment | The project Python environment to use. TYPE: str \| None |
| scaling_configuration | Scaling configuration for the predictor. TYPE: PredictorScalingConfig \| dict \| None |
| env_vars | Environment variables to set on the predictor. TYPE: dict \| None |

| RETURNS | DESCRIPTION |
|---|---|
| Predictor | The predictor metadata object. |
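The parameters above can be combined as in the following sketch, which creates an endpoint from a custom server script, passes environment variables to it, and deploys it. The helper name deploy_endpoint and the description text are hypothetical; ms is assumed to be the handle returned by project.get_model_serving().

```python
def deploy_endpoint(ms, name, script_file, env_vars=None):
    """Create an endpoint from a custom server script and deploy it.

    `script_file` is the path to the script implementing the HTTP server.
    """
    endpoint = ms.create_endpoint(
        name=name,
        script_file=script_file,
        description="Custom HTTP endpoint",  # illustrative description
        env_vars=env_vars,
    )
    # create_endpoint() is lazy; deploy() creates the deployment
    return endpoint.deploy()
```

Arguments that are not passed, such as resources or scaling_configuration, keep the defaults shown in the signature.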
create_predictor #
create_predictor(
model: Model,
name: str | None = None,
artifact_version: str | None = None,
serving_tool: str | None = None,
script_file: str | None = None,
config_file: str | None = None,
resources: PredictorResources | dict | None = None,
inference_logger: InferenceLogger
| dict
| str
| None = None,
inference_batcher: InferenceBatcher
| dict
| None = None,
transformer: Transformer | dict | None = None,
api_protocol: str | None = IE.API_PROTOCOL_REST,
environment: str | None = None,
scaling_configuration: PredictorScalingConfig
| dict
| None = None,
env_vars: dict | None = None,
vllm_variant: str | None = None,
vllm_image_tag: str | None = None,
) -> Predictor
Create a Predictor metadata object.
Example
# log in to Hopsworks using hopsworks.login()
# get Hopsworks Model Registry handle
mr = project.get_model_registry()
# retrieve the trained model you want to deploy
my_model = mr.get_model("my_model", version=1)
# get Hopsworks Model Serving handle
ms = project.get_model_serving()
my_predictor = ms.create_predictor(my_model)
my_deployment = my_predictor.deploy()
Lazy
This method is lazy and does not persist any metadata or deploy any model on its own. To create a deployment using this predictor, call the deploy() method.
| PARAMETER | DESCRIPTION |
|---|---|
| model | Model to be deployed. TYPE: Model |
| name | Name of the predictor. TYPE: str \| None |
| artifact_version | (Deprecated) Version number of the model artifact to deploy, TYPE: str \| None |
| serving_tool | Serving tool used to deploy the model server. TYPE: str \| None |
| script_file | Path to a custom predictor script implementing the Predict class. TYPE: str \| None |
| config_file | Model server configuration file to be passed to the model deployment. It can be accessed via TYPE: str \| None |
| resources | Resources to be allocated for the predictor. TYPE: PredictorResources \| dict \| None |
| inference_logger | Inference logger configuration. TYPE: InferenceLogger \| dict \| str \| None |
| inference_batcher | Inference batcher configuration. TYPE: InferenceBatcher \| dict \| None |
| transformer | Transformer to be deployed together with the predictor. TYPE: Transformer \| dict \| None |
| api_protocol | API protocol to be enabled in the deployment (i.e., 'REST' or 'GRPC'). TYPE: str \| None |
| environment | The project Python environment to use. TYPE: str \| None |
| scaling_configuration | Scaling configuration for the predictor. TYPE: PredictorScalingConfig \| dict \| None |
| env_vars | Environment variables to set on the predictor. TYPE: dict \| None |
| vllm_variant | vLLM image variant for vLLM deployments. One of TYPE: str \| None |
| vllm_image_tag | vLLM image tag override. TYPE: str \| None |

| RETURNS | DESCRIPTION |
|---|---|
| Predictor | The predictor metadata object. |
create_transformer #
create_transformer(
script_file: str | None = None,
resources: PredictorResources | dict | None = None,
scaling_configuration: TransformerScalingConfig
| dict
| None = None,
env_vars: dict | None = None,
) -> Transformer
Create a Transformer metadata object.
Example
# log in to Hopsworks using hopsworks.login()
import os
# get Dataset API instance
dataset_api = project.get_dataset_api()
# get Hopsworks Model Serving handle
ms = project.get_model_serving()
# contents of the my_transformer.py Python script
class Transformer(object):
    def __init__(self):
        ''' Initialization code goes here '''
        pass
    def preprocess(self, inputs):
        ''' Transform the request inputs here. The object returned by this method will be used as model input to make predictions. '''
        return inputs
    def postprocess(self, outputs):
        ''' Transform the predictions computed by the model before returning a response '''
        return outputs
# upload the script and build its path inside the project
uploaded_file_path = dataset_api.upload("my_transformer.py", "Resources", overwrite=True)
transformer_script_path = os.path.join("/Projects", project.name, uploaded_file_path)
my_transformer = ms.create_transformer(script_file=transformer_script_path)
# or
from hsml.transformer import Transformer
my_transformer = Transformer(script_file=transformer_script_path)
Create a deployment with the transformer
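# my_model is a model retrieved from the Model Registry, e.g., mr.get_model("my_model", version=1)
my_predictor = ms.create_predictor(my_model, transformer=my_transformer)
my_deployment = my_predictor.deploy()
# or, since the predictor already carries the transformer
my_deployment = ms.create_deployment(my_predictor)
my_deployment.save()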
Lazy
This method is lazy and does not persist any metadata or deploy any transformer. To create a deployment using this transformer, set it in the predictor.transformer property.
| PARAMETER | DESCRIPTION |
|---|---|
| script_file | Path to a custom transformer script implementing the Transformer class. TYPE: str \| None |
| resources | Resources to be allocated for the transformer. TYPE: PredictorResources \| dict \| None |
| scaling_configuration | Scaling configuration for the transformer. TYPE: TransformerScalingConfig \| dict \| None |
| env_vars | Environment variables to set on the transformer. TYPE: dict \| None |

| RETURNS | DESCRIPTION |
|---|---|
| Transformer | The transformer metadata object. |
deploy_agent #
deploy_agent(
entry: str,
name: str | None = None,
requirements: str | None = None,
environment: str | None = None,
upload_dir: str = "Resources/agents",
description: str | None = None,
resources: PredictorResources | dict | None = None,
inference_logger: InferenceLogger
| dict
| str
| None = None,
inference_batcher: InferenceBatcher
| dict
| None = None,
api_protocol: str | None = IE.API_PROTOCOL_REST,
scaling_configuration: PredictorScalingConfig
| dict
| None = None,
) -> Deployment
Deploy a Python script or package as an agent.
The agent is created on first call and updated on subsequent calls. Each call uploads the latest local code, refreshes the Python environment, and rewrites the deployment's predictor metadata to reflect the arguments passed in — including any unspecified arguments, which fall back to their defaults. The deployment's running state is left untouched; call start() after the first deploy and restart() to roll a running agent onto the new code. Works the same whether invoked from outside or inside a Hopsworks cluster.
Pass either a .py script or a directory containing a pyproject.toml. For a script, the file is uploaded and run directly. For a package, a wheel is built locally with the project's PEP 517 backend, uploaded, and installed; a small runner module invokes the package via runpy.run_module.
ms = project.get_model_serving()
agent = ms.deploy_agent(entry="my_agent.py")
agent.start() # or agent.restart()
# iterate: edit code locally, push, then roll the running agent onto it
agent = ms.deploy_agent(entry="my_agent.py")
agent.restart()
| PARAMETER | DESCRIPTION |
|---|---|
| entry | Local path to a TYPE: str |
| name | Name of the deployment, also used as the default Python environment name. Defaults to the basename of TYPE: str \| None |
| requirements | Local path to a TYPE: str \| None |
| environment | Name of the Python environment to use; defaults to TYPE: str \| None |
| upload_dir | Directory in the Hopsworks Filesystem under which agent files are placed; the agent gets its own subdirectory TYPE: str |
| description | Description of the deployment. TYPE: str \| None |
| resources | Resources to be allocated for the predictor. TYPE: PredictorResources \| dict \| None |
| inference_logger | Inference logger configuration. TYPE: InferenceLogger \| dict \| str \| None |
| inference_batcher | Inference batcher configuration. TYPE: InferenceBatcher \| dict \| None |
| api_protocol | API protocol to be enabled in the deployment (i.e., 'REST' or 'GRPC'). TYPE: str \| None |
| scaling_configuration | Scaling configuration for the predictor. TYPE: PredictorScalingConfig \| dict \| None |

| RETURNS | DESCRIPTION |
|---|---|
| Deployment | The deployment metadata object. |

| RAISES | DESCRIPTION |
|---|---|
| ValueError | If |
| hopsworks.client.exceptions.RestAPIError | If the backend encounters an error when handling the request. |
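Because every call rewrites the predictor metadata from the arguments passed in, a setting omitted on a later call reverts to its default. One way to guard against that, sketched below, is to keep the arguments in a single dictionary and pass it on every call. The helper name push_agent, the running flag, and the setting values are hypothetical; only deploy_agent(), start(), and restart() come from this section.

```python
# Arguments kept in one place so that every call passes the same values.
# The values are illustrative, not defaults of the library.
AGENT_SETTINGS = {
    "entry": "my_agent.py",
    "name": "myagent",
    "description": "Example agent",
}


def push_agent(ms, settings, running):
    """Upload the latest local code and roll the agent onto it.

    `ms` is assumed to be the handle from project.get_model_serving().
    `running` tells the helper whether the agent has already been started.
    """
    agent = ms.deploy_agent(**settings)
    if running:
        agent.restart()  # roll a running agent onto the new code
    else:
        agent.start()  # first deploy: the agent has not been started yet
    return agent
```

Calling push_agent(ms, AGENT_SETTINGS, running=False) after the first deploy and push_agent(ms, AGENT_SETTINGS, running=True) on later iterations keeps the deployment's configuration stable across pushes.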
get_deployment #
get_deployment(name: str = None) -> Deployment | None
Get a deployment by name from Model Serving.
Example
# login and get Hopsworks Model Serving handle using .login() and .get_model_serving()
# get a deployment by name
my_deployment = ms.get_deployment('deployment_name')
Getting a deployment from Model Serving means getting its metadata handle so you can subsequently operate on it (e.g., start or stop).
| PARAMETER | DESCRIPTION |
|---|---|
| name | Name of the deployment to get. TYPE: str |

| RETURNS | DESCRIPTION |
|---|---|
| Deployment \| None | The deployment metadata object or |

| RAISES | DESCRIPTION |
|---|---|
| hopsworks.client.exceptions.RestAPIError | If unable to retrieve deployment from model serving. |
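Since the return type is Deployment | None, callers need to handle the missing case. The sketch below shows one possible get-or-create pattern. It assumes that None is returned when no deployment has the given name, which the signature suggests but this page does not state explicitly. The helper name get_or_create_deployment is hypothetical.

```python
def get_or_create_deployment(ms, model, name):
    """Return the deployment called `name`, creating it if it is missing.

    Assumes get_deployment() returns None when no deployment has that name.
    `ms` is assumed to be the handle from project.get_model_serving().
    """
    deployment = ms.get_deployment(name)
    if deployment is None:
        predictor = ms.create_predictor(model)
        deployment = ms.create_deployment(predictor, name=name)
        deployment.save()  # persist the newly created deployment
    return deployment
```

A RestAPIError raised by get_deployment() is deliberately not caught here, so backend failures are not mistaken for a missing deployment.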
get_deployment_by_id #
get_deployment_by_id(id: int) -> Deployment | None
Get a deployment by id from Model Serving.
Getting a deployment from Model Serving means getting its metadata handle so you can subsequently operate on it (e.g., start or stop).
Example
# login and get Hopsworks Model Serving handle using .login() and .get_model_serving()
# get a deployment by id
my_deployment = ms.get_deployment_by_id(1)
| PARAMETER | DESCRIPTION |
|---|---|
| id | Id of the deployment to get. TYPE: int |

| RETURNS | DESCRIPTION |
|---|---|
| Deployment \| None | The deployment metadata object or |

| RAISES | DESCRIPTION |
|---|---|
| hopsworks.client.exceptions.RestAPIError | If unable to retrieve deployment from model serving. |
get_deployments #
get_deployments(
model: Model = None, status: str = None
) -> list[Deployment]
Get all deployments from model serving.
Example
# log in to Hopsworks using hopsworks.login()
# get Hopsworks Model Registry handle
mr = project.get_model_registry()
# get Hopsworks Model Serving handle
ms = project.get_model_serving()
# retrieve the model whose deployments you want to list
my_model = mr.get_model("my_model", version=1)
list_deployments = ms.get_deployments(my_model)
for deployment in list_deployments:
    print(deployment.get_state())
| PARAMETER | DESCRIPTION |
|---|---|
| model | Filter by model served in the deployments. TYPE: Model |
| status | Filter by status of the deployments. TYPE: str |

| RETURNS | DESCRIPTION |
|---|---|
| list[Deployment] | A list of deployments. |

| RAISES | DESCRIPTION |
|---|---|
| hopsworks.client.exceptions.RestAPIError | If unable to retrieve deployments from model serving. |
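The two filters can be used separately or together. The following sketch wraps get_deployments() in a small helper that describes the state of each match. The helper name describe_deployments is hypothetical, and ms is assumed to be the handle returned by project.get_model_serving().

```python
def describe_deployments(ms, model=None, status=None):
    """Describe the state of every deployment matching the given filters."""
    deployments = ms.get_deployments(model=model, status=status)
    for deployment in deployments:
        # get_state().describe() is the call used in the examples above
        deployment.get_state().describe()
    return deployments
```

For instance, describe_deployments(ms, status="Running") would restrict the listing to running deployments, assuming "Running" is a status string the backend accepts. This page does not list the valid values.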