
hsml.deployment #

[source] Deployment #

Metadata object representing a deployment in Model Serving.

[source] NOT_FOUND_ERROR_CODE class-attribute instance-attribute #

NOT_FOUND_ERROR_CODE = 240000

[source] id property #

Id of the deployment.

[source] name property writable #

Name of the deployment.

[source] version property #

Version of the deployment.

[source] description property writable #

Description of the deployment.

[source] has_model property #

Whether the deployment has a model associated.

[source] predictor property writable #

Predictor used in the deployment.

[source] requested_instances property #

Total number of requested instances in the deployment.

[source] model_name property writable #

Name of the model deployed by the predictor.

[source] model_path property writable #

Model path deployed by the predictor.

[source] model_version property writable #

Model version deployed by the predictor.

[source] artifact_version property writable #

Artifact version deployed by the predictor.

Deprecated

Artifact versions are deprecated in favor of deployment versions.

[source] artifact_files_path property #

Path of the artifact files deployed by the predictor.

[source] artifact_path property #

Path of the model artifact deployed by the predictor.

Deprecated

Artifact versions are deprecated in favor of deployment versions.

[source] model_server property writable #

Model server run by the predictor.

[source] serving_tool property writable #

Serving tool used to run the model server.

[source] script_file property writable #

Script file used by the predictor.

[source] config_file property writable #

Model server configuration file passed to the model deployment.

It can be accessed via the CONFIG_FILE_PATH environment variable from a predictor or transformer script. For LLM deployments without a predictor script, this file is used to configure the vLLM engine.

[source] resources property writable #

Resource configuration for the predictor.

[source] inference_logger property writable #

Configuration of the inference logger attached to this predictor.

[source] inference_batcher property writable #

Configuration of the inference batcher attached to this predictor.

[source] transformer property writable #

Transformer configured in the predictor.

[source] model_registry_id property writable #

Model Registry Id of the deployment.

[source] created_at property #

Created at date of the predictor.

[source] creator property #

Creator of the predictor.

[source] api_protocol property writable #

API protocol enabled in the deployment (e.g., HTTP or GRPC).

[source] environment property writable #

Name of inference environment.

[source] project_namespace property writable #

Project namespace the deployment runs in.

[source] project_name property writable #

Name of the project the deployment belongs to.

[source] scaling_configuration property writable #

Scaling configuration for the deployment.

[source] save #

save(await_update: int | None = 600)

Persist this deployment including the predictor and metadata to Model Serving.

PARAMETER DESCRIPTION
await_update

If the deployment is running, awaiting time (seconds) for the running instances to be updated. If the running instances are not updated within this timespan, the call to this method returns while the update continues in the background.

TYPE: int | None DEFAULT: 600

RAISES DESCRIPTION
hopsworks.client.exceptions.RestAPIError

In case the backend encounters an issue

[source] start #

start(await_running: int | None = 600)

Start the deployment.

PARAMETER DESCRIPTION
await_running

Awaiting time (seconds) for the deployment to start. If the deployment has not started within this timespan, the call to this method returns while it deploys in the background.

TYPE: int | None DEFAULT: 600

RAISES DESCRIPTION
hopsworks.client.exceptions.RestAPIError

In case the backend encounters an issue

[source] stop #

stop(await_stopped: int | None = 600)

Stop the deployment.

PARAMETER DESCRIPTION
await_stopped

Awaiting time (seconds) for the deployment to stop. If the deployment has not stopped within this timespan, the call to this method returns while it stops in the background.

TYPE: int | None DEFAULT: 600

RAISES DESCRIPTION
hopsworks.client.exceptions.RestAPIError

In case the backend encounters an issue

[source] delete #

delete(force=False)

Delete the deployment.

PARAMETER DESCRIPTION
force

Force the deletion of the deployment. If the deployment is running, it will be stopped and deleted automatically.

DEFAULT: False

Warning

A call to this method does not ask for a second confirmation.

RAISES DESCRIPTION
hopsworks.client.exceptions.RestAPIError

In case the backend encounters an issue

[source] get_state #

get_state() -> PredictorState

Get the current state of the deployment.

RETURNS DESCRIPTION
PredictorState

PredictorState. The state of the deployment.

RAISES DESCRIPTION
hopsworks.client.exceptions.RestAPIError

In case the backend encounters an issue

[source] is_created #

is_created() -> bool

Check whether the deployment is created.

RETURNS DESCRIPTION
bool

bool. Whether the deployment is created or not.

RAISES DESCRIPTION
hopsworks.client.exceptions.RestAPIError

In case the backend encounters an issue

[source] is_running #

is_running(or_idle=True, or_updating=True) -> bool

Check whether the deployment is ready to handle inference requests.

PARAMETER DESCRIPTION
or_idle

Whether the idle state is considered as running (default is True)

DEFAULT: True

or_updating

Whether the updating state is considered as running (default is True)

DEFAULT: True

RETURNS DESCRIPTION
bool

bool. Whether the deployment is ready or not.

RAISES DESCRIPTION
hopsworks.client.exceptions.RestAPIError

In case the backend encounters an issue

[source] is_stopped #

is_stopped(or_created=True) -> bool

Check whether the deployment is stopped.

PARAMETER DESCRIPTION
or_created

Whether the creating and created states are considered as stopped (default is True)

DEFAULT: True

RETURNS DESCRIPTION
bool

bool. Whether the deployment is stopped or not.

RAISES DESCRIPTION
hopsworks.client.exceptions.RestAPIError

In case the backend encounters an issue
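The status checks above combine naturally with a polling loop, for example when `start()` was called with a short `await_running` and the deployment keeps starting in the background. A hedged sketch, assuming only an object exposing `is_running() -> bool` (the helper name and timeouts are illustrative):

```python
import time

def wait_until_running(deployment, timeout_s: float = 600.0,
                       interval_s: float = 5.0) -> bool:
    """Poll deployment.is_running() until ready or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while True:
        if deployment.is_running():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
```

Each `is_running()` call hits the backend, so keep `interval_s` coarse (a few seconds) in practice.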

[source] predict #

predict(
    data: dict | InferInput = None,
    inputs: list | dict = None,
)

Send inference requests to the deployment.

Either the data or the inputs parameter must be set. If both are set, inputs is ignored.

Example
# log in to Hopsworks using hopsworks.login()

# get Hopsworks Model Serving handle
ms = project.get_model_serving()

# retrieve deployment by name
my_deployment = ms.get_deployment("my_deployment")

# (optional) retrieve model input example
my_model = project.get_model_registry() \
                  .get_model(my_deployment.model_name, my_deployment.model_version)

# make predictions using model inputs (single or batch)
predictions = my_deployment.predict(inputs=my_model.input_example)

# or using more sophisticated inference request payloads
data = { "instances": [ my_model.input_example ], "key2": "value2" }
predictions = my_deployment.predict(data)
PARAMETER DESCRIPTION
data

Payload dictionary for the inference request including the model input(s)

TYPE: dict | InferInput DEFAULT: None

inputs

Model inputs used in the inference requests

TYPE: list | dict DEFAULT: None

RETURNS DESCRIPTION

dict. Inference response.

RAISES DESCRIPTION
hopsworks.client.exceptions.RestAPIError

In case the backend encounters an issue
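A small sketch of how the richer `data` payload from the example above can be assembled: the `"instances"` key wraps one or more model inputs into a batch, and the helper name below is an illustrative assumption, not part of the client:

```python
def build_payload(single_input) -> dict:
    """Wrap one model input into a batch-of-one inference payload."""
    return {"instances": [single_input]}

payload = build_payload([1.0, 2.0, 3.0])
# payload == {"instances": [[1.0, 2.0, 3.0]]}
```

The resulting dictionary can be passed as `my_deployment.predict(data=payload)`, with any extra top-level keys (such as `"key2"` in the example) added alongside `"instances"`.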

[source] get_model #

get_model()

Retrieve the metadata object for the model being used by this deployment.

[source] download_artifact_files #

download_artifact_files(local_path=None)

Download the artifact files served by the deployment.

PARAMETER DESCRIPTION
local_path

Path in the local filesystem where the artifact files are downloaded

DEFAULT: None

RAISES DESCRIPTION
hopsworks.client.exceptions.RestAPIError

In case the backend encounters an issue
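A hedged usage sketch for preparing a `local_path` target: a temporary directory stands in for a real destination, and `my_deployment` is assumed to be a Deployment obtained via `get_deployment()`, so the actual download call is shown commented out:

```python
import tempfile
from pathlib import Path

# Prepare a local directory to receive the artifact files.
local_dir = Path(tempfile.mkdtemp()) / "my_deployment_artifacts"
local_dir.mkdir(parents=True, exist_ok=True)

# Real call against a live deployment (requires a Hopsworks connection):
# my_deployment.download_artifact_files(local_path=str(local_dir))
```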

[source] get_logs #

get_logs(component='predictor', tail=10)

Prints the deployment logs of the predictor or transformer.

PARAMETER DESCRIPTION
component

Deployment component to get the logs from (e.g., predictor or transformer)

DEFAULT: 'predictor'

tail

Number of most recent lines to retrieve from the logs.

DEFAULT: 10

RAISES DESCRIPTION
hopsworks.client.exceptions.RestAPIError

In case the backend encounters an issue

[source] get_url #

get_url()

Get the URL of the deployment in Hopsworks.

[source] describe #

describe()

Print a JSON description of the deployment.