hsml.predictor #

Predictor #

Bases: DeployableComponent

Metadata object representing a predictor in Model Serving.

id property #

Id of the predictor.

name property writable #

Name of the predictor.

version property #

Version of the predictor.

description property writable #

Description of the predictor.

model_name property writable #

Name of the model deployed by the predictor.

model_path property writable #

Model path deployed by the predictor.

model_version property writable #

Model version deployed by the predictor.

model_framework property writable #

Model framework of the model to be deployed by the predictor.

artifact_version property writable #

Artifact version deployed by the predictor.

Deprecated

Artifact versions are deprecated in favor of deployment versions.

artifact_files_path property #

Path of the artifact files deployed by the predictor.

artifact_path property #

Path of the model artifact deployed by the predictor. Resolves to /Projects/{project_name}/Models/{name}/{version}/Artifacts/{artifact_version}/{name}{version}.zip.

model_server property #

Model server used by the predictor.

serving_tool property writable #

Serving tool used to run the model server.

script_file property writable #

Script file used to load and run the model.

config_file property writable #

Model server configuration file passed to the model deployment.

It can be accessed via CONFIG_FILE_PATH environment variable from a predictor or transformer script. For LLM deployments without a predictor script, this file is used to configure the vLLM engine.
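As a minimal sketch, a predictor or transformer script could load this file as follows. The JSON format and the `default` fallback are assumptions for illustration; the actual file format depends on what you uploaded:

```python
import json
import os

def load_model_config(default=None):
    """Load the model server configuration file, if one was provided.

    Inside a predictor or transformer script, Hopsworks exposes the
    file location via the CONFIG_FILE_PATH environment variable.
    """
    path = os.environ.get("CONFIG_FILE_PATH")
    if path is None or not os.path.exists(path):
        return default
    with open(path) as f:
        # Assumes a JSON config file; adjust parsing for YAML etc.
        return json.load(f)
```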

inference_logger property writable #

Configuration of the inference logger attached to this predictor.

transformer property writable #

Transformer configuration attached to the predictor.

created_at property #

Created at date of the predictor.

creator property #

Creator of the predictor.

requested_instances property #

Total number of requested instances in the predictor.

api_protocol property writable #

API protocol enabled in the predictor (e.g., HTTP or GRPC).

environment property writable #

Name of the inference environment.

project_namespace property writable #

Kubernetes project namespace.

project_name property writable #

Name of the project the deployment belongs to.

deploy #

deploy() -> deployment.Deployment

Create a deployment for this predictor and persist it in Model Serving.

RETURNS DESCRIPTION
deployment.Deployment

The deployment metadata object of a new or existing deployment.

Examples:

import hopsworks

project = hopsworks.login()

# get Hopsworks Model Registry handle
mr = project.get_model_registry()

# retrieve the trained model you want to deploy
my_model = mr.get_model("my_model", version=1)

# get Hopsworks Model Serving handle
ms = project.get_model_serving()

my_predictor = ms.create_predictor(my_model)
my_deployment = my_predictor.deploy()

print(my_deployment.get_state())

describe #

describe()

Print a JSON description of the predictor.

get_endpoint_url #

get_endpoint_url() -> str | None

Get the base endpoint URL for this predictor.

Returns the base URL that can be used with external HTTP clients. This is the path-based routing base endpoint without any protocol-specific suffixes like :predict or /v1.

If the Istio client is not available, this returns None (the Hopsworks REST API does not support base-only endpoints).

RETURNS DESCRIPTION
str | None

Base endpoint URL, or None if unavailable.

Examples:

url = predictor.get_endpoint_url()
# url = "https://host:port/v1/project/name"

get_openai_url #

get_openai_url() -> str | None

Get the OpenAI-compatible API URL for vLLM deployments.

Returns the URL for OpenAI-compatible API endpoints (e.g., /v1/chat/completions). This method only returns a URL for LLM (vLLM) deployments.

RETURNS DESCRIPTION
str | None

OpenAI-compatible URL (base URL + "/v1"), or None if not an LLM deployment.

Examples:

url = predictor.get_openai_url()
# url = "https://host:port/v1/project/name/v1"
# Then use: url + "/chat/completions"
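A small helper can build the full chat-completions endpoint from the returned URL (which already ends in "/v1"). This is an illustrative sketch, not part of the library API; the `openai`-client usage in the comment is likewise an assumption:

```python
def chat_completions_url(openai_url: str) -> str:
    """Append the chat-completions path to the URL from get_openai_url()."""
    return openai_url.rstrip("/") + "/chat/completions"

# With the official `openai` client, the same base URL can typically be
# passed directly (hypothetical usage):
#   from openai import OpenAI
#   client = OpenAI(base_url=predictor.get_openai_url(), api_key=api_key)
#   client.chat.completions.create(model=..., messages=[...])
```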

get_inference_url #

get_inference_url() -> str | None

Get the KServe inference URL for standard model deployments.

Returns the full URL with :predict suffix for KServe inference protocol. This method only returns a URL for standard model deployments (non-vLLM, with a model attached).

If the Istio client is not available, this falls back to the Hopsworks REST API path.

RETURNS DESCRIPTION
str | None

Inference URL with :predict suffix, or None if not a standard model deployment.

Examples:

url = predictor.get_inference_url()
# url = "https://host:port/v1/project/name/v1/models/name:predict"
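As a sketch of how the returned URL might be used from an external client, the helper below assembles a KServe v1 `:predict` request. The `{"instances": [...]}` payload shape follows the KServe v1 protocol; the "ApiKey" authorization scheme and the urllib usage in the comment are assumptions, not part of this class:

```python
import json

def build_predict_request(inference_url: str, instances: list, api_key: str):
    """Assemble URL, body, and headers for a KServe v1 inference call
    against the URL returned by predictor.get_inference_url()."""
    body = json.dumps({"instances": instances}).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        # Assumed Hopsworks auth scheme; adapt to your deployment.
        "Authorization": f"ApiKey {api_key}",
    }
    return inference_url, body, headers

# Hypothetical usage with the standard library:
#   import urllib.request
#   url, body, headers = build_predict_request(
#       predictor.get_inference_url(), [[1.0, 2.0]], api_key)
#   req = urllib.request.Request(url, data=body, headers=headers)
#   with urllib.request.urlopen(req) as resp:
#       predictions = json.loads(resp.read())
```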