hsml.predictor #
Predictor #
Bases: DeployableComponent
Metadata object representing a predictor in Model Serving.
id property #
ID of the predictor.
name property writable #
Name of the predictor.
version property #
Version of the predictor.
description property writable #
Description of the predictor.
model_name property writable #
Name of the model deployed by the predictor.
model_path property writable #
Path of the model deployed by the predictor.
model_version property writable #
Model version deployed by the predictor.
model_framework property writable #
Model framework of the model to be deployed by the predictor.
artifact_version property writable #
Artifact version deployed by the predictor.
Deprecated
Artifact versions are deprecated in favor of deployment versions.
artifact_files_path property #
Path of the artifact files deployed by the predictor.
artifact_path property #
Path of the model artifact deployed by the predictor. Resolves to /Projects/{project_name}/Models/{name}/{version}/Artifacts/{artifact_version}/{name}{version}.zip.
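As an illustration, here is how the `artifact_path` template resolves for sample values (the project and model names below are hypothetical, not taken from a real deployment):

```python
# Hypothetical values shown purely to illustrate the path template.
template = (
    "/Projects/{project_name}/Models/{name}/{version}"
    "/Artifacts/{artifact_version}/{name}{version}.zip"
)
path = template.format(
    project_name="demo_project",  # hypothetical project name
    name="my_model",              # hypothetical model name
    version=1,
    artifact_version=0,
)
# -> "/Projects/demo_project/Models/my_model/1/Artifacts/0/my_model1.zip"
```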
model_server property #
Model server used by the predictor.
serving_tool property writable #
Serving tool used to run the model server.
script_file property writable #
Script file used to load and run the model.
config_file property writable #
Model server configuration file passed to the model deployment.
It can be accessed via the CONFIG_FILE_PATH environment variable from a predictor or transformer script. For LLM deployments without a predictor script, this file is used to configure the vLLM engine.
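A minimal sketch of reading this file from a predictor or transformer script. CONFIG_FILE_PATH is set by Model Serving inside the deployment container; the JSON format and the `load_config` helper are assumptions for illustration, as the actual file format depends on the model server.

```python
import json
import os

def load_config(path=None):
    # CONFIG_FILE_PATH is provided by Model Serving at runtime; an explicit
    # path argument is allowed here so the sketch can run standalone.
    path = path or os.environ.get("CONFIG_FILE_PATH")
    with open(path) as f:
        # JSON parsing is an assumption; use the format your model server expects.
        return json.load(f)
```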
inference_logger property writable #
Configuration of the inference logger attached to this predictor.
transformer property writable #
Transformer configuration attached to the predictor.
created_at property #
Creation date of the predictor.
creator property #
Creator of the predictor.
requested_instances property #
Total number of requested instances in the predictor.
api_protocol property writable #
API protocol enabled in the predictor (e.g., HTTP or GRPC).
environment property writable #
Name of the inference environment.
project_namespace property writable #
Kubernetes project namespace.
project_name property writable #
Name of the project the deployment belongs to.
deploy #
deploy() -> deployment.Deployment
Create a deployment for this predictor and persist it in Model Serving.
| RETURNS | DESCRIPTION |
|---|---|
| `deployment.Deployment` | The deployment metadata object of a new or existing deployment. |
Examples:

```python
import hopsworks

project = hopsworks.login()

# get Hopsworks Model Registry handle
mr = project.get_model_registry()

# retrieve the trained model you want to deploy
my_model = mr.get_model("my_model", version=1)

# get Hopsworks Model Serving handle
ms = project.get_model_serving()

my_predictor = ms.create_predictor(my_model)
my_deployment = my_predictor.deploy()

print(my_deployment.get_state())
```
get_endpoint_url #
get_endpoint_url() -> str | None
Get the base endpoint URL for this predictor.
Returns the base URL that can be used with external HTTP clients. This is the path-based routing base endpoint without any protocol-specific suffixes like :predict or /v1.
If the Istio client is not available, this method returns None (the Hopsworks REST API doesn't support base-only endpoints).
| RETURNS | DESCRIPTION |
|---|---|
| `str \| None` | Base endpoint URL, or `None` if the Istio client is not available. |
Examples:

```python
url = predictor.get_endpoint_url()
# url = "https://host:port/v1/project/name"
```
get_openai_url #
get_openai_url() -> str | None
Get the OpenAI-compatible API URL for vLLM deployments.
Returns the URL for OpenAI-compatible API endpoints (e.g., /v1/chat/completions). This method only returns a URL for LLM (vLLM) deployments.
| RETURNS | DESCRIPTION |
|---|---|
| `str \| None` | OpenAI-compatible URL (base URL + "/v1"), or `None` for non-LLM deployments. |
Examples:

```python
url = predictor.get_openai_url()
# url = "https://host:port/v1/project/name/v1"
# Then use: url + "/chat/completions"
```
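A hedged sketch of building a chat-completions request against the URL returned by `get_openai_url()`. The base URL, model name, and the `ApiKey` authorization scheme shown here are placeholders/assumptions; substitute the values from your deployment.

```python
import json
import urllib.request

def build_chat_request(base_url, model, prompt, api_key):
    # OpenAI-compatible chat payload; model name is whatever your
    # vLLM deployment serves (placeholder here).
    payload = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        base_url + "/chat/completions",
        data=json.dumps(payload).encode(),
        headers={
            "Content-Type": "application/json",
            # Authorization scheme is an assumption; check your Hopsworks setup.
            "Authorization": f"ApiKey {api_key}",
        },
    )

# Hypothetical base URL as returned by predictor.get_openai_url()
req = build_chat_request(
    "https://host:port/v1/project/name/v1", "my-model", "Hello", "MY_API_KEY"
)
```

Sending the request (e.g. with `urllib.request.urlopen(req)`) is omitted since it requires a live deployment.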
get_inference_url #
get_inference_url() -> str | None
Get the KServe inference URL for standard model deployments.
Returns the full URL with :predict suffix for KServe inference protocol. This method only returns a URL for standard model deployments (non-vLLM, with a model attached).
If the Istio client is not available, the method falls back to the Hopsworks REST API path.
| RETURNS | DESCRIPTION |
|---|---|
| `str \| None` | Inference URL with the `:predict` suffix, or `None` for LLM deployments or predictors without a model attached. |
Examples:

```python
url = predictor.get_inference_url()
# url = "https://host:port/v1/project/name/v1/models/name:predict"
```
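A minimal sketch of building a KServe v1 inference request against the URL from `get_inference_url()`. The host, payload values, and the `ApiKey` authorization scheme are placeholders/assumptions, not values from a real deployment.

```python
import json
import urllib.request

def build_predict_request(url, instances, api_key):
    # The KServe v1 protocol wraps inputs in an "instances" list.
    body = json.dumps({"instances": instances}).encode()
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Content-Type": "application/json",
            # Authorization scheme is an assumption; check your Hopsworks setup.
            "Authorization": f"ApiKey {api_key}",
        },
    )

# Hypothetical URL as returned by predictor.get_inference_url()
req = build_predict_request(
    "https://host:port/v1/project/name/v1/models/name:predict",
    [[1, 2, 3]],
    "MY_API_KEY",
)
```

As above, dispatching the request requires a live deployment and is left out of the sketch.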