Model Serving#
Retrieval#
get_model_serving#
Connection.get_model_serving()
Get a reference to model serving to perform operations on. Model serving operates on top of a model registry, defaulting to the project's default model registry.
Returns
ModelServing. A model serving handle object to perform operations on.
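
For example, a handle can be obtained right after connecting (a minimal sketch; connection arguments such as host, project, and API key are omitted and depend on your setup):

```python
import hsml

# Connect to the cluster; in practice pass host, project, api_key_value, etc.
connection = hsml.connection()

# Get the Model Serving handle backed by the project's default model registry
ms = connection.get_model_serving()
```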
Properties#
project_id#
Id of the project in which Model Serving is located.
project_name#
Name of the project in which Model Serving is located.
project_path#
Path of the project in which Model Serving is located.
Methods#
create_deployment#
ModelServing.create_deployment(predictor, name=None)
Create a Deployment metadata object.
Lazy
This method is lazy and does not persist any metadata or deploy any model. To create a deployment, call the save() method.
Arguments
- predictor
hsml.predictor.Predictor: Predictor to be used in the deployment.
- name
Optional[str]: Name of the deployment.
Returns
Deployment. The deployment metadata object.
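
A short sketch of the flow, reusing the ms handle from above (predictor creation is shown under create_predictor below; the deployment name is a placeholder):

```python
# Create the deployment metadata object (lazy; nothing is persisted yet)
deployment = ms.create_deployment(predictor, name="fraud_detector")

# Persist the metadata and create the deployment in Model Serving
deployment.save()
```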
create_predictor#
ModelServing.create_predictor(
model,
name=None,
artifact_version="CREATE",
model_server=None,
serving_tool=None,
script_file=None,
resources=None,
inference_logger=None,
inference_batcher=None,
transformer=None,
)
Create a Predictor metadata object.
Lazy
This method is lazy and does not persist any metadata or deploy any model on its own.
To create a deployment using this predictor, call the deploy() method.
Arguments
- model
hsml.model.Model: Model to be deployed.
- name
Optional[str]: Name of the predictor.
- artifact_version
Optional[str]: Version number of the model artifact to deploy, CREATE to create a new model artifact, or MODEL-ONLY to reuse the shared artifact containing only the model files.
- model_server
Optional[str]: Model server run by the predictor.
- serving_tool
Optional[str]: Serving tool used to deploy the model server.
- script_file
Optional[str]: Path to a custom predictor script implementing the Predict class.
- resources
Optional[Union[hsml.resources.PredictorResources, dict]]: Resources to be allocated for the predictor.
- inference_logger
Optional[Union[hsml.inference_logger.InferenceLogger, dict, str]]: Inference logger configuration.
- inference_batcher
Optional[Union[hsml.inference_batcher.InferenceBatcher, dict]]: Inference batcher configuration.
- transformer
Optional[Union[hsml.transformer.Transformer, dict]]: Transformer to be deployed together with the predictor.
Returns
Predictor. The predictor metadata object.
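
A minimal sketch, assuming a model has already been registered in the project's model registry (the model name and version are placeholders):

```python
# Fetch a registered model from the model registry
mr = connection.get_model_registry()
model = mr.get_model("fraud_model", version=1)

# Create the predictor metadata object (lazy; nothing is deployed yet)
predictor = ms.create_predictor(model, name="fraud_predictor")

# Create and start a deployment from this predictor
deployment = predictor.deploy()
```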
create_transformer#
ModelServing.create_transformer(script_file=None, resources=None)
Create a Transformer metadata object.
Lazy
This method is lazy and does not persist any metadata or deploy any transformer. To create a deployment using this transformer, set it in the predictor.transformer property.
Arguments
- script_file
Optional[str]: Path to a custom transformer script implementing the Transformer class.
- resources
Optional[Union[hsml.resources.PredictorResources, dict]]: Resources to be allocated for the transformer.
Returns
Transformer. The transformer metadata object.
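
A hedged sketch of attaching a transformer to a predictor (the script path is a placeholder):

```python
# Create the transformer metadata object (lazy)
transformer = ms.create_transformer(script_file="transformer.py")

# Set it on the predictor so it is deployed together with the model
predictor = ms.create_predictor(model, transformer=transformer)
deployment = predictor.deploy()
```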
get_deployment#
ModelServing.get_deployment(name)
Get a deployment by name from Model Serving. Getting a deployment from Model Serving means getting its metadata handle so you can subsequently operate on it (e.g., start or stop).
Arguments
- name
str: Name of the deployment to get.
Returns
Deployment: The deployment metadata object.
Raises
RestAPIError: If unable to retrieve deployment from model serving.
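
For example (the deployment name is a placeholder):

```python
deployment = ms.get_deployment("fraud_detector")

# Operate on the retrieved deployment, e.g. start it
deployment.start()
```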
get_deployment_by_id#
ModelServing.get_deployment_by_id(id)
Get a deployment by id from Model Serving. Getting a deployment from Model Serving means getting its metadata handle so you can subsequently operate on it (e.g., start or stop).
Arguments
- id
int: Id of the deployment to get.
Returns
Deployment: The deployment metadata object.
Raises
RestAPIError: If unable to retrieve deployment from model serving.
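
The same handle can be retrieved by its numeric id (the id is a placeholder):

```python
deployment = ms.get_deployment_by_id(1)
```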
get_deployments#
ModelServing.get_deployments(model=None, status=None)
Get all deployments from Model Serving.
Arguments
- model
Optional[hsml.model.Model]: Filter by model served in the deployments.
- status
Optional[str]: Filter by status of the deployments.
Returns
List[Deployment]: A list of deployments.
Raises
RestAPIError: If unable to retrieve deployments from model serving.
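
A minimal sketch of listing and filtering deployments (the "Running" status value is an assumption; check the states your deployments report):

```python
# List every deployment in the project
all_deployments = ms.get_deployments()

# Filter by served model and by status ("Running" is an assumed value)
running = ms.get_deployments(model=model, status="Running")
for d in running:
    print(d.name)
```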
get_inference_endpoints#
ModelServing.get_inference_endpoints()
Get all inference endpoints available in the current project.
Returns
List[InferenceEndpoint]: Inference endpoints for model inference.
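
For instance (a sketch; the exact attributes exposed by InferenceEndpoint depend on the deployment setup, so the example simply prints each object):

```python
endpoints = ms.get_inference_endpoints()
for endpoint in endpoints:
    print(endpoint)  # inspect connection details of each endpoint
```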