
How To Inspect A Deployment State#

Introduction#

In this guide, you will learn how to inspect the state of a deployment.

A state can be seen as a snapshot of the current inner workings of a deployment. The following is the state transition diagram for deployments.

State transitions of deployments

A state is composed of a status and a condition. The status gives a high-level view of the state, while the condition carries more detailed, infrastructure-level information.
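To make the status/condition split concrete, here is a minimal sketch that models a state as plain Python data. The class names `Condition` and `DeploymentState` and the `summary` method are hypothetical and exist only for illustration; they are not the Hopsworks classes. The three condition fields (type, status, reason) follow the description of conditions later in this guide.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Condition:
    """Hypothetical stand-in for a deployment condition."""
    type: str    # purpose of the condition, e.g. "SCHEDULED" or "READY"
    status: str  # progress of the condition: "Unknown", "True" or "False"
    reason: str  # human-readable message


@dataclass(frozen=True)
class DeploymentState:
    """Hypothetical stand-in for a deployment state."""
    status: str                           # high-level view, e.g. "STARTING"
    condition: Optional[Condition] = None  # detailed view, may be absent

    def summary(self) -> str:
        # Combine the high-level status with the detailed condition, if any
        if self.condition is None:
            return self.status
        c = self.condition
        return f"{self.status} ({c.type}={c.status}): {c.reason}"


example = DeploymentState(
    status="STARTING",
    condition=Condition("INITIALIZED", "Unknown", "Pulling docker images"),
)
print(example.summary())
```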

Web UI#

Step 1: Inspect deployment status#

If you have already created at least one deployment, open the deployments page by clicking the Deployments tab in the navigation menu on the left.

Deployments navigation tab

On the deployments page, find the deployment you want to inspect. Next to the action buttons, an indicator shows the current status of the deployment. The indicator changes color based on the status.

To inspect the condition of the deployment, click on the name of the deployment to open the deployment overview page.

Step 2: Inspect condition#

On the deployment overview page, the same status indicator appears at the top of the page. Below it, a one-line message gives a more detailed description of the deployment status. This message is built from the current condition of the deployment.

Deployment status condition

Step 3: Check number of running instances#

Additionally, you can find the number of instances currently running by scrolling down to the resource allocation section.

Resource allocation for a deployment

Scale-to-zero capabilities

If scale-to-zero is enabled, the number of instances of a running deployment can drop to zero, at which point the status changes to IDLE. To enable scale-to-zero for a deployment, see the Resources Guide.

Code#

Step 1: Connect to Hopsworks#

import hopsworks

# log in and get a handle to the current project
project = hopsworks.login()

# get Hopsworks Model Serving handle
ms = project.get_model_serving()

Step 2: Retrieve an existing deployment#

deployment = ms.get_deployment("mydeployment")

Step 3: Inspect deployment state#

state = deployment.get_state()

state.describe()
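If you want the state as data rather than printed output, a small helper can flatten it into a dictionary. The sketch below is hedged: it assumes the object returned by `get_state()` exposes `status` and `condition` attributes, and that the condition exposes `type`, `status` and `reason`, matching the fields described in this guide. The helper name `state_to_dict` is hypothetical, and `getattr` with defaults is used so that a missing attribute yields `None` instead of an error.

```python
from types import SimpleNamespace


def state_to_dict(state) -> dict:
    """Flatten a state-like object into a plain dict, e.g. for logging.

    Attribute names are assumptions; missing ones are reported as None.
    """
    condition = getattr(state, "condition", None)
    return {
        "status": getattr(state, "status", None),
        "condition_type": getattr(condition, "type", None),
        "condition_status": getattr(condition, "status", None),
        "condition_reason": getattr(condition, "reason", None),
    }


# Stand-in object so the sketch runs without a Hopsworks cluster
fake_state = SimpleNamespace(
    status="RUNNING",
    condition=SimpleNamespace(type="READY", status="True", reason="Deployment is ready"),
)
print(state_to_dict(fake_state))
```

Against a real deployment, the call would be `state_to_dict(deployment.get_state())`, provided the attribute names above hold for your client version.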

Step 4: Check number of running instances#

# number of predictor instances
deployment.resources.describe()

# number of transformer instances (only if the deployment has a transformer)
deployment.transformer.resources.describe()
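A deployment without a transformer may not have a transformer object to query, so a defensive wrapper can be convenient. The following sketch assumes that `deployment.transformer` is `None` in that case and that resource objects expose a `num_instances` attribute; both are assumptions to verify against the API reference. The helper name `count_instances` is hypothetical.

```python
from types import SimpleNamespace


def count_instances(deployment) -> dict:
    """Return predictor and transformer instance counts, or None if unavailable.

    Assumes resources objects expose `num_instances` (an assumption).
    """
    counts = {"predictor": None, "transformer": None}

    resources = getattr(deployment, "resources", None)
    counts["predictor"] = getattr(resources, "num_instances", None)

    # Only look at transformer resources when a transformer is present
    transformer = getattr(deployment, "transformer", None)
    if transformer is not None:
        t_resources = getattr(transformer, "resources", None)
        counts["transformer"] = getattr(t_resources, "num_instances", None)

    return counts


# Stand-in deployment without a transformer, for illustration only
fake_deployment = SimpleNamespace(
    resources=SimpleNamespace(num_instances=2),
    transformer=None,
)
print(count_instances(fake_deployment))
```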

API Reference#

Deployment

PredictorState

Deployment status#

The status of a deployment is a high-level description of its current state.

Deployment statuses
| Status | Description |
| --- | --- |
| CREATING | Deployment artifacts are being prepared |
| CREATED | Deployment has never been started |
| STARTING | Deployment is starting |
| RUNNING | Deployment is ready and running. Predictions are served without additional latencies. |
| IDLE | Deployment is ready but scaled to zero or has no active replicas. Higher latencies (cold-start) are expected on the first inference request. |
| FAILED | Terminal state. The deployment has encountered an unrecoverable error. More details can be found in the status condition. |
| UPDATING | Deployment is applying updates to the running instances |
| STOPPING | Deployment is stopping |
| STOPPED | Deployment has been stopped |
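The statuses above lend themselves to a simple polling loop that waits until a deployment leaves an in-progress status. The sketch below is an illustration, not part of the Hopsworks client: the `Status` enum, the `TRANSITIONAL` set and `wait_until_settled` are hypothetical names, and treating CREATING, STARTING, UPDATING and STOPPING as the in-progress statuses is an interpretation of the table. The status source is passed in as a callable so that the loop stays independent of any particular client.

```python
import time
from enum import Enum
from typing import Callable


class Status(str, Enum):
    CREATING = "CREATING"
    CREATED = "CREATED"
    STARTING = "STARTING"
    RUNNING = "RUNNING"
    IDLE = "IDLE"
    FAILED = "FAILED"
    UPDATING = "UPDATING"
    STOPPING = "STOPPING"
    STOPPED = "STOPPED"


# Assumption: these statuses describe work in progress rather than a resting point
TRANSITIONAL = {Status.CREATING, Status.STARTING, Status.UPDATING, Status.STOPPING}


def wait_until_settled(
    get_status: Callable[[], str],
    timeout_s: float = 300.0,
    poll_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Status:
    """Poll get_status() until it returns a non-transitional status."""
    deadline = time.monotonic() + timeout_s
    while True:
        status = Status(get_status())  # raises ValueError on unknown names
        if status not in TRANSITIONAL:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"still {status.value} after {timeout_s} seconds")
        sleep(poll_s)


# Simulated status sequence so the sketch runs without a cluster
_sequence = iter(["STARTING", "STARTING", "RUNNING"])
final = wait_until_settled(lambda: next(_sequence), sleep=lambda _: None)
print(final.value)
```

With a real deployment, `get_status` could be a small function that reads the status from `deployment.get_state()`, assuming the state object exposes it as a string with one of the names in the table.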

How States Are Determined#

Deployment state is determined from multiple sources: the database state (whether the deployment has been deployed and its revision), KServe InferenceService conditions, pod presence (available replicas for predictor and transformer), and the artifact filesystem (whether the deployment artifact files are ready).

A revision ID and deployment version are used to distinguish between STARTING (first generation) and UPDATING (subsequent changes to a running deployment).

Deployment conditions#

A condition contains more specific information about the status of the deployment. Conditions are mainly useful for tracking the progress of deployments that are starting or stopping.

Each condition contains three pieces of information: type, status and reason. The type describes the purpose of the condition, the status represents its progress (Unknown, True or False), and the reason provides a more descriptive message about the status.

Deployment conditions
| Type | Status | Description |
| --- | --- | --- |
| STOPPED | Unknown | Deployment is stopping. |
| STOPPED | True | Deployment is stopped. Therefore, no instances are running and no resources are allocated. |
| SCHEDULED | Unknown | Deployment is being scheduled. |
| SCHEDULED | False | Deployment failed to be scheduled. This is commonly due to insufficient resources to satisfy the deployment requirements. |
| SCHEDULED | True | Deployment has been scheduled successfully. At this point, resources have already been allocated for the deployment. |
| INITIALIZED | Unknown | Deployment is initializing. This step involves initialization tasks such as pulling docker images or mounting data volumes. |
| INITIALIZED | False | Deployment failed to initialize. |
| INITIALIZED | True | Deployment has been initialized successfully. At this point, the docker images have been pulled and data volumes mounted. |
| STARTED | Unknown | Deployment is starting. In this step, the model server is started and predictor / transformer scripts (if provided) are executed. |
| STARTED | False | Deployment failed to start. This can be due to errors in the predictor / transformer script, missing dependencies or model server incompatibilities. |
| STARTED | True | Deployment has been started successfully. At this point, the model server has been started and the predictor / transformer scripts (if provided) executed. |
| READY | Unknown | Connectivity is being set up. |
| READY | False | Connectivity failed to be set up, mainly due to networking issues. |
| READY | True | Connectivity has been set up and the deployment is ready. |
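The condition table can be turned into a small progress reporter for starting deployments. The sketch below assumes that startup conditions are reached in the order they appear in the table (SCHEDULED, INITIALIZED, STARTED, READY) and that the condition status is one of the strings Unknown, True or False. The names `STARTUP_ORDER` and `startup_progress` are hypothetical.

```python
# Assumed order of startup conditions, taken from the order of the table rows
STARTUP_ORDER = ["SCHEDULED", "INITIALIZED", "STARTED", "READY"]

# Meaning of each condition status while a deployment is starting
_OUTCOME = {"Unknown": "in progress", "True": "done", "False": "failed"}


def startup_progress(cond_type: str, cond_status: str) -> str:
    """Describe how far a starting deployment has progressed."""
    if cond_type not in STARTUP_ORDER:
        raise ValueError(f"not a startup condition type: {cond_type!r}")
    if cond_status not in _OUTCOME:
        raise ValueError(f"unknown condition status: {cond_status!r}")
    step = STARTUP_ORDER.index(cond_type) + 1
    return f"step {step}/{len(STARTUP_ORDER)}: {cond_type} {_OUTCOME[cond_status]}"


print(startup_progress("INITIALIZED", "Unknown"))
print(startup_progress("READY", "True"))
```

A failed step, such as `startup_progress("SCHEDULED", "False")`, points at the stage to investigate; for scheduling, the table suggests checking whether enough resources are available.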

The following are two diagrams with the state transitions of conditions in starting and stopping deployments, respectively.

Condition transitions in starting deployments

Condition transitions in stopping deployments