Inference batcher#
Creation#
InferenceBatcher#
hsml.inference_batcher.InferenceBatcher(
enabled=None, max_batch_size=None, max_latency=None, timeout=None, **kwargs
)
Configuration of an inference batcher for a predictor.
Arguments
- enabled (bool | None): Whether the inference batcher is enabled or not. The default value is false.
- max_batch_size (int | None): Maximum requests batch size.
- max_latency (int | None): Maximum latency for request batching.
- timeout (int | None): Maximum waiting time for request batching.
Returns
InferenceBatcher. Configuration of an inference batcher.
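For example, a batcher could be created as follows; this is a minimal sketch, and the specific values are illustrative rather than recommendations:
from hsml.inference_batcher import InferenceBatcher

# Batch up to 32 requests per call; latency and timeout bounds are illustrative.
my_batcher = InferenceBatcher(
    enabled=True,
    max_batch_size=32,
    max_latency=500,
    timeout=60,
)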
Retrieval#
predictor.inference_batcher#
Inference batchers can be accessed from the predictor metadata objects.
predictor.inference_batcher
Predictors can be found in the deployment metadata objects (see Predictor Reference). To retrieve a deployment, see the Deployment Reference.
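A minimal retrieval sketch, assuming an existing deployment named "mydeployment" and that the predictor is reachable from the deployment metadata object as described in the Predictor Reference:
import hsml

# Connect to the cluster and fetch an existing deployment.
connection = hsml.connection()
ms = connection.get_model_serving()
deployment = ms.get_deployment("mydeployment")

# The inference batcher is a property of the predictor metadata object.
my_batcher = deployment.predictor.inference_batcher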
Properties#
enabled#
Whether the inference batcher is enabled or not.
max_batch_size#
Maximum requests batch size.
max_latency#
Maximum latency for request batching.
timeout#
Maximum waiting time for request batching.
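The properties can be read directly from the metadata object, for example:
# Inspect the current batching configuration (values depend on the deployment).
print(my_batcher.enabled)
print(my_batcher.max_batch_size)
print(my_batcher.max_latency)
print(my_batcher.timeout)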
Methods#
describe#
InferenceBatcher.describe()
Print a description of the inference batcher.
to_dict#
InferenceBatcher.to_dict()
Return the inference batcher configuration as a dict.
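Both methods operate on an existing metadata object, for example:
# Print a human-readable summary of the batcher configuration.
my_batcher.describe()

# Get the configuration as a plain dict, e.g. for logging or inspection.
config = my_batcher.to_dict()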