Inference batcher#

Creation#

InferenceBatcher#

hsml.inference_batcher.InferenceBatcher(
    enabled=None, max_batch_size=None, max_latency=None, timeout=None, **kwargs
)

Configuration of an inference batcher for a predictor.

Arguments

  • enabled Optional[bool]: Whether the inference batcher is enabled or not. The default value is False.
  • max_batch_size Optional[int]: Maximum number of requests per batch.
  • max_latency Optional[int]: Maximum latency for request batching.
  • timeout Optional[int]: Maximum waiting time for request batching.

Returns

InferenceBatcher. Configuration of an inference batcher.
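
An inference batcher configuration can be passed when creating a deployment. The following is a minimal sketch, assuming a model object retrieved from the model registry and that its deploy() method accepts an inference_batcher argument; names and values are illustrative:

from hsml.inference_batcher import InferenceBatcher

# Configure request batching for the deployment's predictor.
my_batcher = InferenceBatcher(
    enabled=True,
    max_batch_size=32,  # batch at most 32 requests together
    max_latency=5000,   # illustrative latency bound
    timeout=60,         # illustrative waiting time bound
)

# Hypothetical deployment call; `model` stands for a model registry object.
deployment = model.deploy(
    name="mymodel",
    inference_batcher=my_batcher,
)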


Retrieval#

predictor.inference_batcher#

Inference batchers can be accessed from the predictor metadata objects.

predictor.inference_batcher

Predictors can be found in the deployment metadata objects (see Predictor Reference). To retrieve a deployment, see the Deployment Reference.
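
A short sketch of this retrieval path, assuming an existing deployment named "mymodel" and that the deployment exposes its predictor as a predictor attribute:

import hsml

# Connect to Hopsworks and get the model serving handle
# (connection details omitted).
connection = hsml.connection()
ms = connection.get_model_serving()

# Retrieve the deployment and access its predictor's inference batcher.
deployment = ms.get_deployment("mymodel")
batcher = deployment.predictor.inference_batcher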

Properties#

enabled#

Whether the inference batcher is enabled or not.


max_batch_size#

Maximum number of requests per batch.


max_latency#

Maximum latency for request batching.


timeout#

Maximum waiting time for request batching.
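
Reading the properties of a retrieved configuration, continuing the sketch above:

# Inspect the current batching configuration.
print(batcher.enabled)
print(batcher.max_batch_size)
print(batcher.max_latency)
print(batcher.timeout)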


Methods#

describe#

InferenceBatcher.describe()

Print a description of the inference batcher.


to_dict#

InferenceBatcher.to_dict()

Get the inference batcher configuration as a dictionary.
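
Both methods can be used to inspect a configuration, continuing the sketch above:

# Print a human-readable summary of the batcher configuration.
batcher.describe()

# Get the configuration as a plain dictionary, e.g. for logging.
config = batcher.to_dict()
print(config)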