hsfs.embedding #

EmbeddingFeature #

Represents an embedding feature.

PARAMETER DESCRIPTION
name

The name of the embedding feature.

TYPE: str | None DEFAULT: None

dimension

The dimensionality of the embedding feature.

TYPE: int | None DEFAULT: None

similarity_function_type

The type of similarity function used for the embedding feature. Available functions are L2, COSINE, and DOT_PRODUCT.

TYPE: SimilarityFunctionType DEFAULT: SimilarityFunctionType.L2

model

The hsml model used to generate the embedding.

TYPE: Model | None DEFAULT: None

feature_group

The feature group object that contains the embedding feature.

TYPE: FeatureGroup | None DEFAULT: None

embedding_index

The index for managing embedding features.

TYPE: EmbeddingIndex | None DEFAULT: None

name property #

name: str

The name of the embedding feature.

dimenstion property #

The dimensionality of the embedding feature.

This is a misspelled alias of dimension, kept only to avoid breaking the API. Use dimension instead.

dimension property #

dimension: int

The dimensionality of the embedding feature.
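
Since each embedding feature declares a fixed dimension, it can be useful to check vectors against it before they are written. The following is a minimal sketch under that assumption; `check_dimension` is a hypothetical helper and not part of hsfs.

```python
def check_dimension(vector, dimension):
    """Hypothetical helper: raise if the vector length differs from the declared dimension."""
    if len(vector) != dimension:
        raise ValueError(f"expected {dimension} values, got {len(vector)}")
    return vector


# A vector whose length matches the declared dimension passes through unchanged.
ok = check_dimension([0.1, 0.2, 0.3], 3)

# A vector with the wrong length is rejected.
try:
    check_dimension([0.1, 0.2], 3)
    mismatch_detected = False
except ValueError:
    mismatch_detected = True
```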

similarity_function_type property #

similarity_function_type: SimilarityFunctionType

The type of similarity function used for the embedding feature.

model property #

model: Model | None

The hsml model used to generate the embedding.

feature_group property writable #

feature_group: FeatureGroup | None

The feature group object that contains the embedding feature.

embedding_index property writable #

embedding_index: EmbeddingIndex | None

The index for managing embedding features.

EmbeddingIndex #

Represents an index for managing embedding features.

PARAMETER DESCRIPTION
index_name

The name of the embedding index. The name of the project index is used if not provided.

TYPE: str | None DEFAULT: None

features

A list of the features that contain embeddings that should be indexed for similarity search.

TYPE: list[EmbeddingFeature] | None DEFAULT: None

col_prefix

The prefix to be added to column names when using project index. It is managed by Hopsworks and should not be provided.

TYPE: str | None DEFAULT: None

Example
from hsfs.embedding import EmbeddingIndex

embedding_index = EmbeddingIndex()
embedding_index.add_embedding(name="user_vector", dimension=256)
embeddings = embedding_index.get_embeddings()

feature_group property writable #

feature_group: FeatureGroup | None

The feature group object that the embedding index belongs to.

index_name property #

index_name: str

The name of the embedding index.

col_prefix property #

col_prefix: str

The prefix to be added to column names.

add_embedding #

add_embedding(
    name: str,
    dimension: int,
    similarity_function_type: SimilarityFunctionType = SimilarityFunctionType.L2,
    model: Model | None = None,
)

Adds a new embedding feature to the index.

Example
from hsfs.embedding import EmbeddingIndex

embedding_index = EmbeddingIndex()
embedding_index.add_embedding(name="user_vector", dimension=256)

# Attach an hsml model to the embedding feature.
# hsml_model is assumed to be a Model object retrieved beforehand.
embedding_index = EmbeddingIndex()
embedding_index.add_embedding(name="user_vector", dimension=256, model=hsml_model)

PARAMETER DESCRIPTION
name

The name of the embedding feature.

TYPE: str

dimension

The dimensionality of the embedding feature.

TYPE: int

similarity_function_type

The type of similarity function to be used.

TYPE: SimilarityFunctionType DEFAULT: SimilarityFunctionType.L2

model

The hsml model used to generate the embedding.

TYPE: Model | None DEFAULT: None

get_embedding #

get_embedding(name: str) -> EmbeddingFeature

Returns the EmbeddingFeature associated with the given feature name.

PARAMETER DESCRIPTION
name

The name of the embedding feature.

TYPE: str

RETURNS DESCRIPTION
EmbeddingFeature

The EmbeddingFeature associated with the name.

get_embeddings #

get_embeddings() -> list[EmbeddingFeature]

Returns the list of EmbeddingFeature objects associated with the index.

RETURNS DESCRIPTION
list[EmbeddingFeature]

All embedding features in the index.
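
When several features need to be looked up repeatedly, one option is to index the returned list by name on the caller's side. The sketch below relies only on the `name` and `dimension` properties documented above. `index_by_name` is a hypothetical helper, and the `SimpleNamespace` objects are stand-ins for real `EmbeddingFeature` instances, so the snippet runs without a Hopsworks connection.

```python
from types import SimpleNamespace


def index_by_name(features):
    """Hypothetical helper: map each feature's name to the feature object."""
    return {feature.name: feature for feature in features}


# Stand-ins for the objects that get_embeddings() would return.
features = [
    SimpleNamespace(name="user_vector", dimension=256),
    SimpleNamespace(name="item_vector", dimension=128),
]

by_name = index_by_name(features)
dims = {name: feature.dimension for name, feature in by_name.items()}
```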

count #

count(options: dict | None = None) -> int

Count the number of records in the feature group.

PARAMETER DESCRIPTION
options

The options used for the request to the vector database. The keys are attribute values of OpensearchRequestOption.

TYPE: dict | None DEFAULT: None

RETURNS DESCRIPTION
int

The number of records in the feature group.

RAISES DESCRIPTION
ValueError

If the feature group is not initialized.

hopsworks.client.exceptions.FeatureStoreException

If an error occurs during the count operation.

SimilarityFunctionType #

Enumeration class representing different types of similarity functions.

L2 class-attribute instance-attribute #

L2: str = 'l2_norm'

Represents the L2 norm similarity function.

COSINE class-attribute instance-attribute #

COSINE: str = 'cosine'

Represents the cosine similarity function.

DOT_PRODUCT class-attribute instance-attribute #

DOT_PRODUCT: str = 'dot_product'

Represents the dot product similarity function.
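
To make the three options concrete, here is a sketch of the textbook definitions behind them, in plain Python. `SimilarityKind` is a hypothetical stand-in that only mirrors the string values listed above. The scores that the vector database actually returns may be scaled or transformed differently, so treat this as an illustration of the underlying math rather than of the backend's behavior.

```python
import math
from enum import Enum


class SimilarityKind(Enum):
    """Hypothetical stand-in mirroring the documented string values."""
    L2 = "l2_norm"
    COSINE = "cosine"
    DOT_PRODUCT = "dot_product"


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def score(kind, a, b):
    """Compare two equal-length vectors using the given measure."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    if kind is SimilarityKind.L2:
        # Euclidean distance: smaller means more similar.
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    if kind is SimilarityKind.COSINE:
        # Cosine of the angle between the vectors: larger means more similar.
        # Undefined for zero vectors (division by zero).
        return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))
    if kind is SimilarityKind.DOT_PRODUCT:
        # Unnormalized: sensitive to vector magnitude as well as direction.
        return dot(a, b)
    raise ValueError(f"unknown similarity kind: {kind}")


u = [1.0, 0.0]
v = [0.0, 1.0]
l2 = score(SimilarityKind.L2, u, v)            # distance between orthogonal unit vectors
cos = score(SimilarityKind.COSINE, u, v)       # orthogonal vectors
dp = score(SimilarityKind.DOT_PRODUCT, u, u)   # a unit vector with itself
```

Note that L2 is a distance while cosine and dot product are similarities, so "best match" means the smallest value for the former and the largest for the latter.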