EmbeddingIndex#
hsfs.embedding.EmbeddingIndex(index_name=None, features=None, col_prefix=None)
Represents an index for managing embedding features.
Arguments
- index_name (str | None): The name of the embedding index. If not provided, the name of the project index is used.
- features (List[hsfs.embedding.EmbeddingFeature] | None): A list of EmbeddingFeature objects for the features that contain embeddings to be indexed for similarity search.
- col_prefix (str | None): The prefix added to column names when the project index is used. It is managed by Hopsworks and should not be provided.
Example
from hsfs.embedding import EmbeddingIndex
embedding_index = EmbeddingIndex()
embedding_index.add_embedding(name="user_vector", dimension=256)
embeddings = embedding_index.get_embeddings()
Properties#
col_prefix#
str: The prefix to be added to column names.
feature_group#
FeatureGroup: The feature group object that contains the embedding feature.
index_name#
str: The name of the embedding index.
Methods#
add_embedding#
EmbeddingIndex.add_embedding(name, dimension, similarity_function_type="l2_norm", model=None)
Adds a new embedding feature to the index.
Example:
embedding_index = EmbeddingIndex()
embedding_index.add_embedding(name="user_vector", dimension=256)
# Attach an hsml model to the embedding feature (hsml_model is an existing hsml.model.Model)
embedding_index = EmbeddingIndex()
embedding_index.add_embedding(name="user_vector", dimension=256, model=hsml_model)
Arguments
- name (str): The name of the embedding feature.
- dimension (int): The dimensionality of the embedding feature.
- similarity_function_type (hsfs.embedding.SimilarityFunctionType | None): The type of similarity function to be used. Defaults to "l2_norm".
- model (hsml.model.Model, optional): The hsml model used to generate the embedding. Defaults to None.
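To make the effect of similarity_function_type concrete, here is a minimal pure-Python sketch of three common vector comparison functions. The default "l2_norm" presumably corresponds to Euclidean distance. Cosine similarity and dot product are shown only for contrast; this page does not say which other values the index accepts, so treat them as assumptions. The helper names below are illustrative and are not part of hsfs.

```python
import math


def l2_distance(a, b):
    """Euclidean (L2) distance: a smaller value means the vectors are closer."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dot_product(a, b):
    """Dot product: a larger value means more similar, and magnitude matters."""
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Cosine of the angle between a and b: ignores vector magnitude."""
    norm_a = math.sqrt(dot_product(a, a))
    norm_b = math.sqrt(dot_product(b, b))
    return dot_product(a, b) / (norm_a * norm_b)


query = [1.0, 0.0]
candidates = {"same_direction": [2.0, 0.0], "orthogonal": [0.0, 1.0]}

for label, vector in candidates.items():
    print(
        label,
        l2_distance(query, vector),
        dot_product(query, vector),
        cosine_similarity(query, vector),
    )
```

Note that the "same_direction" candidate is at a nonzero L2 distance from the query even though its cosine similarity is maximal, so the choice of function can change which neighbours rank first.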
count#
EmbeddingIndex.count(options=None)
Count the number of records in the feature group.
Arguments
- options (map | None): The options used for the request to the vector database. The keys are attribute values of the hsfs.core.opensearch.OpensearchRequestOption class.
Returns
int: The number of records in the feature group.
Raises
- ValueError: If the feature group is not initialized.
- FeaturestoreException: If an error occurs during the count operation.
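Since count raises ValueError when the feature group is not initialized, a caller may want to treat that case as "no count available yet". The following is a minimal sketch of one way to do that. The helper count_or_none and the two stand-in classes are hypothetical and exist only so the example runs without a Hopsworks connection; with a real EmbeddingIndex only the helper would be needed. FeaturestoreException is deliberately left to propagate here.

```python
def count_or_none(embedding_index, options=None):
    """Return the record count, or None if the feature group is not initialized."""
    try:
        return embedding_index.count(options=options)
    except ValueError:
        # Documented case: the feature group is not initialized.
        return None


class ReadyIndex:
    """Hypothetical stand-in that mimics only the count() signature."""

    def count(self, options=None):
        return 3


class UninitializedIndex:
    """Hypothetical stand-in whose feature group is not initialized."""

    def count(self, options=None):
        raise ValueError("feature group is not initialized")


print(count_or_none(ReadyIndex()))
print(count_or_none(UninitializedIndex()))
```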
get_embedding#
EmbeddingIndex.get_embedding(name)
Returns the hsfs.embedding.EmbeddingFeature object associated with the feature name.
Arguments
- name (str): The name of the embedding feature.
Returns
hsfs.embedding.EmbeddingFeature object
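One possible use of get_embedding is to check a vector's length against the declared dimension before writing it. The sketch below rests on two assumptions not confirmed on this page: that the returned object exposes a dimension attribute, and that an unknown name yields None. check_vector and FakeIndex are hypothetical names; FakeIndex is a stand-in that lets the example run without hsfs.

```python
from types import SimpleNamespace


def check_vector(embedding_index, name, vector):
    """Raise if `vector` does not match the declared dimension of feature `name`."""
    feature = embedding_index.get_embedding(name)
    if feature is None:  # assumption: unknown names yield None
        raise KeyError(f"no embedding feature named {name!r}")
    if len(vector) != feature.dimension:  # assumption: a `dimension` attribute exists
        raise ValueError(
            f"{name!r} expects {feature.dimension} values, got {len(vector)}"
        )
    return vector


class FakeIndex:
    """Hypothetical stand-in exposing only get_embedding(name)."""

    def __init__(self, features):
        self._features = features

    def get_embedding(self, name):
        return self._features.get(name)


fake_index = FakeIndex(
    {"user_vector": SimpleNamespace(name="user_vector", dimension=3)}
)

# A vector of the declared length is returned unchanged.
check_vector(fake_index, "user_vector", [0.1, 0.2, 0.3])
```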
get_embeddings#
EmbeddingIndex.get_embeddings()
Returns the list of hsfs.embedding.EmbeddingFeature objects associated with the index.
Returns
A list of hsfs.embedding.EmbeddingFeature objects
json#
EmbeddingIndex.json()
Serialize the EmbeddingIndex object to a JSON string.
to_dict#
EmbeddingIndex.to_dict()
Convert the EmbeddingIndex object to a dictionary.
Returns
dict: A dictionary representation of the EmbeddingIndex object.