
hsfs.core.data_source #

[source] DataSource #

Metadata object used to provide data source information.

You can obtain data sources using hsfs.feature_store.FeatureStore.get_data_source.

The DataSource class encapsulates the details of a data source that can be used for reading or writing data. It supports various types of sources, such as SQL queries, database tables, file paths, and storage connectors.

ATTRIBUTE DESCRIPTION
_query

SQL query string for the data source, if applicable.

TYPE: Optional[str]

_database

Name of the database containing the data source.

TYPE: Optional[str]

_group

Group or schema name for the data source.

TYPE: Optional[str]

_table

Table name for the data source.

TYPE: Optional[str]

_path

File system path for the data source.

TYPE: Optional[str]

_storage_connector

Storage connector object that holds the configuration for accessing the data source.

TYPE: Optional[StorageConnector]

_metrics

List of metric column names for the data source.

TYPE: List[str]

_dimensions

List of dimension column names for the data source.

TYPE: List[str]

_rest_endpoint

REST endpoint configuration for the data source.

TYPE: Optional[RestEndpointConfig]

[source] query property writable #

query: str | None

Get or set the SQL query string for the data source.

RETURNS DESCRIPTION
str | None

The SQL query string.

[source] database property writable #

database: str | None

Get or set the database name for the data source.

RETURNS DESCRIPTION
str | None

The database name.

[source] group property writable #

group: str | None

Get or set the group/schema name for the data source.

RETURNS DESCRIPTION
str | None

The group or schema name.

[source] table property writable #

table: str | None

Get or set the table name for the data source.

RETURNS DESCRIPTION
str | None

The table name.

[source] path property writable #

path: str | None

Get or set the file system path for the data source.

RETURNS DESCRIPTION
str | None

The file system path.

[source] storage_connector property writable #

storage_connector: sc.StorageConnector | None

Get or set the storage connector for the data source.

RETURNS DESCRIPTION
sc.StorageConnector | None

The storage connector object.
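As a minimal sketch of how the writable properties above might be inspected together, the helper below collects whichever of them are set. The helper name describe_source and the SimpleNamespace stand-in are hypothetical and not part of hsfs; a real DataSource obtained from the feature store would be passed in place of the stand-in.

```python
from types import SimpleNamespace

# The six writable properties documented above.
LOCATION_FIELDS = ("query", "database", "group", "table", "path", "storage_connector")


def describe_source(data_source):
    """Return a dict of the location properties that are not None.

    Hypothetical helper, not part of hsfs: it only reads the
    documented property names via getattr.
    """
    described = {}
    for name in LOCATION_FIELDS:
        value = getattr(data_source, name, None)
        if value is not None:
            described[name] = value
    return described


# Hypothetical stand-in object, NOT the real hsfs DataSource.
stand_in = SimpleNamespace(
    query=None,
    database="sales",
    group="public",
    table="orders",
    path=None,
    storage_connector=None,
)

# The properties are writable, so a value can be reassigned in place.
stand_in.table = "orders_2024"

summary = describe_source(stand_in)
```

With a real data source, the same helper would report only the fields that apply to that source type, for example query for a SQL source or path for a file-based one.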

[source] get_databases #

get_databases() -> list[str]

Retrieve the list of available databases.

Example
# connect to the Feature Store
fs = ...

data_source = fs.get_data_source("test_data_source")

databases = data_source.get_databases()
RETURNS DESCRIPTION
list[str]

A list of database names available in the data source.

[source] get_tables #

get_tables(database: str | None = None) -> list[DataSource]

Retrieve the list of tables from the specified database.

Example
# connect to the Feature Store
fs = ...

data_source = fs.get_data_source("test_data_source")

tables = data_source.get_tables()
PARAMETER DESCRIPTION
database

The name of the database to list tables from. If not provided, the default database is used.

TYPE: str | None DEFAULT: None

RETURNS DESCRIPTION
list[DataSource]

A list of DataSource objects representing the tables.
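The two listing methods can be combined to walk every database and its tables. The sketch below assumes only the documented signatures of get_databases and get_tables and the table property; FakeDataSource and list_all_tables are hypothetical names used so the example runs without a Hopsworks cluster.

```python
class FakeDataSource:
    """Hypothetical stand-in that mimics the documented listing methods.

    It is NOT the real hsfs DataSource; it only serves to make the
    sketch runnable offline.
    """

    def __init__(self, catalog, table=None):
        self._catalog = catalog  # maps database name -> list of table names
        self.table = table

    def get_databases(self):
        return list(self._catalog)

    def get_tables(self, database=None):
        if database is None:
            # The stand-in treats the first database as the default.
            database = next(iter(self._catalog))
        return [FakeDataSource(self._catalog, table=name) for name in self._catalog[database]]


def list_all_tables(data_source):
    """Map every database name to the names of the tables it contains."""
    return {
        database: [table.table for table in data_source.get_tables(database)]
        for database in data_source.get_databases()
    }


catalog = {"sales": ["orders", "customers"], "ops": ["events"]}
inventory = list_all_tables(FakeDataSource(catalog))
```

Against a real connection, list_all_tables would issue one get_tables call per database, so it may be slow on sources with many databases.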

[source] get_data #

get_data() -> dsd.DataSourceData

Retrieve the data from the data source.

Example
# connect to the Feature Store
fs = ...

table = fs.get_data_source("test_data_source").get_tables()[0]

data = table.get_data()
RETURNS DESCRIPTION
dsd.DataSourceData

An object containing the data retrieved from the data source.

[source] get_metadata #

get_metadata() -> dict

Retrieve metadata information about the data source.

Example
# connect to the Feature Store
fs = ...

table = fs.get_data_source("test_data_source").get_tables()[0]

metadata = table.get_metadata()
RETURNS DESCRIPTION
dict

A dictionary containing metadata about the data source.

[source] get_feature_groups_provenance #

get_feature_groups_provenance() -> Links | None

Get the feature groups generated using this data source, based on explicit provenance.

These feature groups can be accessible or inaccessible. Explicit provenance does not track links to deleted feature groups, so deleted will always be empty. For inaccessible feature groups, only minimal information is returned.

RETURNS DESCRIPTION
Links | None

The feature groups generated using this data source or None if none were created.

RAISES DESCRIPTION
hopsworks.client.exceptions.RestAPIError

In case the backend encounters an issue.
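Because this method may return None, callers usually need a guard before reading the result. The sketch below shows one way to do that. It assumes the returned Links object exposes an attribute named accessible, which is inferred from the description above and not confirmed here; accessible_or_empty and the SimpleNamespace stand-in are hypothetical.

```python
from types import SimpleNamespace


def accessible_or_empty(links):
    """Return the accessible items of a provenance result, or [] for None.

    Assumption: the Links object has an `accessible` attribute.
    If it is missing, an empty list is returned as well.
    """
    if links is None:
        return []
    return list(getattr(links, "accessible", []))


# Hypothetical stand-in for a Links result, NOT the real hsfs class.
stand_in_links = SimpleNamespace(accessible=["fg_orders"], inaccessible=[], deleted=[])

no_links = accessible_or_empty(None)
some_links = accessible_or_empty(stand_in_links)
```

In real use, the argument would be the value returned by data_source.get_feature_groups_provenance().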

[source] get_feature_groups #

get_feature_groups() -> list[fg.FeatureGroup]

Get the feature groups using this data source, based on explicit provenance.

Only the accessible feature groups are returned. To also retrieve inaccessible items, use the base method, DataSource.get_feature_groups_provenance.

RETURNS DESCRIPTION
list[fg.FeatureGroup]

List of feature groups.

[source] get_training_datasets_provenance #

get_training_datasets_provenance() -> Links

Get the training datasets generated using this data source, based on explicit provenance.

These training datasets can be accessible or inaccessible. Explicit provenance does not track links to deleted training datasets, so deleted will always be empty. For inaccessible training datasets, only minimal information is returned.

RETURNS DESCRIPTION
Links

The training datasets generated using this data source or None if none were created.

RAISES DESCRIPTION
hopsworks.client.exceptions.RestAPIError

In case the backend encounters an issue.

[source] get_training_datasets #

get_training_datasets() -> list[TrainingDataset]

Get the training datasets using this data source, based on explicit provenance.

Only the accessible training datasets are returned. To also retrieve inaccessible items, use the base method, get_training_datasets_provenance.

RETURNS DESCRIPTION
list[TrainingDataset]

List of training datasets.
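The two list-returning lineage methods can be combined into a quick usage summary, for instance before changing or retiring a data source. This is a sketch only: lineage_counts and FakeLineageSource are hypothetical names, and the stand-in returns plain strings where the real methods return feature group and training dataset objects.

```python
class FakeLineageSource:
    """Hypothetical stand-in exposing the two documented list methods.

    NOT the real hsfs DataSource; it returns fixed names so the
    sketch runs without a backend.
    """

    def get_feature_groups(self):
        return ["fg_orders", "fg_customers"]

    def get_training_datasets(self):
        return ["td_churn"]


def lineage_counts(data_source):
    """Count the accessible downstream artifacts of a data source."""
    return {
        "feature_groups": len(data_source.get_feature_groups()),
        "training_datasets": len(data_source.get_training_datasets()),
    }


counts = lineage_counts(FakeLineageSource())
```

Since both methods return only accessible items, the counts may understate the true number of dependents; the provenance variants cover inaccessible ones as well.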