hsfs.core.data_source
DataSource
Metadata object used to provide data source information.
You can obtain data sources using `FeatureStore.get_data_source`.
The DataSource class encapsulates the details of a data source that can be used for reading or writing data. It supports various types of sources, such as SQL queries, database tables, file paths, and storage connectors.
| ATTRIBUTE | DESCRIPTION |
|---|---|
| `_query` | SQL query string for the data source, if applicable. |
| `_database` | Name of the database containing the data source. |
| `_group` | Group or schema name for the data source. |
| `_table` | Table name for the data source. |
| `_path` | File system path for the data source. |
| `_storage_connector` | Storage connector object that holds the configuration for accessing the data source. |
| `_metrics` | List of metric column names for the data source. |
| `_dimensions` | List of dimension column names for the data source. |
| `_rest_endpoint` | REST endpoint configuration for the data source. |
query property writable
query: str | None
Get or set the SQL query string for the data source.
| RETURNS | DESCRIPTION |
|---|---|
| `str \| None` | The SQL query string. |
database property writable
database: str | None
Get or set the database name for the data source.
| RETURNS | DESCRIPTION |
|---|---|
| `str \| None` | The database name. |
group property writable
group: str | None
Get or set the group/schema name for the data source.
| RETURNS | DESCRIPTION |
|---|---|
| `str \| None` | The group or schema name. |
table property writable
table: str | None
Get or set the table name for the data source.
| RETURNS | DESCRIPTION |
|---|---|
| `str \| None` | The table name. |
path property writable
path: str | None
Get or set the file system path for the data source.
| RETURNS | DESCRIPTION |
|---|---|
| `str \| None` | The file system path. |
storage_connector property writable
storage_connector: sc.StorageConnector | None
Get or set the storage connector for the data source.
| RETURNS | DESCRIPTION |
|---|---|
| `sc.StorageConnector \| None` | The storage connector object. |
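The writable properties above follow the usual Python getter/setter pattern. The sketch below illustrates that pattern with a stand-in class; `DataSourceSketch` is a hypothetical name and is not part of hsfs. A real `DataSource` comes from `FeatureStore.get_data_source` and may validate or serialize values differently.

```python
from __future__ import annotations


class DataSourceSketch:
    """Hypothetical stand-in, NOT the hsfs class: mirrors three documented properties."""

    def __init__(
        self,
        query: str | None = None,
        database: str | None = None,
        table: str | None = None,
    ) -> None:
        self._query = query
        self._database = database
        self._table = table

    @property
    def query(self) -> str | None:
        return self._query

    @query.setter
    def query(self, value: str | None) -> None:
        self._query = value

    @property
    def database(self) -> str | None:
        return self._database

    @database.setter
    def database(self, value: str | None) -> None:
        self._database = value

    @property
    def table(self) -> str | None:
        return self._table

    @table.setter
    def table(self, value: str | None) -> None:
        self._table = value


# Read and assign the properties the same way the reference describes.
source = DataSourceSketch(database="sales")
source.table = "orders"
source.query = f"SELECT * FROM {source.database}.{source.table}"
```

Each property reads from and writes to the underscore-prefixed attribute of the same name, which matches the attribute table at the top of this page.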
get_tables
get_tables(database: str | None = None) -> list[DataSource]
Retrieve the list of tables from the specified database.
Example

    # connect to the Feature Store
    fs = ...
    data_source = fs.get_data_source("test_data_source")
    tables = data_source.get_tables()

| PARAMETER | DESCRIPTION |
|---|---|
| `database` | The name of the database to list tables from. If not provided, the default database is used. |
| RETURNS | DESCRIPTION |
|---|---|
| `list[DataSource]` | A list of DataSource objects representing the tables. |
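Because `get_tables` returns a list, a common next step is picking one entry by name. The helper below is a minimal sketch of that lookup. `find_table` is a hypothetical helper, not an hsfs function, and it relies only on the documented `table` property. The `SimpleNamespace` objects are stand-ins so the sketch runs without a Feature Store connection.

```python
from __future__ import annotations

from types import SimpleNamespace
from typing import Any, Iterable


def find_table(tables: Iterable[Any], name: str) -> Any | None:
    """Return the first entry whose `table` attribute equals `name`, else None."""
    for candidate in tables:
        if getattr(candidate, "table", None) == name:
            return candidate
    return None


# Stand-ins for the objects that get_tables() would return.
fake_tables = [SimpleNamespace(table="orders"), SimpleNamespace(table="customers")]

match = find_table(fake_tables, "customers")
missing = find_table(fake_tables, "returns")
```

With a live connection, the same helper could be called as `find_table(data_source.get_tables(), "customers")`.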
get_data
get_data() -> dsd.DataSourceData
Retrieve the data from the data source.
Example

    # connect to the Feature Store
    fs = ...
    table = fs.get_data_source("test_data_source").get_tables()[0]
    data = table.get_data()

| RETURNS | DESCRIPTION |
|---|---|
| `dsd.DataSourceData` | An object containing the data retrieved from the data source. |
get_metadata
get_metadata() -> dict
Retrieve metadata information about the data source.
Example

    # connect to the Feature Store
    fs = ...
    table = fs.get_data_source("test_data_source").get_tables()[0]
    metadata = table.get_metadata()

| RETURNS | DESCRIPTION |
|---|---|
| `dict` | A dictionary containing metadata about the data source. |
get_feature_groups_provenance
get_feature_groups_provenance() -> Links | None
Get the feature groups generated using this data source, based on explicit provenance.
These feature groups can be accessible or inaccessible. Explicit provenance does not track deleted generated feature group links, so `deleted` will always be empty. For inaccessible feature groups, only minimal information is returned.
| RETURNS | DESCRIPTION |
|---|---|
| `Links \| None` | The feature groups generated using this data source, or `None`. |
| RAISES | DESCRIPTION |
|---|---|
| `hopsworks.client.exceptions.RestAPIError` | In case the backend encounters an issue. |
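Since this method may return `None`, callers usually need a guard before inspecting the result. The sketch below counts the linked items per category. It assumes the `Links` object exposes list-like attributes named `accessible`, `inaccessible`, and `deleted`; only `deleted` is named explicitly in the text above, and the other two names are inferred from the description. `summarize_links` is a hypothetical helper, and the `SimpleNamespace` object is a stand-in for a real result.

```python
from __future__ import annotations

from types import SimpleNamespace
from typing import Any

# Assumed attribute names on the Links object (see the note above).
CATEGORIES = ("accessible", "inaccessible", "deleted")


def summarize_links(links: Any | None) -> dict[str, int]:
    """Count linked items per category; a None result or missing attribute counts as zero."""
    if links is None:
        return {category: 0 for category in CATEGORIES}
    return {
        category: len(getattr(links, category, None) or [])
        for category in CATEGORIES
    }


# Stand-in for the value returned by get_feature_groups_provenance().
fake_links = SimpleNamespace(
    accessible=["fg_a", "fg_b"],
    inaccessible=["fg_c"],
    deleted=[],
)

summary = summarize_links(fake_links)
empty_summary = summarize_links(None)
```

Using `getattr` with a default keeps the helper from failing if an attribute is absent, at the cost of silently reporting zero for a misspelled category.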
get_feature_groups
get_feature_groups() -> list[fg.FeatureGroup]
Get the feature groups using this data source, based on explicit provenance.
Only the accessible feature groups are returned. To also retrieve inaccessible ones, use the base method, `DataSource.get_feature_groups_provenance`.
| RETURNS | DESCRIPTION |
|---|---|
| `list[fg.FeatureGroup]` | List of feature groups. |
get_training_datasets_provenance
get_training_datasets_provenance() -> Links
Get the training datasets generated using this data source, based on explicit provenance.
These training datasets can be accessible or inaccessible. Explicit provenance does not track deleted generated training dataset links, so `deleted` will always be empty. For inaccessible training datasets, only minimal information is returned.
| RETURNS | DESCRIPTION |
|---|---|
| `Links` | The training datasets generated using this data source or |
| RAISES | DESCRIPTION |
|---|---|
| `hopsworks.client.exceptions.RestAPIError` | In case the backend encounters an issue. |
get_training_datasets
get_training_datasets() -> list[TrainingDataset]
Get the training datasets using this data source, based on explicit provenance.
Only the accessible training datasets are returned. To also retrieve inaccessible ones, use the base method, `DataSource.get_training_datasets_provenance`.
| RETURNS | DESCRIPTION |
|---|---|
| `list[TrainingDataset]` | List of training datasets. |
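The lists returned by `get_feature_groups` and `get_training_datasets` can be turned into readable labels for logging or review. The sketch below assumes each returned object has `name` and `version` attributes; that is an assumption, not something this page states. `label_artifacts` is a hypothetical helper, and the `SimpleNamespace` objects stand in for real results.

```python
from __future__ import annotations

from types import SimpleNamespace
from typing import Any, Iterable


def label_artifacts(items: Iterable[Any]) -> list[str]:
    """Build sorted labels such as 'name_v2'; the version suffix is omitted when absent."""
    labels = []
    for item in items:
        name = getattr(item, "name", "<unnamed>")
        version = getattr(item, "version", None)
        labels.append(name if version is None else f"{name}_v{version}")
    return sorted(labels)


# Stand-ins for the objects returned by get_training_datasets().
fake_datasets = [
    SimpleNamespace(name="sales_td", version=2),
    SimpleNamespace(name="churn_td", version=1),
    SimpleNamespace(name="adhoc"),
]

labels = label_artifacts(fake_datasets)
```

With a live connection, the call would look like `label_artifacts(data_source.get_training_datasets())`.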