public class StorageConnectorUtils extends Object
| Constructor and Description |
|---|
| StorageConnectorUtils() |
| Modifier and Type | Method and Description |
|---|---|
| `org.apache.spark.sql.Dataset<org.apache.spark.sql.Row>` | `read(StorageConnector.AdlsConnector connector, String dataFormat, Map<String,String> options, String path)` Reads a path into a Spark dataframe using the AdlsConnector. |
| `org.apache.spark.sql.Dataset<org.apache.spark.sql.Row>` | `read(StorageConnector.BigqueryConnector connector, String query, Map<String,String> options, String path)` Reads a query or a path into a Spark dataframe using the BigqueryConnector. |
| `org.apache.spark.sql.Dataset<org.apache.spark.sql.Row>` | `read(StorageConnector.GcsConnector connector, String dataFormat, Map<String,String> options, String path)` Reads a path into a Spark dataframe using the GcsConnector. |
| `org.apache.spark.sql.Dataset<org.apache.spark.sql.Row>` | `read(StorageConnector.HopsFsConnector connector, String dataFormat, Map<String,String> options, String path)` Reads a path into a Spark dataframe using the HopsFsConnector. |
| `org.apache.spark.sql.Dataset<org.apache.spark.sql.Row>` | `read(StorageConnector.JdbcConnector connector, String query)` Reads a query into a Spark dataframe using the JdbcConnector. |
| `org.apache.spark.sql.Dataset<org.apache.spark.sql.Row>` | `read(StorageConnector.RedshiftConnector connector, String query)` Reads a query into a Spark dataframe using the RedshiftConnector. |
| `org.apache.spark.sql.Dataset<org.apache.spark.sql.Row>` | `read(StorageConnector.S3Connector connector, String dataFormat, Map<String,String> options, String path)` Reads a path into a Spark dataframe using the S3Connector. |
| `org.apache.spark.sql.Dataset<org.apache.spark.sql.Row>` | `read(StorageConnector.SnowflakeConnector connector, String query)` Reads a query into a Spark dataframe using the SnowflakeConnector. |
| `org.apache.spark.sql.Dataset<org.apache.spark.sql.Row>` | `read(StorageConnector connector, String query, String dataFormat, Map<String,String> options, String path)` Reads a query or a path into a Spark dataframe using the storage connector. |
| `org.apache.spark.sql.Dataset<org.apache.spark.sql.Row>` | `readStream(StorageConnector.KafkaConnector connector, String topic, boolean topicPattern, String messageFormat, String schema, Map<String,String> options, boolean includeMetadata)` Reads a stream into a Spark dataframe using the Kafka storage connector. |
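As an illustrative sketch of a path-based read (the feature-store handle `fs`, the connector name, the `getS3Connector` accessor, and the bucket path below are assumptions for illustration, not part of this reference), reading from an S3 connector might look like:

```java
import java.util.HashMap;
import java.util.Map;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;

// Hypothetical sketch: `fs`, the connector name, and the path are assumptions.
StorageConnectorUtils utils = new StorageConnectorUtils();
StorageConnector.S3Connector s3 = fs.getS3Connector("my_bucket_connector");

// Extra reader options are passed through to Spark, e.g. a CSV header flag.
Map<String, String> options = new HashMap<>();
options.put("header", "true");

// Read CSV files under the given path in the connector's bucket
// into a Spark dataframe.
Dataset<Row> df = utils.read(s3, "csv", options, "s3://my-bucket/sales/2024/");
df.show();
```

The same shape applies to the HopsFsConnector, GcsConnector, and AdlsConnector overloads, which differ only in the connector type passed in.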
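For query-based connectors, the overloads take a SQL string instead of a format and path. A minimal sketch, assuming a feature-store handle `fs` and a hypothetical `getJdbcConnector` accessor and connector name:

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;

// Hypothetical sketch: `fs`, the connector name, and the query are assumptions.
StorageConnectorUtils utils = new StorageConnectorUtils();
StorageConnector.JdbcConnector jdbc = fs.getJdbcConnector("orders_db");

// Run a SQL query against the JDBC source and get back a Spark dataframe.
Dataset<Row> orders = utils.read(jdbc, "SELECT * FROM orders WHERE amount > 100");
```

The RedshiftConnector and SnowflakeConnector overloads follow the same `(connector, query)` pattern.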
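For streaming reads from Kafka, `readStream` takes the topic, a flag indicating whether the topic argument is a pattern, the message format, and the message schema. A sketch under assumed names (the `fs` handle, `getKafkaConnector` accessor, connector name, topic, and Avro schema are all illustrative assumptions):

```java
import java.util.HashMap;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;

// Hypothetical sketch: `fs`, the connector name, topic, and schema are assumptions.
StorageConnectorUtils utils = new StorageConnectorUtils();
StorageConnector.KafkaConnector kafka = fs.getKafkaConnector("events_cluster");

// Minimal Avro schema for the message value.
String avroSchema =
    "{\"type\":\"record\",\"name\":\"Event\","
        + "\"fields\":[{\"name\":\"id\",\"type\":\"string\"}]}";

// Subscribe to a single topic (topicPattern = false), parse Avro messages,
// and include Kafka metadata columns such as "key", "partition", and "offset".
Dataset<Row> stream = utils.readStream(
    kafka, "events", false, "avro", avroSchema, new HashMap<>(), true);
```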
public org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> read(StorageConnector.HopsFsConnector connector, String dataFormat, Map<String,String> options, String path) throws FeatureStoreException, IOException

Parameters:
- `connector` - HopsFsConnector object.
- `dataFormat` - Specify the file format to be read, e.g. `csv`, `parquet`.
- `options` - Any additional key/value options to be passed to the connector.
- `path` - Path to be read from within the storage connector.

Throws:
- `FeatureStoreException` - If unable to retrieve the StorageConnector from the feature store.
- `IOException` - Generic IO exception.

public org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> read(StorageConnector.S3Connector connector, String dataFormat, Map<String,String> options, String path) throws FeatureStoreException, IOException

Parameters:
- `connector` - S3Connector object.
- `dataFormat` - Specify the file format to be read, e.g. `csv`, `parquet`.
- `options` - Any additional key/value options to be passed to the connector.
- `path` - Path to be read from within the bucket.

Throws:
- `FeatureStoreException` - If unable to retrieve the StorageConnector from the feature store.
- `IOException` - Generic IO exception.

public org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> read(StorageConnector.RedshiftConnector connector, String query) throws FeatureStoreException, IOException

Parameters:
- `connector` - RedshiftConnector object.
- `query` - SQL query string.

Throws:
- `FeatureStoreException` - If unable to retrieve the StorageConnector from the feature store.
- `IOException` - Generic IO exception.

public org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> read(StorageConnector.AdlsConnector connector, String dataFormat, Map<String,String> options, String path) throws FeatureStoreException, IOException

Parameters:
- `connector` - AdlsConnector object.
- `dataFormat` - Specify the file format to be read, e.g. `csv`, `parquet`.
- `options` - Any additional key/value options to be passed to the connector.
- `path` - Path to be read from within the storage connector.

Throws:
- `FeatureStoreException` - If unable to retrieve the StorageConnector from the feature store.
- `IOException` - Generic IO exception.

public org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> read(StorageConnector.SnowflakeConnector connector, String query) throws FeatureStoreException, IOException

Parameters:
- `connector` - SnowflakeConnector object.
- `query` - SQL query string.

Throws:
- `FeatureStoreException` - If unable to retrieve the StorageConnector from the feature store.
- `IOException` - Generic IO exception.

public org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> read(StorageConnector.JdbcConnector connector, String query) throws FeatureStoreException, IOException

Parameters:
- `connector` - JdbcConnector object.
- `query` - SQL query string.

Throws:
- `FeatureStoreException` - If unable to retrieve the StorageConnector from the feature store.
- `IOException` - Generic IO exception.

public org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> read(StorageConnector.GcsConnector connector, String dataFormat, Map<String,String> options, String path) throws FeatureStoreException, IOException

Parameters:
- `connector` - GcsConnector object.
- `dataFormat` - Specify the file format to be read, e.g. `csv`, `parquet`.
- `options` - Any additional key/value options to be passed to the connector.
- `path` - Path to be read from within the storage connector.

Throws:
- `FeatureStoreException` - If unable to retrieve the StorageConnector from the feature store.
- `IOException` - Generic IO exception.

public org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> read(StorageConnector.BigqueryConnector connector, String query, Map<String,String> options, String path) throws FeatureStoreException, IOException

Parameters:
- `connector` - BigqueryConnector object.
- `query` - SQL query string.
- `options` - Any additional key/value options to be passed to the connector.
- `path` - Path to the table to be read from within the storage connector.

Throws:
- `FeatureStoreException` - If unable to retrieve the StorageConnector from the feature store.
- `IOException` - Generic IO exception.

public org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> read(StorageConnector connector, String query, String dataFormat, Map<String,String> options, String path) throws FeatureStoreException, IOException

Parameters:
- `connector` - Storage connector object.
- `query` - SQL query string.
- `dataFormat` - When reading from object stores such as S3, HopsFS, or ADLS, specify the file format to be read, e.g. `csv`, `parquet`.
- `path` - Path to be read from within the bucket of the storage connector. Not relevant for database-based connectors such as JDBC, Snowflake, or Redshift.

Throws:
- `FeatureStoreException` - If unable to retrieve the StorageConnector from the feature store.
- `IOException` - Generic IO exception.

public org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> readStream(StorageConnector.KafkaConnector connector, String topic, boolean topicPattern, String messageFormat, String schema, Map<String,String> options, boolean includeMetadata) throws FeatureStoreException, IOException

Parameters:
- `connector` - Storage connector object.
- `topic` - Name of the topic, or a topic pattern.
- `topicPattern` - If true, `topic` is treated as a pattern and all matching topics are subscribed.
- `messageFormat` - Format of the message: "avro" or "json".
- `schema` - Schema of the message.
- `options` - Any additional key/value options to be passed to the connector.
- `includeMetadata` - Whether to include metadata of the topic in the dataframe, such as "key", "topic", "partition", "offset", "timestamp", "timestampType", "value.*".

Throws:
- `FeatureStoreException` - If unable to retrieve the StorageConnector from the feature store.
- `IOException` - Generic IO exception.

Copyright © 2025. All rights reserved.