public class StorageConnectorUtils extends Object
Constructors:
  StorageConnectorUtils()

Methods:
org.apache.spark.sql.Dataset&lt;org.apache.spark.sql.Row&gt; read(StorageConnector.AdlsConnector connector, String dataFormat, Map&lt;String,String&gt; options, String path)
  Reads a path into a Spark dataframe using the AdlsConnector.

org.apache.spark.sql.Dataset&lt;org.apache.spark.sql.Row&gt; read(StorageConnector.BigqueryConnector connector, String query, Map&lt;String,String&gt; options, String path)
  Reads a query or a path into a Spark dataframe using the BigqueryConnector.

org.apache.spark.sql.Dataset&lt;org.apache.spark.sql.Row&gt; read(StorageConnector.GcsConnector connector, String dataFormat, Map&lt;String,String&gt; options, String path)
  Reads a path into a Spark dataframe using the GcsConnector.

org.apache.spark.sql.Dataset&lt;org.apache.spark.sql.Row&gt; read(StorageConnector.HopsFsConnector connector, String dataFormat, Map&lt;String,String&gt; options, String path)
  Reads a path into a Spark dataframe using the HopsFsConnector.

org.apache.spark.sql.Dataset&lt;org.apache.spark.sql.Row&gt; read(StorageConnector.JdbcConnector connector, String query)
  Reads a query into a Spark dataframe using the JdbcConnector.

org.apache.spark.sql.Dataset&lt;org.apache.spark.sql.Row&gt; read(StorageConnector.RedshiftConnector connector, String query)
  Reads a query into a Spark dataframe using the RedshiftConnector.

org.apache.spark.sql.Dataset&lt;org.apache.spark.sql.Row&gt; read(StorageConnector.S3Connector connector, String dataFormat, Map&lt;String,String&gt; options, String path)
  Reads a path into a Spark dataframe using the S3Connector.

org.apache.spark.sql.Dataset&lt;org.apache.spark.sql.Row&gt; read(StorageConnector.SnowflakeConnector connector, String query)
  Reads a query into a Spark dataframe using the SnowflakeConnector.

org.apache.spark.sql.Dataset&lt;org.apache.spark.sql.Row&gt; read(StorageConnector connector, String query, String dataFormat, Map&lt;String,String&gt; options, String path)
  Reads a query or a path into a Spark dataframe using the storage connector.

org.apache.spark.sql.Dataset&lt;org.apache.spark.sql.Row&gt; readStream(StorageConnector.KafkaConnector connector, String topic, boolean topicPattern, String messageFormat, String schema, Map&lt;String,String&gt; options, boolean includeMetadata)
  Reads a stream into a Spark dataframe using the Kafka storage connector.
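A minimal usage sketch of the batch and streaming reads. This fragment is not runnable on its own: it assumes the Hopsworks `hsfs` library is on the classpath, that the connector objects (`s3`, `kafka`) have already been retrieved from the feature store (retrieval is outside this class), and that `avroSchemaString` holds the message schema; all of these names are illustrative.

```java
import java.util.HashMap;
import java.util.Map;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;

// StorageConnector.S3Connector s3 = ...;       // retrieved from the feature store (elided)
// StorageConnector.KafkaConnector kafka = ...; // retrieved from the feature store (elided)

StorageConnectorUtils utils = new StorageConnectorUtils();
Map<String, String> options = new HashMap<>();

// Batch: read parquet files from a path inside the connector's bucket.
Dataset<Row> df = utils.read(s3, "parquet", options, "path/inside/bucket");

// Streaming: subscribe to a single literal topic with Avro-encoded messages,
// excluding Kafka metadata columns from the resulting dataframe.
Dataset<Row> stream = utils.readStream(
    kafka,
    "my-topic",       // topic name
    false,            // topicPattern: treat `topic` as a literal name, not a pattern
    "avro",           // messageFormat
    avroSchemaString, // schema of the message (assumed available)
    options,
    false);           // includeMetadata
```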
public org.apache.spark.sql.Dataset&lt;org.apache.spark.sql.Row&gt; read(StorageConnector.HopsFsConnector connector, String dataFormat, Map&lt;String,String&gt; options, String path) throws FeatureStoreException, IOException

Parameters:
  connector - HopsFsConnector object.
  dataFormat - Specify the file format to be read, e.g. `csv`, `parquet`.
  options - Any additional key/value options to be passed to the connector.
  path - Path to be read from within the storage connector.
Throws:
  FeatureStoreException - If unable to retrieve StorageConnector from the feature store.
  IOException - Generic IO exception.

public org.apache.spark.sql.Dataset&lt;org.apache.spark.sql.Row&gt; read(StorageConnector.S3Connector connector, String dataFormat, Map&lt;String,String&gt; options, String path) throws FeatureStoreException, IOException

Parameters:
  connector - S3Connector object.
  dataFormat - Specify the file format to be read, e.g. `csv`, `parquet`.
  options - Any additional key/value options to be passed to the connector.
  path - Path to be read from within the bucket.
Throws:
  FeatureStoreException - If unable to retrieve StorageConnector from the feature store.
  IOException - Generic IO exception.

public org.apache.spark.sql.Dataset&lt;org.apache.spark.sql.Row&gt; read(StorageConnector.RedshiftConnector connector, String query) throws FeatureStoreException, IOException

Parameters:
  connector - RedshiftConnector object.
  query - SQL query string.
Throws:
  FeatureStoreException - If unable to retrieve StorageConnector from the feature store.
  IOException - Generic IO exception.

public org.apache.spark.sql.Dataset&lt;org.apache.spark.sql.Row&gt; read(StorageConnector.AdlsConnector connector, String dataFormat, Map&lt;String,String&gt; options, String path) throws FeatureStoreException, IOException

Parameters:
  connector - AdlsConnector object.
  dataFormat - Specify the file format to be read, e.g. `csv`, `parquet`.
  options - Any additional key/value options to be passed to the connector.
  path - Path to be read from within the storage connector.
Throws:
  FeatureStoreException - If unable to retrieve StorageConnector from the feature store.
  IOException - Generic IO exception.

public org.apache.spark.sql.Dataset&lt;org.apache.spark.sql.Row&gt; read(StorageConnector.SnowflakeConnector connector, String query) throws FeatureStoreException, IOException

Parameters:
  connector - SnowflakeConnector object.
  query - SQL query string.
Throws:
  FeatureStoreException - If unable to retrieve StorageConnector from the feature store.
  IOException - Generic IO exception.

public org.apache.spark.sql.Dataset&lt;org.apache.spark.sql.Row&gt; read(StorageConnector.JdbcConnector connector, String query) throws FeatureStoreException, IOException

Parameters:
  connector - JdbcConnector object.
  query - SQL query string.
Throws:
  FeatureStoreException - If unable to retrieve StorageConnector from the feature store.
  IOException - Generic IO exception.

public org.apache.spark.sql.Dataset&lt;org.apache.spark.sql.Row&gt; read(StorageConnector.GcsConnector connector, String dataFormat, Map&lt;String,String&gt; options, String path) throws FeatureStoreException, IOException

Parameters:
  connector - GcsConnector object.
  dataFormat - Specify the file format to be read, e.g. `csv`, `parquet`.
  options - Any additional key/value options to be passed to the connector.
  path - Path to be read from within the storage connector.
Throws:
  FeatureStoreException - If unable to retrieve StorageConnector from the feature store.
  IOException - Generic IO exception.

public org.apache.spark.sql.Dataset&lt;org.apache.spark.sql.Row&gt; read(StorageConnector.BigqueryConnector connector, String query, Map&lt;String,String&gt; options, String path) throws FeatureStoreException, IOException

Parameters:
  connector - BigqueryConnector object.
  query - SQL query string.
  options - Any additional key/value options to be passed to the connector.
  path - Path to the table to be read from within the storage connector.
Throws:
  FeatureStoreException - If unable to retrieve StorageConnector from the feature store.
  IOException - Generic IO exception.

public org.apache.spark.sql.Dataset&lt;org.apache.spark.sql.Row&gt; read(StorageConnector connector, String query, String dataFormat, Map&lt;String,String&gt; options, String path) throws FeatureStoreException, IOException

Parameters:
  connector - Storage connector object.
  query - SQL query string.
  dataFormat - When reading from object stores such as S3, HopsFS and ADLS, specify the file format to be read, e.g. `csv`, `parquet`.
  options - Any additional key/value options to be passed to the connector.
  path - Path to be read from within the bucket of the storage connector. Not relevant for JDBC or other database-based connectors such as Snowflake or Redshift.
Throws:
  FeatureStoreException - If unable to retrieve StorageConnector from the feature store.
  IOException - Generic IO exception.

public org.apache.spark.sql.Dataset&lt;org.apache.spark.sql.Row&gt; readStream(StorageConnector.KafkaConnector connector, String topic, boolean topicPattern, String messageFormat, String schema, Map&lt;String,String&gt; options, boolean includeMetadata) throws FeatureStoreException, IOException

Parameters:
  connector - Storage connector object.
  topic - Name of the topic.
  topicPattern - If true, subscribes to all topics matching the pattern given in `topic`.
  messageFormat - Format of the message: "avro" or "json".
  schema - Schema of the message.
  options - Any additional key/value options to be passed to the connector.
  includeMetadata - Whether to include metadata of the topic in the dataframe, such as "key", "topic", "partition", "offset", "timestamp", "timestampType", "value.*".
Throws:
  FeatureStoreException - If unable to retrieve StorageConnector from the feature store.
  IOException - Generic IO exception.

Copyright © 2023. All rights reserved.