HopsFS

HopsFS is an HDFS-compatible filesystem for data analytics that runs on AWS, Azure, and on-premises. In the cloud, HopsFS stores its data on object storage (S3 on AWS and Blob storage on Azure); on-premises, it stores data on commodity servers. This provides low-cost storage, high availability, and disaster recovery. In Hopsworks, you can access HopsFS natively in programs (Spark, TensorFlow, etc.) without defining a Storage Connector.

By default, every Project has a HopsFS Storage Connector for Training Datasets, and it is the default Storage Connector when you create training datasets from features in the Feature Store. If you want to output data to a different dataset, you can define a new Storage Connector for that dataset.

To define a HopsFS Storage Connector, select a directory in the current Project. The connector then points to that directory in HopsFS.
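
As a rough illustration of what "a directory in the current Project" means as a path, the sketch below builds an HDFS-style URI for a project directory. It is not part of the Hopsworks API: the helper name hopsfs_dataset_path is hypothetical, and both the /Projects/<project>/<dataset> layout and the hdfs:// scheme are assumptions that should be checked against your own installation.

```python
from posixpath import join, normpath


def hopsfs_dataset_path(project: str, dataset: str, *subdirs: str) -> str:
    """Build a URI for a directory inside a project (hypothetical helper).

    Assumption: project data lives under /Projects/<project>/<dataset>.
    """
    if not project or "/" in project:
        raise ValueError("project must be a single, non-empty path component")

    # Join and normalise so that "." and ".." segments are resolved.
    path = normpath(join("/Projects", project, dataset, *subdirs))

    # Reject anything that resolves outside the project's own directory,
    # mirroring the rule that a connector points into the current Project.
    if not path.startswith(f"/Projects/{project}/"):
        raise ValueError("path must stay inside the project directory")

    # Assumption: an hdfs:// URI with no authority uses the default filesystem.
    return "hdfs://" + path


if __name__ == "__main__":
    print(hopsfs_dataset_path("demo", "Resources", "exports"))
```

A path built this way could then be handed to a program that reads HopsFS natively, for example as the argument to Spark's spark.read.parquet, assuming the directory holds Parquet files.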