Connecting to the Feature Store from an external Flink cluster, such as AWS EMR and GCP DataProc requires configuring it with the Hopsworks certificates. This guide explains step by step how to connect to the Feature Store from an external Flink cluster.
Download the Hopsworks Certificates#
In the Project Settings, select the integration tab and scroll to the Configure Spark Integration section. Click on Download certificates.
Hopsworks uses X.509 certificates for authentication and authorization. If you are interested in the Hopsworks security model, you can read more about it in this blog post. The certificates are composed of three different components: the
keyStore.jks containing the private key and the certificate for your project user, the
trustStore.jks containing the certificates for the Hopsworks certificates authority, and a password to unlock the private key in the
keyStore.jks. The password is displayed in a pop-up when downloading the certificate and should be saved in a file named
When you copy-paste the password to the
material_passwd file, pay attention to not introduce additional empty spaces or new lines.
The three files (
material_passwd) should be accessible by your Flink application.
Configure your Flink cluster#
Add the following configuration to the
flink.hadoop.hops.ssl.keystore.name: replace_this_with_actual_path/keyStore.jks flink.hadoop.hops.ssl.truststore.name: replace_this_with_actual_path/trustStore.jks flink.hadoop.hops.ssl.keystores.passwd.name: replace_this_with_actual_path/material_passwd
Generating an API Key#
For instructions on how to generate an API key follow this user guide. For the Flink integration to work correctly make sure you add the following scopes to your API key:
Connecting to the Feature Store#
You are now ready to connect to the Hopsworks Feature Store from Flink:
//Establish connection with Hopsworks. HopsworksConnection hopsworksConnection = HopsworksConnection.builder() .host("my_instance") // DNS of your Feature Store instance .port(443) // Port to reach your Hopsworks instance, defaults to 443 .project("my_project") // Name of your Hopsworks Feature Store project .apiKeyValue("api_key") // The API key to authenticate with the feature store .hostnameVerification(false) // Disable for self-signed certificates .build(); //get feature store handle FeatureStore fs = hopsworksConnection.getFeatureStore();
For more information and how to integrate Flink streaming feature pipeline to the Hopsworks Feature store follow the tutorial.