How-To set up a JDBC Storage Connector#

Introduction#

JDBC is an API provided by many database systems. Using JDBC connections one can query and update data in a database, usually oriented towards relational databases. Examples of databases you can connect to using JDBC are MySQL, Postgres, Oracle, DB2, MongoDB or Microsoft SQLServer.

In this guide, you will configure a Storage Connector in Hopsworks to save all the authentication information needed in order to set up a JDBC connection to your database of choice. When you're finished, you'll be able to query the database using Spark through HSFS APIs.

Note

Currently, it is only possible to create storage connectors in the Hopsworks UI. You cannot create a storage connector programmatically.

Prerequisites#

Before you begin this guide you'll need to retrieve the following information from your JDBC compatible database:

JDBC Connection URL: Consult the documentation of your target database to determine the correct JDBC URL and parameters. As an example, for MySQL the URL could be:

jdbc:mysql://10.0.2.15:3306/[databaseName]?useSSL=false&allowPublicKeyRetrieval=true

Username and Password: Typically, you will need to add username and password in your JDBC URL or as key/value parameters. So make sure you have retrieved a username and password with the suitable permissions for the database and table you want to query.

Creation in the UI#

Step 1: Set up new storage connector#

Head to the Storage Connector View on Hopsworks (1) and set up a new storage connector (2).

Storage Connector Creation — The Storage Connector View in the User Interface

Step 2: Enter JDBC Settings#

Enter the details for your JDBC enabled database.

Select "JDBC" as connector protocol.
Enter the JDBC connection url. This can for example also contain the username and password.
Add additional key/value arguments to be passed to the connection, such as username or password. These might differ by database.

Note

Driver class name is a mandatory argument even if using the default MySQL driver. Add it by specifying a property with the name driver and class name as value. The driver class name will differ based on the database. For MySQL databases, the class name is com.mysql.cj.jdbc.Driver, as shown in the example image.
Click "Setup storage connector".

Note

To be able to use the connector, you need to upload the driver JAR file to the Jupyter configuration or Job configuration in Addtional Jars. For MySQL connections the default JDBC driver is already included in Hopsworks so this step can be skipped.

Next Steps#

Move on to the usage guide for storage connectors to see how you can use your newly created JDBC connector.