AWS SageMaker Integration#

Connecting to the Model Registry from SageMaker requires setting up a Model Registry API key for SageMaker and installing HSML on SageMaker. This guide explains step by step how to connect to the Model Registry from SageMaker.

Generate an API key#

In Hopsworks, click on your username in the top-right corner and select Settings to open the user settings. Select API keys. Give the key a name and select at least the scopes listed below before creating the key. Copy the key to your clipboard for the next step.

Scopes

The API key should contain at least the following scopes:

  1. project
  2. modelregistry
  3. dataset.create
  4. dataset.view
  5. dataset.delete

Generate an API key on Hopsworks
API keys can be created in the User Settings on Hopsworks

Info

You are only able to retrieve the API key once. If you did not manage to copy it to your clipboard, delete the key and create a new one.

Quickstart API key Argument#

API key as Argument

To get started quickly, without saving the Hopsworks API key in a secret store, you can supply it as an argument when instantiating a connection:

    import hsml
    conn = hsml.connection(
        host='my_instance',                 # DNS of your Model Registry instance
        port=443,                           # Port to reach your Hopsworks instance, defaults to 443
        project='my_project',               # Name of your Hopsworks Model Registry project
        api_key_value='apikey',             # The API key to authenticate with Hopsworks
        hostname_verification=True          # Disable for self-signed certificates
    )
    mr = conn.get_model_registry()           # Get the project's default model registry

Store the API key on AWS#

The API key now needs to be stored on AWS, so it can be retrieved from within SageMaker notebooks.

Identify your SageMaker role#

To set up the API key, you need to know the IAM role used by your SageMaker instance. You can find it in the overview of your SageMaker notebook instance in the AWS Management Console.

In this example, the name of the role is AmazonSageMaker-ExecutionRole-20190511T072435.

Identify your SageMaker Role
The role is attached to your SageMaker notebook instance
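The parameter and secret names used later in this guide contain the role name, not the full role ARN. As a minimal sketch, the helper below extracts the name from an ARN. The function name `role_name_from_arn` and the account ID are illustrative and not part of HSML.

```python
def role_name_from_arn(role_arn: str) -> str:
    """Return the role name, i.e. the last path segment of an IAM role ARN."""
    # An ARN has six colon-separated fields; the last one is the resource.
    resource = role_arn.split(":", 5)[-1]
    if not resource.startswith("role/"):
        raise ValueError(f"not an IAM role ARN: {role_arn!r}")
    # Roles may carry a path such as "service-role/"; the name comes last.
    return resource.rsplit("/", 1)[-1]


# Placeholder ARN with a made-up account ID.
example_arn = (
    "arn:aws:iam::123456789012:role/service-role/"
    "AmazonSageMaker-ExecutionRole-20190511T072435"
)
print(role_name_from_arn(example_arn))
```

Inside a SageMaker notebook, the ARN can usually be obtained with `sagemaker.get_execution_role()` from the SageMaker Python SDK and passed to this helper.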

Store the API key#

You have two options to make your API key accessible from SageMaker:

Option 1: Using the AWS Systems Manager Parameter Store#

Store the API key in the AWS Systems Manager Parameter Store#
  1. In the AWS Management Console, ensure that your active region is the region you use for SageMaker.
  2. Go to the AWS Systems Manager, choose Parameter Store in the left navigation bar and select Create Parameter.
  3. As name, enter /hopsworks/role/[MY_SAGEMAKER_ROLE]/type/api-key replacing [MY_SAGEMAKER_ROLE] with the AWS role used by the SageMaker instance that should access the Model Registry.
  4. Select Secure String as type, paste the API key as the value and create the parameter.

AWS Systems Manager Parameter Store
Store the API key in the AWS Systems Manager Parameter Store
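The same parameter can also be created programmatically. The sketch below assumes a boto3 Systems Manager client is passed in. The helper names `parameter_name` and `store_api_key` are illustrative, and the region in the usage comment is a placeholder.

```python
def parameter_name(role_name: str) -> str:
    """Build the parameter name used in this guide for a given role."""
    return f"/hopsworks/role/{role_name}/type/api-key"


def store_api_key(ssm_client, role_name: str, api_key: str) -> str:
    """Store the API key as a SecureString parameter and return its name."""
    name = parameter_name(role_name)
    ssm_client.put_parameter(
        Name=name,
        Value=api_key,
        Type="SecureString",  # encrypted at rest, matching step 4 above
        Overwrite=True,       # replace the value if the parameter exists
    )
    return name


# Usage (requires boto3 and AWS credentials, so it is left commented out):
#   import boto3
#   ssm = boto3.client("ssm", region_name="eu-west-1")  # your SageMaker region
#   store_api_key(ssm, "AmazonSageMaker-ExecutionRole-20190511T072435", "<api key>")
```

Passing the client in as an argument keeps the helper independent of how credentials and the region are configured.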

Grant access to the Parameter Store from the SageMaker notebook role#
  1. In the AWS Management Console, go to IAM, select Roles and then the role that is used when creating SageMaker notebook instances.
  2. Select Add inline policy.
  3. Choose Systems Manager as service, expand the Read access level and check GetParameter.
  4. Expand Resources and select Add ARN.
  5. Enter the region of the Systems Manager as well as the name of the parameter WITHOUT the leading slash e.g. hopsworks/role/[MY_SAGEMAKER_ROLE]/type/api-key and click Add.
  6. Click on Review, give the policy a name and click on Create policy.

AWS Systems Manager Parameter Store Get Parameter Policy
Grant access to the Parameter Store from the SageMaker notebook role
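For reference, the inline policy created by the steps above should be roughly equivalent to the JSON document built below. This is a sketch that assumes the standard `aws` partition. The function name `parameter_read_policy`, the account ID and the region are placeholders. Note that the resource ARN contains the parameter name without its leading slash.

```python
import json


def parameter_read_policy(region: str, account_id: str, role_name: str) -> dict:
    """Build an IAM policy document allowing ssm:GetParameter on the API key."""
    # The parameter name appears in the ARN WITHOUT its leading slash.
    resource = (
        f"arn:aws:ssm:{region}:{account_id}:"
        f"parameter/hopsworks/role/{role_name}/type/api-key"
    )
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "ssm:GetParameter",
                "Resource": resource,
            }
        ],
    }


policy = parameter_read_policy("eu-west-1", "123456789012", "MyRole")
print(json.dumps(policy, indent=2))
```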

Option 2: Using the AWS Secrets Manager#

Store the API key in the AWS Secrets Manager#
  1. In the AWS Management Console, ensure that your active region is the region you use for SageMaker.
  2. Go to the AWS Secrets Manager and select Store new secret.
  3. Select Other type of secrets and add api-key as the key and paste the API key created in the previous step as the value.
  4. Click next.

AWS Systems Manager
Store the API key in the AWS Secrets Manager

  5. As secret name, enter hopsworks/role/[MY_SAGEMAKER_ROLE] replacing [MY_SAGEMAKER_ROLE] with the AWS role used by the SageMaker instance that should access the Model Registry.
  6. Select next twice and finally store the secret.
  7. Then click on the secret in the secrets list and take note of the Secret ARN.

AWS Systems Manager
Store the API key in the AWS Secrets Manager
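The secret can also be created with boto3 instead of the console. The sketch below assumes a Secrets Manager client is passed in. The helper names `secret_name` and `store_api_key_secret` are illustrative and not part of HSML. The secret is stored as a JSON object with a single `api-key` entry, mirroring the key/value pair entered in the console.

```python
import json


def secret_name(role_name: str) -> str:
    """Build the secret name used in this guide for a given role."""
    return f"hopsworks/role/{role_name}"


def store_api_key_secret(sm_client, role_name: str, api_key: str) -> str:
    """Create the secret and return its ARN, needed for the IAM policy."""
    response = sm_client.create_secret(
        Name=secret_name(role_name),
        # Key/value secrets are stored as a JSON string.
        SecretString=json.dumps({"api-key": api_key}),
    )
    return response["ARN"]


# Usage (requires boto3 and AWS credentials, so it is left commented out):
#   import boto3
#   sm = boto3.client("secretsmanager", region_name="eu-west-1")
#   arn = store_api_key_secret(sm, "AmazonSageMaker-ExecutionRole-20190511T072435", "<api key>")
```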

Grant access to the SecretsManager to the SageMaker notebook role#
  1. In the AWS Management Console, go to IAM, select Roles and then the role that is used when creating SageMaker notebook instances.
  2. Select Add inline policy.
  3. Choose Secrets Manager as service, expand the Read access level and check GetSecretValue.
  4. Expand Resources and select Add ARN.
  5. Paste the ARN of the secret created in the previous step.
  6. Click on Review, give the policy a name and click on Create policy.

AWS Systems Manager GetSecretValue policy
Grant access to the SecretsManager to the SageMaker notebook role
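To check from a SageMaker notebook that the policy is in effect, you can try to read the secret back. This is a sketch under the assumption that the secret was stored as described above, with `api-key` as the key. The helper name `read_api_key` is illustrative and this is not necessarily how HSML retrieves the key internally.

```python
import json


def read_api_key(sm_client, role_name: str) -> str:
    """Fetch the secret for the given role and return the stored API key."""
    response = sm_client.get_secret_value(SecretId=f"hopsworks/role/{role_name}")
    # Key/value secrets are returned as a JSON string.
    return json.loads(response["SecretString"])["api-key"]


# Usage (requires boto3 and the notebook's IAM role, so it is commented out):
#   import boto3
#   sm = boto3.client("secretsmanager", region_name="eu-west-1")
#   read_api_key(sm, "AmazonSageMaker-ExecutionRole-20190511T072435")
```

If the policy is missing or points at the wrong ARN, the `get_secret_value` call is expected to fail with an access-denied error rather than return a value.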

Install HSML#

To access the Hopsworks Model Registry, the HSML Python library needs to be installed. One way of achieving this is to open a Python notebook in SageMaker and install HSML with pip through a notebook magic command:

Matching Hopsworks version

The major and minor version of HSML needs to match the major and minor version of Hopsworks.

For example, for a Hopsworks cluster running version 2.5.0, the following command installs the latest available 2.5.x release of HSML:

    pip install hsml==2.5.*

HSML version needs to match the major and minor version of Hopsworks
You can find the Hopsworks version in the settings tab of any of your projects on Hopsworks

Note that the library will not be persistent. For information on how to permanently install a library on SageMaker, see Install External Libraries and Kernels in Notebook Instances.
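The version-matching rule can be expressed as a small helper that turns a Hopsworks version string into a pip requirement. This is a sketch. The function name `hsml_requirement` is illustrative and not part of HSML.

```python
def hsml_requirement(hopsworks_version: str) -> str:
    """Return a pip requirement pinning HSML to the same major.minor version."""
    parts = hopsworks_version.split(".")
    if len(parts) < 2 or not (parts[0].isdigit() and parts[1].isdigit()):
        raise ValueError(f"unexpected version string: {hopsworks_version!r}")
    major, minor = parts[0], parts[1]
    # The wildcard lets pip pick the latest patch release of that minor version.
    return f"hsml=={major}.{minor}.*"


print(hsml_requirement("2.5.0"))
```

In a notebook cell, the resulting requirement can then be installed with, for example, `%pip install "hsml==2.5.*"`.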

Connect to the Model Registry#

You are now ready to connect to the Hopsworks Model Registry from SageMaker:

    import hsml
    conn = hsml.connection(
        host='my_instance',                 # DNS of your Model Registry instance
        port=443,                           # Port to reach your Hopsworks instance, defaults to 443
        project='my_project',               # Name of your Hopsworks Model Registry project
        secrets_store='secretsmanager',     # Either parameterstore or secretsmanager
        hostname_verification=True          # Disable for self-signed certificates
    )
    mr = conn.get_model_registry()           # Get the project's default model registry

Ports

If you have trouble connecting, ensure that the Security Group of your Hopsworks instance on AWS is configured to allow incoming traffic from your SageMaker instance on port 443. See VPC Security Groups for more information. If your SageMaker instances are not in the same VPC as your Hopsworks instance and the Hopsworks instance is not accessible from the internet, you will need to configure VPC Peering on AWS.
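A quick way to tell a network problem apart from an authentication problem is to test whether a TCP connection to the instance can be opened at all. The sketch below uses only the Python standard library. The function name `can_reach` is illustrative and `my_instance` is a placeholder host name.

```python
import socket


def can_reach(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS resolution failures.
        return False


# Usage from a SageMaker notebook (placeholder host name):
#   can_reach("my_instance", 443)
```

A `False` result points at security group, routing or DNS configuration, whereas a `True` result followed by a failing `hsml.connection` call suggests checking the API key and its scopes instead.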

Next Steps#

For more information about how to use the Model Registry, see the Quickstart Guide.