Skip to content

Integration with Amazon EKS and Amazon ECR#

This guide shows how to create a cluster in Hopsworks.ai with integrated support for Amazon Elastic Kubernetes Service (EKS) and Amazon Elastic Container Registry (ECR). So that Hopsworks can launch Python jobs, Jupyter servers, and ML model servings on top of Amazon EKS.

Warning

In the current version, we don't support sharing EKS clusters between Hopsworks clusters. That is, an EKS cluster can be only used by one Hopsworks cluster.

Step 1: Create an EKS cluster on AWS#

If you have an existing EKS cluster, skip this step and go directly to Step 2.

Amazon provides two getting started guides using AWS management console or eksctl to help you create an EKS cluster. The easiest way is to use the eksctl command.

Step 1.1: Installing eksctl, aws, and kubectl#

Follow the prerequisites section in getting started with eksctl to install aws, eksctl, and kubectl.

Step 1.2: Create an EKS cluster using eksctl#

You can create a sample EKS cluster with the name my-eks-cluster using Kubernetes version 1.17 with 2 managed nodes in the us-east-2 region by running the following command. For more details on the eksctl usage, check the eksctl documentation.

eksctl create cluster --name my-eks-cluster --version 1.17 --region us-east-2 --nodegroup-name my-nodes --nodes 2 --managed

Output:

[]  eksctl version 0.26.0
[]  using region us-east-2
[]  setting availability zones to [us-east-2b us-east-2a us-east-2c]
[]  subnets for us-east-2b - public:192.168.0.0/19 private:192.168.96.0/19
[]  subnets for us-east-2a - public:192.168.32.0/19 private:192.168.128.0/19
[]  subnets for us-east-2c - public:192.168.64.0/19 private:192.168.160.0/19
[]  using Kubernetes version 1.17
[]  creating EKS cluster "my-eks-cluster" in "us-east-2" region with managed nodes
[]  will create 2 separate CloudFormation stacks for cluster itself and the initial managed nodegroup
[]  if you encounter any issues, check CloudFormation console or try 'eksctl utils describe-stacks --region=us-east-2 --cluster=my-eks-cluster'
[]  CloudWatch logging will not be enabled for cluster "my-eks-cluster" in "us-east-2"
[]  you can enable it with 'eksctl utils update-cluster-logging --region=us-east-2 --cluster=my-eks-cluster'
[]  Kubernetes API endpoint access will use default of {publicAccess=true, privateAccess=false} for cluster "my-eks-cluster" in "us-east-2"
[]  2 sequential tasks: { create cluster control plane "my-eks-cluster", 2 sequential sub-tasks: { no tasks, create managed nodegroup "my-nodes" } }
[]  building cluster stack "eksctl-my-eks-cluster-cluster"
[]  deploying stack "eksctl-my-eks-cluster-cluster"
[]  building managed nodegroup stack "eksctl-my-eks-cluster-nodegroup-my-nodes"
[]  deploying stack "eksctl-my-eks-cluster-nodegroup-my-nodes"
[]  waiting for the control plane availability...
[]  saved kubeconfig as "/Users/maism/.kube/config"
[]  no tasks
[]  all EKS cluster resources for "my-eks-cluster" have been created
[]  nodegroup "my-nodes" has 2 node(s)
[]  node "ip-192-168-21-142.us-east-2.compute.internal" is ready
[]  node "ip-192-168-62-117.us-east-2.compute.internal" is ready
[]  waiting for at least 2 node(s) to become ready in "my-nodes"
[]  nodegroup "my-nodes" has 2 node(s)
[]  node "ip-192-168-21-142.us-east-2.compute.internal" is ready
[]  node "ip-192-168-62-117.us-east-2.compute.internal" is ready
[]  kubectl command should work with "/Users/maism/.kube/config", try 'kubectl get nodes'
[]  EKS cluster "my-eks-cluster" in "us-east-2" region is ready

Once the cluster is created, eksctl will write the cluster credentials for the newly created cluster to your local kubeconfig file (~/.kube/config). To test the cluster credentials, you can run the following command to get the list of nodes in the cluster.

kubectl get nodes

Output:

NAME                                           STATUS   ROLES    AGE     VERSION
ip-192-168-21-142.us-east-2.compute.internal   Ready    <none>   2m35s   v1.17.9-eks-4c6976
ip-192-168-62-117.us-east-2.compute.internal   Ready    <none>   2m34s   v1.17.9-eks-4c6976

Step 2: Create an instance profile role on AWS#

You need to add permission to the instance profile you use for instances deployed by Hopsworks.ai to give them access to EKS and ECR. Go to the IAM service in the AWS management console, click Roles, search for your role, and click on it. Click on Add inline policy. Go to the JSON tab and replace the existing JSON permissions with the JSON permissions below..

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowPullMainImages",
            "Effect": "Allow",
            "Action": [
                "ecr:GetDownloadUrlForLayer",
                "ecr:BatchGetImage"
            ],
            "Resource": [
                "arn:aws:ecr:*:*:repository/filebeat",
                "arn:aws:ecr:*:*:repository/base"
            ]
        },
        {
            "Sid": "AllowPushandPullImages",
            "Effect": "Allow",
            "Action": [
                "ecr:CreateRepository",
                "ecr:GetDownloadUrlForLayer",
                "ecr:BatchGetImage",
                "ecr:CompleteLayerUpload",
                "ecr:UploadLayerPart",
                "ecr:InitiateLayerUpload",
                "ecr:DeleteRepository",
                "ecr:BatchCheckLayerAvailability",
                "ecr:PutImage",
                "ecr:ListImages",
                "ecr:BatchDeleteImage",
                "ecr:GetLifecyclePolicy",
                "ecr:PutLifecyclePolicy"
            ],
            "Resource": [
                "arn:aws:ecr:*:*:repository/*/filebeat",
                "arn:aws:ecr:*:*:repository/*/base"
            ]
        },
        {
            "Sid": "AllowGetAuthToken",
            "Effect": "Allow",
            "Action": "ecr:GetAuthorizationToken",
            "Resource": "*"
        },
        {
            "Sid": "AllowDescirbeEKS",
            "Effect": "Allow",
            "Action": "eks:DescribeCluster",
            "Resource": "arn:aws:eks:*:*:cluster/*"
        }
    ]
}

Click on Review policy. Give a name to your policy and click on Create policy.

Copy the Role ARN of your profile (not to be confused with the Instance Profile ARNs two lines bellow).

Coppy the Role ARN
Coppy the *Role ARN*

Step 3: Allow your role to use your EKS cluster#

You need to configure your EKS cluster to accept connections from the role you created above. This is done by using the following kubectl command. For more details, check Managing users or IAM roles for your cluster.

Note

The kubectl edit command uses the vi editor by default, however, you can override this behavior by setting KUBE_EDITOR to your preferred editor.

KUBE_EDITOR="vi" kubectl edit configmap aws-auth -n kube-system

Output:

# Please edit the object below. Lines beginning with a '#' will be ignored,
# and an empty file will abort the edit. If an error occurs while saving this file will be
# reopened with the relevant failures.
#
apiVersion: v1
data:
mapRoles: |
    - groups:
        - system:bootstrappers
        - system:nodes
        rolearn: arn:aws:iam::xxxxxxxxxxxx:role/eksctl-my-eks-cluster-nodegroup-m-NodeInstanceRole-FQ7L0HQI4NCC
        username: system:node:{{EC2PrivateDNSName}}
kind: ConfigMap
metadata:
creationTimestamp: "2020-08-24T07:42:31Z"
name: aws-auth
namespace: kube-system
resourceVersion: "770"
selfLink: /api/v1/namespaces/kube-system/configmaps/aws-auth
uid: c794b2d8-9f10-443d-9072-c65d0f2eb552

Follow the example below (lines 13-16) to add your role to mapRoles and assign the system:masters group to your role. Make sure to replace 'YOUR ROLE RoleARN' with the Role ARN you copied in the previous step before saving.

Warning

Make sure to keep the same formatting as in the example below. The configuration format is sensitive to indentation and copy-pasting does not always keep the correct indentation.

# Please edit the object below. Lines beginning with a '#' will be ignored,
# and an empty file will abort the edit. If an error occurs while saving this file will be
# reopened with the relevant failures.
#
apiVersion: v1
data:
  mapRoles: |
    - groups:
      - system:bootstrappers
      - system:nodes
      rolearn: arn:aws:iam::xxxxxxxxxxxx:role/eksctl-my-eks-cluster-nodegroup-m-NodeInstanceRole-FQ7L0HQI4NCC
      username: system:node:{{EC2PrivateDNSName}}
    - groups:
      - system:masters
      rolearn: <YOUR ROLE RoleARN>
      username: hopsworks
kind: ConfigMap
metadata:
creationTimestamp: "2020-08-24T07:42:31Z"
name: aws-auth
namespace: kube-system
resourceVersion: "770"
selfLink: /api/v1/namespaces/kube-system/configmaps/aws-auth
uid: c794b2d8-9f10-443d-9072-c65d0f2eb552

Once you are done with editing the configmap, save it and exit the editor. The output should be:

configmap/aws-auth edited

Step 4: Open Hopsworks required ports on your EKS cluster security group#

To keep this documentation simple will run Hopsworks in the same virtual network as the EKS cluster. For this purpose, we need to open ports for HTTP (80) and HTTPS (443) to allow Hopsworks to run with all its functionalities.

Note

It is possible not to open ports 80 and 443 at the cost of some features. See Limiting permissions for more details.

You can also use VPC peering to run hopsworks and EKS in two different VPCs. Make sure to create the peering before starting the hopsworks cluster as it connects to EKS at startup.

First, you need to get the name of the security group of your EKS cluster by using the following eksctl command. Notice that you need to replace my-eks-cluster with the name of your cluster.

eksctl utils describe-stacks --region=us-east-2 --cluster=my-eks-cluster | grep 'OutputKey: "ClusterSecurityGroupId"' -a1

Check the output for OutputValue, which will be the id of your EKS security group.

ExportName: "eksctl-my-eks-cluster-cluster::ClusterSecurityGroupId",
OutputKey: "ClusterSecurityGroupId",
OutputValue: "YOUR_EKS_SECURITY_GROUP_ID"

Go to the Security Groups section of EC2 in the AWS management console and search for your security group using the id obtained above. Note the VPC ID, you will need it when creating the hopsworks cluster. Then, click on it then go to the Inbound rules tab and click on Edit inbound rules. You should now see the following screen.

Edit inbound rules
Edit inbound rules

Add two rules for HTTP and HTTPS as follows:

Edit inbound rules
Edit inbound rules

Click Save rules to save the updated rules to the security group.

Step 5: Allow Hopsworks.ai to delete ECR repositories on your behalf#

For hopsworks.ai to be able to clean up the ECR repo when terminating your hopsworks cluster, you need to add a new inline policy to the Cross-Account role or user connected to Hopsworks.ai, that you set up when connecting your AWS account to hopsworks.ai.

Navigate to AWS management console, then click on Roles or Users depending on which connection method you have used in Hopsworks.ai, and then search for your role or user name and click on it. Go to the Permissions tab, click on Add inline policy and go to the JSON tab. Replace the existing JSON permissions with the JSON permissions below. Click on Review policy, name it, and click Create policy.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowDeletingECRRepositories",
            "Effect": "Allow",
            "Action": [
                "ecr:DeleteRepository"
            ],
            "Resource": [
                "arn:aws:ecr:*:*:repository/*/filebeat",
                "arn:aws:ecr:*:*:repository/*/base"
            ]
        }
    ]
}

Step 6: Create a Hopsworks cluster with EKS and ECR support#

In Hopsworks.ai, select Create cluster. Choose the region of your EKS cluster and fill in the name of your S3 bucket, then click Next:

Create Hopsworks cluster
Create Hopsworks cluster

Choose your preferred SSH key to use with the cluster, then click Next:

Choose SSH key
Choose SSH key

Choose the instance profile role that you have created in Step 2 (click on the refresh button if your instance profile is not in the list), then click Next:

Choose instance profile role
Choose instance profile role

Choose the backup retention period and click Next:

Choose the backup retention policy
Choose the backup retention policy

Choose Enabled to enable the use of Amazon EKS and ECR:

Choose Enabled
Choose Enabled

Add your EKS cluster name, then click Next:

Add EKS cluster name
Add EKS cluster name

Choose the VPC of your EKS cluster. It's name should have the form eksctl-YOUR-CLUSTER-NAME-cluster. You can also find it using the VPC ID you noted in Step 4 (click on the refresh button if the VPC is not in the list). Then click Next:

Choose VPC
Choose VPC

Choose any of the subnets in the VPC, then click Next.

Note

Avoid private subnets if you want to enjoy all the hopsworks features.

Choose Subnet
Choose Subnet

Choose the security group that you have updated in Step 4, then click Next:

Note

Select the Security Group with the same id as in Step 4 and NOT the ones containing ControlPlaneSecurity or ClusterSharedNode in their name.

Choose Security Group
Choose Security Group

Click Review and submit, then Create. Once the cluster is created, Hopsworks will use EKS to launch Python jobs, Jupyter servers, and ML model servings.