CI/CD

You can set up traditional development, staging, and production environments in Hopsworks using Projects. A project lets you apply access control to each environment: just like a GitHub repository, project owners can add and remove members and assign them different roles. The "data owner" role can write to the feature store, while a "data scientist" can only read from the feature store and create training data.

Dev, Staging, Prod

You can create dev, staging, and prod projects either on the same cluster or, most commonly, with production on its own cluster.
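As a rough illustration of how a pipeline might pick its target project per environment, here is a minimal sketch. The hostnames, project names, the `DEPLOY_ENV` variable, and the `EnvTarget` and `resolve_target` names are all hypothetical and not part of Hopsworks; the layout assumes dev and staging share a cluster while prod has its own.

```python
import os
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EnvTarget:
    host: str     # hypothetical cluster hostname
    project: str  # hypothetical Hopsworks project name


# Hypothetical layout: dev and staging share a cluster, prod has its own.
TARGETS = {
    "dev": EnvTarget("dev.hopsworks.example.com", "fraud_dev"),
    "staging": EnvTarget("dev.hopsworks.example.com", "fraud_staging"),
    "prod": EnvTarget("prod.hopsworks.example.com", "fraud_prod"),
}


def resolve_target(env: Optional[str] = None) -> EnvTarget:
    """Return the target for `env`, falling back to DEPLOY_ENV, then "dev"."""
    name = env or os.environ.get("DEPLOY_ENV", "dev")
    if name not in TARGETS:
        raise ValueError(f"unknown environment: {name!r}")
    return TARGETS[name]


# A pipeline could then hand the resolved target to the Hopsworks client,
# for example (requires the hopsworks package and valid credentials):
#   import hopsworks
#   target = resolve_target()
#   project = hopsworks.login(host=target.host, project=target.project)
#   fs = project.get_feature_store()
```

Keeping the environment-to-project mapping in one place means the same pipeline code can run unchanged against dev, staging, and prod, with only the CI/CD job's environment variable differing.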

Versioning

Hopsworks supports the versioning of ML assets, including:

  • Feature Groups: the version of a feature group's schema. Breaking schema changes require a new version, and the new version must be backfilled;
  • Feature Views: the version of a feature view's schema. Breaking schema changes require only a new version, with no backfilling;
  • Models: the version of a model;
  • Deployments: the version of the deployment of a model. The same model version can appear in more than one deployment.
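The rule that breaking schema changes require a new feature group version can be sketched as a small check that a CI job might run before writing. This is an illustration only: the helper names are hypothetical, schemas are modelled as plain dicts, and it assumes that appending a feature is non-breaking while dropping a feature or changing its type is breaking.

```python
def is_breaking_change(old_schema: dict, new_schema: dict) -> bool:
    """Schemas map feature name -> type string.

    Assumption: a change is breaking if an existing feature is dropped
    or its type changes; appended features are treated as non-breaking.
    """
    for name, dtype in old_schema.items():
        if name not in new_schema or new_schema[name] != dtype:
            return True
    return False


def next_version(current: int, old_schema: dict, new_schema: dict) -> int:
    """Bump the version only when the schema change is breaking."""
    if is_breaking_change(old_schema, new_schema):
        return current + 1
    return current


v1 = {"customer_id": "bigint", "amount": "double"}
appended = {**v1, "country": "string"}                      # adds a feature
retyped = {"customer_id": "bigint", "amount": "string"}     # changes a type
dropped = {"customer_id": "bigint"}                         # removes a feature
```

The resulting number is what a pipeline would pass as the `version` argument when creating or retrieving the feature group; a bumped version would also need to be backfilled.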

Pytest for feature logic and feature pipeline tests

Pytest and Great Expectations can be used for testing feature pipelines. Pytest is used to test feature logic and to run end-to-end feature pipeline tests, while Great Expectations is used for data validation tests. A feature pipeline test uses sample data to compute features and validate that they have been written successfully, first to a development feature store. The pipeline can then be pushed to a staging feature store before finally being promoted to production.
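To make the two kinds of Pytest tests concrete, here is a minimal sketch in plain Python. The feature (`amount_zscore`), the pipeline function, and the `InMemoryStore` class are hypothetical; the in-memory store is only a stand-in for a development feature store, so the example runs without a Hopsworks cluster.

```python
import math


def amount_zscore(amounts):
    """Hypothetical feature: z-score of each amount within the batch."""
    if not amounts:
        return []
    mean = sum(amounts) / len(amounts)
    std = math.sqrt(sum((a - mean) ** 2 for a in amounts) / len(amounts))
    if std == 0:
        return [0.0 for _ in amounts]
    return [(a - mean) / std for a in amounts]


class InMemoryStore:
    """Hypothetical stand-in for a development feature store."""

    def __init__(self):
        self.rows = []

    def insert(self, rows):
        self.rows.extend(rows)


def run_pipeline(sample, store):
    """Compute features from sample rows and write them to the store."""
    scores = amount_zscore([row["amount"] for row in sample])
    store.insert(
        [{"id": row["id"], "amount_zscore": z} for row, z in zip(sample, scores)]
    )


# Feature logic tests: check the transformation on small, known inputs.
def test_zscore_is_centered():
    z = amount_zscore([10.0, 20.0, 30.0])
    assert abs(sum(z)) < 1e-9
    assert z[1] == 0.0


def test_zscore_constant_batch():
    assert amount_zscore([5.0, 5.0]) == [0.0, 0.0]


# End-to-end test: run the pipeline on sample data and check what was written.
def test_pipeline_writes_all_rows():
    store = InMemoryStore()
    sample = [{"id": 1, "amount": 10.0}, {"id": 2, "amount": 30.0}]
    run_pipeline(sample, store)
    assert [row["id"] for row in store.rows] == [1, 2]
    assert all("amount_zscore" in row for row in store.rows)
```

Saved as a file such as `test_features.py`, these functions would be collected by running `pytest`. In a real setup the end-to-end test would write to the dev project's feature store and read the data back instead of using the in-memory stand-in.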