Python Environments#
Introduction#
Hopsworks postulates that building ML systems following the FTI pipeline architecture is best practice. This architecture consists of three independently developed and operated ML pipelines:
- Feature Pipeline: takes as input raw data that it transforms into features (and labels)
- Training Pipeline: takes as input features (and labels) and outputs a trained model
- Inference Pipeline: takes new feature data and a trained model and makes predictions.
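To make the separation concrete, the following is a minimal sketch of the three pipelines as plain Python functions. It is only an illustration and uses no Hopsworks API. The record fields (`amount`, `is_large`), the toy threshold "model", and the function names are all assumptions made for this example.

```python
from statistics import mean


def feature_pipeline(raw_rows):
    """Feature pipeline: turn raw records into features and labels."""
    # Hypothetical raw schema: each row has "amount" and "is_large" as strings.
    features = [float(row["amount"]) for row in raw_rows]
    labels = [int(row["is_large"]) for row in raw_rows]
    return features, labels


def training_pipeline(features, labels):
    """Training pipeline: take features and labels, output a trained model."""
    # Toy model: a threshold halfway between the mean of each class.
    positives = [f for f, y in zip(features, labels) if y == 1]
    negatives = [f for f, y in zip(features, labels) if y == 0]
    return {"threshold": (mean(positives) + mean(negatives)) / 2}


def inference_pipeline(model, new_features):
    """Inference pipeline: take new feature data and a model, make predictions."""
    return [int(f > model["threshold"]) for f in new_features]


raw = [
    {"amount": "10", "is_large": "0"},
    {"amount": "20", "is_large": "0"},
    {"amount": "80", "is_large": "1"},
    {"amount": "90", "is_large": "1"},
]
features, labels = feature_pipeline(raw)
model = training_pipeline(features, labels)
predictions = inference_pipeline(model, [15.0, 85.0])
print(model, predictions)
```

Each function only depends on the output of the previous stage, which is what allows the three pipelines to be developed and run independently, each in its own environment.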
To facilitate the development of these pipelines, Hopsworks bundles several Python environments containing the necessary dependencies. Each environment can be customized further by installing additional dependencies from PyPI, Conda, wheel files or GitHub repos, or by applying custom Dockerfiles on top.
Step 1: Go to environments page#
Under the Project settings section you can find the Python environment setting.
Step 2: List available environments#
Environments listed under FEATURE ENGINEERING are the ones you would use in a feature pipeline, those under MODEL TRAINING are used in a training pipeline, and those under MODEL INFERENCE are used in inference pipelines.
Feature engineering#
The FEATURE ENGINEERING environments can be used in Jupyter notebooks, a Python job or a PySpark job.
- python-feature-pipeline for writing feature pipelines using Python
- spark-feature-pipeline for writing feature pipelines using PySpark
Model training#
The MODEL TRAINING environments can be used in Jupyter notebooks or a Python job.
- tensorflow-training-pipeline to train TensorFlow models
- torch-training-pipeline to train PyTorch models
- pandas-training-pipeline to train XGBoost, CatBoost and scikit-learn models
Model inference#
The MODEL INFERENCE environments can be used in a deployment using a custom predictor script.
- tensorflow-inference-pipeline to load and serve TensorFlow models
- torch-inference-pipeline to load and serve PyTorch models
- pandas-inference-pipeline to load and serve XGBoost, CatBoost and scikit-learn models
- minimal-inference-pipeline to install your own custom framework; it contains only a minimal set of dependencies
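The three groups of environments above can be summarised as a small lookup table. The sketch below is purely illustrative: the environment names and usage contexts are copied from this guide, while the dictionary layout and the helper names `environments_for` and `stage_of` are assumptions for the example and are not part of any Hopsworks API.

```python
# Bundled environments grouped by the section they appear under in the UI.
BUNDLED_ENVIRONMENTS = {
    "FEATURE ENGINEERING": {
        "usable_in": ["Jupyter notebook", "Python job", "PySpark job"],
        "environments": [
            "python-feature-pipeline",
            "spark-feature-pipeline",
        ],
    },
    "MODEL TRAINING": {
        "usable_in": ["Jupyter notebook", "Python job"],
        "environments": [
            "tensorflow-training-pipeline",
            "torch-training-pipeline",
            "pandas-training-pipeline",
        ],
    },
    "MODEL INFERENCE": {
        "usable_in": ["Deployment with a custom predictor script"],
        "environments": [
            "tensorflow-inference-pipeline",
            "torch-inference-pipeline",
            "pandas-inference-pipeline",
            "minimal-inference-pipeline",
        ],
    },
}


def environments_for(stage):
    """Return the environment names listed under a stage (case-insensitive)."""
    return sorted(BUNDLED_ENVIRONMENTS[stage.upper()]["environments"])


def stage_of(environment):
    """Return the stage an environment is listed under."""
    for stage, info in BUNDLED_ENVIRONMENTS.items():
        if environment in info["environments"]:
            return stage
    raise KeyError(f"unknown environment: {environment}")


print(environments_for("model training"))
print(stage_of("minimal-inference-pipeline"))
```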
Next steps#
In this guide you learned how to find the bundled Python environments and where they can be used. Now you can test out an environment in a Jupyter notebook.
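As a first check inside a notebook, you might want to see which packages the selected environment actually provides. The sketch below uses only the Python standard library (`importlib.metadata`). The package names passed in are assumptions for illustration, since this guide does not list the exact contents of each environment.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(package_names):
    """Map each package name to its installed version, or None if missing."""
    found = {}
    for name in package_names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


# Hypothetical package names; replace with the ones your pipeline needs.
report = installed_versions(["pandas", "xgboost", "not-a-real-package-xyz"])
for name, installed in report.items():
    print(f"{name}: {installed or 'not installed'}")
```

A package reported as not installed can then be added by customizing the environment as described in the introduction.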