Connection#
connection#
Connection.connection(
    host=None,
    port=443,
    project=None,
    engine=None,
    hostname_verification=False,
    trust_store_path=None,
    cert_folder="/tmp",
    api_key_file=None,
    api_key_value=None,
)
Connection factory method, accessible through `hopsworks.connection()`.

This class provides convenience classmethods accessible from the `hopsworks` module:

Connection factory

For convenience, `hopsworks` provides a factory method, accessible from the top-level module, so you don't have to import the `Connection` class manually:

import hopsworks
conn = hopsworks.connection()
Save API Key as File

To get started quickly, you can create a file containing your previously generated Hopsworks API key and place it in the environment from which you wish to connect to Hopsworks.

You can then connect by passing the path to the key file when instantiating a connection:

import hopsworks
conn = hopsworks.connection(
    'my_instance',                  # DNS of your Hopsworks instance
    443,                            # Port to reach your Hopsworks instance, defaults to 443
    api_key_file='hopsworks.key',   # The file containing the API key generated above
    hostname_verification=True,     # Disable for self-signed certificates
)
project = conn.get_project("my_project")

Clients in external clusters need to connect to Hopsworks using an API key. The API key is generated inside the Hopsworks platform and requires at least the "project" scope to be able to access a project. For more information, see the integration guides.
Arguments

- `host` (`str | None`): The hostname of the Hopsworks instance, in the form `[UUID].cloud.hopsworks.ai`, defaults to `None`. Do not include `https://` in the URL when connecting programmatically.
- `port` (`int`): The port on which the Hopsworks instance can be reached, defaults to `443`.
- `project` (`str | None`): The name of the project to connect to. When running on Hopsworks, this defaults to the project from which the client is run. Defaults to `None`.
- `engine` (`str | None`): Which engine to use: `"spark"`, `"python"`, or `"training"`. Defaults to `None`, which initializes the engine to Spark if the environment provides Spark, for example on Hopsworks or Databricks, or falls back to Hive in Python if Spark is not available, e.g. in local Python environments or on AWS SageMaker. This option allows you to override that behaviour. The `"training"` engine is useful when only feature store metadata is needed, for example the training dataset location and label information when a Hopsworks training experiment is conducted.
- `hostname_verification` (`bool`): Whether or not to verify Hopsworks' certificate, defaults to `True`.
- `trust_store_path` (`str | None`): Path on the file system containing the Hopsworks certificates, defaults to `None`.
- `cert_folder` (`str`): The directory in which to store retrieved HopsFS certificates, defaults to `"/tmp"`. Only required when running without a Spark environment.
- `api_key_file` (`str | None`): Path to a file containing the API key, defaults to `None`.
- `api_key_value` (`str | None`): API key as a string. If provided, `api_key_file` will be ignored. This should be used with care, especially if the notebook or job script is accessible by multiple parties. Defaults to `None`.

Returns

`Connection`. A connection handle to perform operations on a Hopsworks project.
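The precedence between `api_key_value` and `api_key_file` described above can be sketched as follows. This is an illustrative helper, not part of the hopsworks API; `resolve_api_key` is a hypothetical name.

```python
from pathlib import Path

def resolve_api_key(api_key_value=None, api_key_file=None):
    """Hypothetical helper mirroring the documented precedence:
    an explicit api_key_value takes priority, and api_key_file
    is only read when no value is given."""
    if api_key_value is not None:
        return api_key_value
    if api_key_file is not None:
        # Strip trailing newlines that editors often add to key files
        return Path(api_key_file).read_text().strip()
    return None
```

Because a key passed as `api_key_value` may end up in notebook history or logs, the file-based variant is generally preferable on shared environments.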
Properties#
api_key_file#
api_key_value#
cert_folder#
host#
hostname_verification#
port#
project#
trust_store_path#
Methods#
close#
Connection.close()
Close a connection gracefully.
This will clean up any materialized certificates on the local file system of external environments such as AWS SageMaker.
Usage is optional.
Example
import hopsworks
conn = hopsworks.connection()
conn.close()
connect#
Connection.connect()
Instantiate the connection.
Creating a `Connection` object implicitly calls this method for you to instantiate the connection. However, it is possible to close the connection gracefully with the `close()` method, in order to clean up materialized certificates. This might be desired when working in external environments such as AWS SageMaker. Subsequently, you can call `connect()` again to reopen the connection.
Example
import hopsworks
conn = hopsworks.connection()
conn.close()
conn.connect()
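The lifecycle above (instantiation connects implicitly, `close()` cleans up, `connect()` reopens) can be sketched with a minimal stand-in class. This is not the hopsworks implementation, only an illustration of the state transitions:

```python
class ConnectionSketch:
    """Minimal stand-in for the connect/close lifecycle described above."""

    def __init__(self):
        self.connected = False
        self.connect()  # creating the object implicitly connects

    def connect(self):
        # In hopsworks, this would set up clients and certificates.
        self.connected = True

    def close(self):
        # In hopsworks, this also removes materialized certificates
        # from the local file system, e.g. on AWS SageMaker.
        self.connected = False

conn = ConnectionSketch()   # connected on creation
conn.close()                # gracefully closed, certificates cleaned up
conn.connect()              # reopened
```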
connection#
Connection.connection(
    host=None,
    port=443,
    project=None,
    engine=None,
    hostname_verification=False,
    trust_store_path=None,
    cert_folder="/tmp",
    api_key_file=None,
    api_key_value=None,
)
Connection factory method, accessible through `hopsworks.connection()`.

This class provides convenience classmethods accessible from the `hopsworks` module:

Connection factory

For convenience, `hopsworks` provides a factory method, accessible from the top-level module, so you don't have to import the `Connection` class manually:

import hopsworks
conn = hopsworks.connection()
Save API Key as File

To get started quickly, you can create a file containing your previously generated Hopsworks API key and place it in the environment from which you wish to connect to Hopsworks.

You can then connect by passing the path to the key file when instantiating a connection:

import hopsworks
conn = hopsworks.connection(
    'my_instance',                  # DNS of your Hopsworks instance
    443,                            # Port to reach your Hopsworks instance, defaults to 443
    api_key_file='hopsworks.key',   # The file containing the API key generated above
    hostname_verification=True,     # Disable for self-signed certificates
)
project = conn.get_project("my_project")

Clients in external clusters need to connect to Hopsworks using an API key. The API key is generated inside the Hopsworks platform and requires at least the "project" scope to be able to access a project. For more information, see the integration guides.
Arguments

- `host` (`str | None`): The hostname of the Hopsworks instance, in the form `[UUID].cloud.hopsworks.ai`, defaults to `None`. Do not include `https://` in the URL when connecting programmatically.
- `port` (`int`): The port on which the Hopsworks instance can be reached, defaults to `443`.
- `project` (`str | None`): The name of the project to connect to. When running on Hopsworks, this defaults to the project from which the client is run. Defaults to `None`.
- `engine` (`str | None`): Which engine to use: `"spark"`, `"python"`, or `"training"`. Defaults to `None`, which initializes the engine to Spark if the environment provides Spark, for example on Hopsworks or Databricks, or falls back to Hive in Python if Spark is not available, e.g. in local Python environments or on AWS SageMaker. This option allows you to override that behaviour. The `"training"` engine is useful when only feature store metadata is needed, for example the training dataset location and label information when a Hopsworks training experiment is conducted.
- `hostname_verification` (`bool`): Whether or not to verify Hopsworks' certificate, defaults to `True`.
- `trust_store_path` (`str | None`): Path on the file system containing the Hopsworks certificates, defaults to `None`.
- `cert_folder` (`str`): The directory in which to store retrieved HopsFS certificates, defaults to `"/tmp"`. Only required when running without a Spark environment.
- `api_key_file` (`str | None`): Path to a file containing the API key, defaults to `None`.
- `api_key_value` (`str | None`): API key as a string. If provided, `api_key_file` will be ignored. This should be used with care, especially if the notebook or job script is accessible by multiple parties. Defaults to `None`.

Returns

`Connection`. A connection handle to perform operations on a Hopsworks project.
create_project#
Connection.create_project(name, description=None, feature_store_topic=None)
Create a new project.
Example for creating a new project
import hopsworks
connection = hopsworks.connection()
connection.create_project("my_hopsworks_project", description="An example Hopsworks project")
Arguments

- `name` (`str`): The name of the project.
- `description` (`str`): Optional description of the project.
- `feature_store_topic` (`str`): Optional feature store topic name.
Returns

`Project`. A project handle object to perform operations on.
get_feature_store#
Connection.get_feature_store(name=None, engine=None)
Get a reference to a feature store to perform operations on.
Defaults to the feature store of the connected project. To get a shared feature store, the project name of the owning feature store is required.

Arguments

- `name` (`str | None`): The name of the feature store, defaults to `None`.
- `engine` (`str | None`): Which engine to use: `"spark"`, `"python"`, or `"training"`. Defaults to `None`, which initializes the engine to Spark if the environment provides Spark, for example on Hopsworks or Databricks, or falls back to Hive in Python if Spark is not available, e.g. in local Python environments or on AWS SageMaker. This option allows you to override that behaviour. The `"training"` engine is useful when only feature store metadata is needed, for example the training dataset location and label information when a Hopsworks training experiment is conducted.
Returns

`FeatureStore`. A feature store handle object to perform operations on.
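The name-defaulting behaviour described above can be sketched as follows; `resolve_feature_store_name` is a hypothetical helper, and the `<project>_featurestore` naming is an assumption about the conventional default feature store name:

```python
def resolve_feature_store_name(project_name, name=None):
    """Hypothetical sketch of the documented defaulting: with no
    explicit name, fall back to the connected project's feature
    store (assumed to follow the '<project>_featurestore' convention)."""
    if name is not None:
        return name  # e.g. a feature store shared from another project
    return f"{project_name}_featurestore"
```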
get_model_registry#
Connection.get_model_registry(project=None)
Get a reference to a model registry to perform operations on, defaulting to the project's default model registry. Shared model registries can be retrieved by passing the `project` argument.

Arguments

- `project` (`str`): The name of the project that owns the shared model registry. The model registry must be shared with the project the connection was established for. Defaults to `None`.

Returns

`ModelRegistry`. A model registry handle object to perform operations on.
get_model_serving#
Connection.get_model_serving()
Get a reference to model serving to perform operations on. Model serving operates on top of a model registry, defaulting to the project's default model registry.
Example
import hopsworks
project = hopsworks.login()
ms = project.get_model_serving()
Returns

`ModelServing`. A model serving handle object to perform operations on.
get_project#
Connection.get_project(name=None)
Get an existing project.
Arguments
- `name` (`str`): The name of the project.

Returns

`Project`. A project handle object to perform operations on.
get_projects#
Connection.get_projects()
Get all projects.
Returns

`List[Project]`. A list of `Project` objects.
get_secrets_api#
Connection.get_secrets_api()
Get the secrets api.
Returns

`SecretsApi`. The secrets API handle.
project_exists#
Connection.project_exists(name)
Check if a project exists.
Arguments
- `name` (`str`): The name of the project.

Returns

`bool`. `True` if the project exists, otherwise `False`.
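`project_exists` combines naturally with `get_project` and `create_project` into a get-or-create pattern. The following is a sketch; `get_or_create_project` is a hypothetical wrapper, not part of the hopsworks API:

```python
def get_or_create_project(conn, name, description=None):
    """Hypothetical convenience wrapper: return the project if it
    already exists on the connected instance, otherwise create it."""
    if conn.project_exists(name):
        return conn.get_project(name)
    return conn.create_project(name, description=description)
```

Checking first avoids relying on the error raised when creating a project whose name is already taken.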