Transformation Function#

TransformationFunction#

hsfs.transformation_function.TransformationFunction(
    featurestore_id,
    hopsworks_udf,
    version=None,
    id=None,
    transformation_type=None,
    type=None,
    items=None,
    count=None,
    href=None,
    **kwargs
)

DTO class for transformation functions.

Arguments

  • featurestore_id : int. Id of the feature store in which the transformation function is saved.
  • hopsworks_udf : HopsworksUdf. The metadata object for the UDF in Hopsworks, which can be created using the @udf decorator.
  • version : int. The version of the transformation function.
  • id : int. The id of the transformation function in the feature store.
  • transformation_type : UDFType. The type of the transformation function. Can be "on-demand" or "model-dependent".

Properties#

hopsworks_udf#

Metadata class for the user-defined transformation function.


id#

Transformation function id.


output_column_names#

Names of the output columns generated by the transformation function.


transformation_statistics#

Feature statistics required by the defined UDF.


transformation_type#

Type of the transformation function. Can be "on-demand" or "model-dependent".


version#

Version of the transformation function.
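
The properties above can be read directly from a transformation function object. The helper below is a minimal sketch that collects a few of them into a dictionary. The name `summarize_transformation_function` is hypothetical, and the `SimpleNamespace` stand-in with made-up values exists only so the snippet runs without a feature store connection.

```python
from types import SimpleNamespace


def summarize_transformation_function(tf):
    """Collect some documented properties of a transformation function."""
    return {
        "id": tf.id,
        "version": tf.version,
        "transformation_type": tf.transformation_type,
        "output_column_names": list(tf.output_column_names),
    }


# Stand-in object with made-up values, used instead of a real
# TransformationFunction so the example runs offline.
fake_tf = SimpleNamespace(
    id=7,
    version=1,
    transformation_type="model-dependent",
    output_column_names=["example_output_column"],
)

summary = summarize_transformation_function(fake_tf)
print(summary)
```

With a real feature store, the same helper could be passed the object returned by `fs.get_transformation_function(...)`.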


Methods#

delete#

TransformationFunction.delete()

Delete transformation function from backend.

Example

# get feature store instance
fs = ...

# import the Hopsworks udf decorator
from hopsworks import udf

# define function
@udf(int)
def plus_one(value):
    return value + 1

# create transformation function
plus_one_meta = fs.create_transformation_function(
    transformation_function=plus_one,
    version=1
)

# persist transformation function in backend
plus_one_meta.save()

# retrieve transformation function
plus_one_fn = fs.get_transformation_function(name="plus_one")

# delete transformation function from backend
plus_one_fn.delete()

save#

TransformationFunction.save()

Save a transformation function into the backend.

Example

# get feature store instance
fs = ...

# import the Hopsworks udf decorator
from hopsworks import udf

# define function
@udf(int)
def plus_one(value):
    return value + 1

# create transformation function
plus_one_meta = fs.create_transformation_function(
    transformation_function=plus_one,
    version=1
)

# persist transformation function in backend
plus_one_meta.save()

Creation#

create_transformation_function#

FeatureStore.create_transformation_function(transformation_function, version=None)

Create a transformation function metadata object.

Example

# get feature store instance
fs = ...

# import the Hopsworks udf decorator
from hopsworks import udf

# define the transformation function as a Hopsworks UDF
@udf(int)
def plus_one(value):
    return value + 1

# create transformation function
plus_one_meta = fs.create_transformation_function(
    transformation_function=plus_one,
    version=1
)

# persist transformation function in backend
plus_one_meta.save()

Lazy

This method is lazy and does not persist the transformation function in the feature store on its own. To persist the transformation function, call the save() method of the returned transformation function metadata object.
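
The create-then-save pattern described in this note can be sketched with plain Python. Everything in the block below is hypothetical: `InMemoryStore` and `LazyMetadata` are toy stand-ins written for illustration and are not Hopsworks classes. The point is only that constructing the metadata object leaves the backend untouched until `save()` is called.

```python
class InMemoryStore:
    """Hypothetical stand-in for the backend; not part of Hopsworks."""

    def __init__(self):
        self.saved = {}


class LazyMetadata:
    """Hypothetical metadata object that persists only when save() is called."""

    def __init__(self, store, name, version):
        self._store = store
        self.name = name
        self.version = version

    def save(self):
        # Persisting is an explicit, separate step.
        self._store.saved[(self.name, self.version)] = self


store = InMemoryStore()

# Creating the metadata object does not touch the store.
meta = LazyMetadata(store, name="plus_one", version=1)
count_after_create = len(store.saved)

# Only save() writes to the store.
meta.save()
count_after_save = len(store.saved)

print(count_after_create, count_after_save)
```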

Arguments

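  • transformation_function hsfs.hopsworks_udf.HopsworksUdf: Hopsworks UDF from which the transformation function is created.
  • version int | None: version of the transformation function. Optional.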

Returns:

TransformationFunction: The TransformationFunction metadata object.


Retrieval#

get_transformation_function#

FeatureStore.get_transformation_function(name, version=None)

Get transformation function metadata object.

Get transformation function by name. This defaults to version 1.

# get feature store instance
fs = ...

# get transformation function metadata object
plus_one_fn = fs.get_transformation_function(name="plus_one")

Get the built-in min_max_scaler transformation function

# get feature store instance
fs = ...

# get transformation function metadata object
min_max_scaler_fn = fs.get_transformation_function(name="min_max_scaler")

Get transformation function by name and version

# get feature store instance
fs = ...

# get transformation function metadata object
min_max_scaler = fs.get_transformation_function(name="min_max_scaler", version=2)

You can attach transformation functions to a feature view by passing a list in which each transformation function is called with the name of the feature it applies to. The transformation functions are then applied when you read training data, get batch data, or get feature vector(s).

Attach transformation functions to the feature view

# get feature store instance
fs = ...

# define query object
query = ...

# get transformation function metadata object
min_max_scaler = fs.get_transformation_function(name="min_max_scaler", version=1)

# attach transformation functions
feature_view = fs.create_feature_view(
    name='feature_view_name',
    query=query,
    labels=["target_column"],
    transformation_functions=[min_max_scaler("feature1")]
)

Built-in transformation functions are attached in the same way. The only difference is that the statistics each function needs are computed in the background: for example, the min and max values for min_max_scaler, or the mean and standard deviation for standard_scaler.

Attach built-in transformation functions to the feature view

# get feature store instance
fs = ...

# define query object
query = ...

# retrieve transformation functions
min_max_scaler = fs.get_transformation_function(name="min_max_scaler")
standard_scaler = fs.get_transformation_function(name="standard_scaler")
robust_scaler = fs.get_transformation_function(name="robust_scaler")
label_encoder = fs.get_transformation_function(name="label_encoder")

# attach built-in transformation functions while creating feature view
feature_view = fs.create_feature_view(
    name='transactions_view',
    query=query,
    labels=["fraud_label"],
    transformation_functions=[
        label_encoder("category_column"),
        robust_scaler("weight"),
        min_max_scaler("age"),
        standard_scaler("salary")
    ]
)
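
To make the role of these statistics concrete, the sketch below applies the textbook min-max and standard-score formulas to a few values, given statistics that are supplied by hand. It illustrates the arithmetic only. It is an assumption that the built-in functions follow these textbook definitions, the helper names are hypothetical, and the statistics are made-up values rather than anything computed by Hopsworks.

```python
def min_max_scale(values, min_value, max_value):
    """Textbook min-max scaling: (v - min) / (max - min)."""
    span = max_value - min_value
    if span == 0:
        raise ValueError("min_value and max_value must differ")
    return [(v - min_value) / span for v in values]


def standard_scale(values, mean, std_dev):
    """Textbook standard score: (v - mean) / std_dev."""
    if std_dev == 0:
        raise ValueError("std_dev must be non-zero")
    return [(v - mean) / std_dev for v in values]


ages = [20.0, 30.0, 40.0]

# Made-up training-set statistics, supplied by hand for illustration.
scaled_ages = min_max_scale(ages, min_value=20.0, max_value=40.0)
standardized_ages = standard_scale(ages, mean=30.0, std_dev=10.0)

print(scaled_ages)
print(standardized_ages)
```

Because the statistics are computed for you in the background, none of these values need to be passed when a built-in function is attached to a feature view.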

Arguments

  • name str: name of transformation function.
  • version int | None: version of the transformation function. Optional; if not provided, all functions that match the provided name are retrieved.

Returns:

TransformationFunction: The TransformationFunction metadata object.


get_transformation_functions#

FeatureStore.get_transformation_functions()

Get all transformation functions metadata objects.

Get all transformation functions

# get feature store instance
fs = ...

# get all transformation functions
list_transformation_fns = fs.get_transformation_functions()

Returns:

List[TransformationFunction]. List of transformation function instances.