Transformation Function#
TransformationFunction#
hsfs.transformation_function.TransformationFunction(
featurestore_id,
transformation_fn=None,
version=None,
name=None,
source_code_content=None,
builtin_source_code=None,
output_type=None,
id=None,
type=None,
items=None,
count=None,
href=None,
**kwargs
)
Properties#
id#
Transformation function id.
name#
output_type#
source_code_content#
transformation_fn#
transformer_code#
version#
Methods#
delete#
TransformationFunction.delete()
Delete transformation function from backend.
Example
# define function
def plus_one(value):
    return value + 1
# create transformation function
plus_one_meta = fs.create_transformation_function(
    transformation_function=plus_one,
    output_type=int,
    version=1
)
# persist transformation function in backend
plus_one_meta.save()
# retrieve transformation function
plus_one_fn = fs.get_transformation_function(name="plus_one")
# delete transformation function from backend
plus_one_fn.delete()
save#
TransformationFunction.save()
Persist transformation function in backend.
Example
# define function
def plus_one(value):
    return value + 1
# create transformation function
plus_one_meta = fs.create_transformation_function(
    transformation_function=plus_one,
    output_type=int,
    version=1
)
# persist transformation function in backend
plus_one_meta.save()
Creation#
create_transformation_function#
FeatureStore.create_transformation_function(transformation_function, output_type, version=None)
Create a transformation function metadata object.
Example
# define function
def plus_one(value):
    return value + 1
# create transformation function
plus_one_meta = fs.create_transformation_function(
    transformation_function=plus_one,
    output_type=int,
    version=1
)
# persist transformation function in backend
plus_one_meta.save()
Lazy
This method is lazy and does not persist the transformation function in the feature store on its own. To save the transformation function in the backend, call the save()
method of the returned metadata object.
Arguments
- transformation_function (callable): the callable object to register.
- output_type (str | bytes | int | numpy.int8 | numpy.int16 | numpy.int32 | numpy.int64 | float | numpy.float64 | datetime.datetime | numpy.datetime64 | datetime.date | bool): Python or numpy output type that will be inferred as a pyspark.sql.types type.
Returns:
TransformationFunction
: The TransformationFunction metadata object.
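As a rough illustration of that type inference, a mapping from Python output types to Spark SQL type names could look like the sketch below. This is not the hsfs implementation: the helper infer_spark_type, the exact type-name choices (e.g. int to IntegerType), and the lookup-table approach are all assumptions; the numpy types listed above would map analogously.

```python
import datetime

# Hypothetical sketch of how Python output types could be inferred as
# Spark SQL types. The real inference is internal to hsfs and may differ;
# numpy types (numpy.int32, numpy.float64, ...) would map analogously.
TYPE_MAP = {
    str: "StringType()",
    bytes: "BinaryType()",
    int: "IntegerType()",        # assumption: plain int -> IntegerType
    float: "DoubleType()",       # assumption: plain float -> DoubleType
    datetime.datetime: "TimestampType()",
    datetime.date: "DateType()",
    bool: "BooleanType()",
}

def infer_spark_type(py_type):
    """Return the Spark SQL type name for a supported output type."""
    try:
        return TYPE_MAP[py_type]
    except KeyError:
        raise TypeError(f"unsupported output type: {py_type!r}")
```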
Retrieval#
get_transformation_function#
FeatureStore.get_transformation_function(name, version=None)
Get transformation function metadata object.
Get a transformation function by name. This defaults to version 1.
# get feature store instance
fs = ...
# get transformation function metadata object
plus_one_fn = fs.get_transformation_function(name="plus_one")
Get built-in transformation function min max scaler
# get feature store instance
fs = ...
# get transformation function metadata object
min_max_scaler_fn = fs.get_transformation_function(name="min_max_scaler")
Get transformation function by name and version
# get feature store instance
fs = ...
# get transformation function metadata object
min_max_scaler = fs.get_transformation_function(name="min_max_scaler", version=2)
Transformation functions can be passed to a feature view as a dict, where the key is the feature name and the value is the transformation function instance. The transformation functions are then applied when you read training data, get batch data, or get feature vectors.
Attach transformation functions to the feature view
# get feature store instance
fs = ...
# define query object
query = ...
# get transformation function metadata object
min_max_scaler = fs.get_transformation_function(name="min_max_scaler", version=1)
# attach transformation functions
feature_view = fs.create_feature_view(
    name='feature_view_name',
    query=query,
    labels=["target_column"],
    transformation_functions={
        "column_to_transform": min_max_scaler
    }
)
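Conceptually, attaching such a dict means each named function is applied to its feature whenever data is read. The plain-Python sketch below illustrates that mechanism only; the helper apply_transformations, the fixed scaling statistics, and the dict-based row format are illustrative and not part of the hsfs API.

```python
def apply_transformations(row, transformation_functions):
    """Apply per-feature transformation functions to a feature row.

    row: dict mapping feature name -> raw value
    transformation_functions: dict mapping feature name -> callable
    Features without an entry pass through unchanged.
    """
    return {
        name: transformation_functions.get(name, lambda v: v)(value)
        for name, value in row.items()
    }

def min_max_scale(value, min_value=0.0, max_value=100.0):
    # illustrative min-max scaling with fixed, made-up statistics
    return (value - min_value) / (max_value - min_value)

row = {"column_to_transform": 25.0, "other_feature": 7}
transformed = apply_transformations(
    row, {"column_to_transform": min_max_scale}
)
# transformed["column_to_transform"] is 0.25; "other_feature" is unchanged
```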
Built-in transformation functions are attached in the same way. The only difference is that the statistics required by the specific function are computed in the background: for example, the min and max values for min_max_scaler
, or the mean and standard deviation for standard_scaler
.
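For intuition, the built-in scalers roughly correspond to the formulas below, evaluated with statistics computed from the data. This is a plain-Python sketch, not the hsfs implementations; in particular, the exact statistics robust_scaler uses (median and interquartile range here) are an assumption.

```python
def min_max_scaler(value, min_value, max_value):
    # scales value into [0, 1] using the feature's min and max
    return (value - min_value) / (max_value - min_value)

def standard_scaler(value, mean, std_dev):
    # centers by the mean and scales by the standard deviation
    return (value - mean) / std_dev

def robust_scaler(value, median, q1, q3):
    # centers by the median and scales by the interquartile range,
    # which makes the result less sensitive to outliers
    return (value - median) / (q3 - q1)
```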
Attach built-in transformation functions to the feature view
# get feature store instance
fs = ...
# define query object
query = ...
# retrieve transformation functions
min_max_scaler = fs.get_transformation_function(name="min_max_scaler")
standard_scaler = fs.get_transformation_function(name="standard_scaler")
robust_scaler = fs.get_transformation_function(name="robust_scaler")
label_encoder = fs.get_transformation_function(name="label_encoder")
# attach built-in transformation functions while creating feature view
feature_view = fs.create_feature_view(
    name='transactions_view',
    query=query,
    labels=["fraud_label"],
    transformation_functions={
        "category_column": label_encoder,
        "weight": robust_scaler,
        "age": min_max_scaler,
        "salary": standard_scaler
    }
)
Arguments
- name (str): name of the transformation function.
- version (int | None): version of the transformation function. Optional; if not provided, all functions matching the provided name will be retrieved.
Returns:
TransformationFunction
: The TransformationFunction metadata object.
get_transformation_functions#
FeatureStore.get_transformation_functions()
Get all transformation functions metadata objects.
Get all transformation functions
# get feature store instance
fs = ...
# get all transformation functions
list_transformation_fns = fs.get_transformation_functions()
Returns:
List[TransformationFunction]
: List of transformation function instances.