HopsworksUDF#

HopsworksUdf#

hsfs.hopsworks_udf.HopsworksUdf(
    func,
    return_types,
    execution_mode,
    name=None,
    transformation_features=None,
    transformation_function_argument_names=None,
    dropped_argument_names=None,
    dropped_feature_names=None,
    feature_name_prefix=None,
    output_column_names=None,
    generate_output_col_names=True,
)

Metadata for user-defined functions.

Stores the metadata required to execute the user-defined function in both the Spark and Python engines. The class uses this metadata to dynamically generate a user-defined function suited to the engine it is executed in.

Arguments

func : Union[Callable, str]. The transformation function object or the source code of the transformation function.
return_types : Union[List[type], type, List[str], str]. A Python type or a list of Python types that denote the data types of the columns output by the transformation function.
name : Optional[str]. Name of the transformation function.
transformation_features : Optional[List[TransformationFeature]]. A list of TransformationFeature objects that map each feature used for the transformation to its corresponding statistics argument name, if any.
transformation_function_argument_names : Optional[List[str]]. The argument names of the transformation function.
dropped_argument_names : Optional[List[str]]. The arguments to be dropped from the final DataFrame after the transformation functions are applied.
dropped_feature_names : Optional[List[str]]. The feature names corresponding to the argument names that are dropped.
feature_name_prefix : Optional[str]. Prefix, if any, used in the feature view.
output_column_names : Optional[List[str]]. The names of the output columns returned from the transformation function.
generate_output_col_names : bool. Generate default output column names for the transformation function. Defaults to True.
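The func argument accepts either a function object or the function's source code. As a rough illustration of how those two forms can be reduced to one callable, here is a minimal sketch using only the standard library. The helper resolve_function and the sample add_one function are hypothetical and are not part of hsfs; the library's actual handling may differ.

```python
from typing import Callable, Union


def resolve_function(func: Union[Callable, str], name: str) -> Callable:
    """Return a callable from a function object or from its source code.

    Hypothetical helper for illustration only; not the hsfs implementation.
    """
    if callable(func):
        # Already a function object: use it as is.
        return func
    namespace: dict = {}
    # Executing source text is only appropriate for code you trust.
    exec(func, namespace)
    return namespace[name]


# Source code of a sample transformation function (assumed for this example).
source = "def add_one(value):\n    return value + 1\n"

from_source = resolve_function(source, "add_one")
from_object = resolve_function(from_source, "add_one")
result = from_source(41)
```

Both calls yield the same callable here, which is why a single metadata object can be built from either form.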

Properties#

dropped_features#

List of features that will be dropped after the UDF is applied.

execution_mode#

feature_name_prefix#

The prefix that is added to the feature names.

function_name#

The function name of the UDF.

output_column_names#

Output column names of the transformation function.

return_types#

The output types of the UDF.

statistics_features#

List of feature names that require statistics.

statistics_required#

Whether the UDF requires statistics for any feature.

transformation_context#

Dictionary that contains the context variables required by the UDF. These context variables are passed to the UDF during execution.

transformation_features#

List of feature names to be used in the User Defined Function.

transformation_statistics#

Feature statistics required by the defined UDF.

unprefixed_transformation_features#

List of feature names used in the transformation function, without the feature name prefix.
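The relationship between prefixed and unprefixed feature names can be pictured with a small sketch. The helpers add_prefix and remove_prefix and the prefix "fg1_" are assumptions made for this example; they are not hsfs functions, and the library may compute these properties differently.

```python
from typing import List, Optional


def add_prefix(feature_names: List[str], prefix: Optional[str]) -> List[str]:
    """Prepend the prefix to every feature name (no-op when prefix is empty)."""
    if not prefix:
        return list(feature_names)
    return [prefix + name for name in feature_names]


def remove_prefix(feature_names: List[str], prefix: Optional[str]) -> List[str]:
    """Strip the prefix from names that start with it; leave others untouched."""
    if not prefix:
        return list(feature_names)
    return [
        name[len(prefix):] if name.startswith(prefix) else name
        for name in feature_names
    ]


unprefixed = ["amount", "age"]
prefixed = add_prefix(unprefixed, "fg1_")    # names as seen with a prefix
restored = remove_prefix(prefixed, "fg1_")   # names as written in the function
```

In this sketch, prefixed plays the role of transformation_features and restored the role of unprefixed_transformation_features.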

Methods#

alias#

HopsworksUdf.alias(*args)

Set the names of the transformed features output by the UDF.
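To make the idea of aliasing concrete, the sketch below uses a stand-in class rather than HopsworksUdf itself. The class AliasedFunction, the check that the number of names matches the number of return types, the uniqueness check, and the choice to return self are all assumptions made for illustration; consult the library source for the actual behavior.

```python
from typing import List


class AliasedFunction:
    """Hypothetical stand-in for a UDF metadata object; not the hsfs class."""

    def __init__(self, name: str, return_types: List[type]) -> None:
        self.name = name
        self.return_types = return_types
        self.output_column_names: List[str] = []

    def alias(self, *args: str) -> "AliasedFunction":
        names = list(args)
        # Assumed rule: one output name per returned column.
        if len(names) != len(self.return_types):
            raise ValueError(
                f"Expected {len(self.return_types)} names, got {len(names)}."
            )
        # Assumed rule: output names must be unique.
        if len(set(names)) != len(names):
            raise ValueError("Output column names must be unique.")
        self.output_column_names = names
        # Returning self allows the call to be chained (an assumption here).
        return self


scaler = AliasedFunction("min_max", [float, float]).alias("low", "high")
```

After the call, scaler.output_column_names holds the two supplied names.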

from_response_json#

HopsworksUdf.from_response_json(json_dict)

Construct the class object from its JSON serialization.

Arguments

json_dict : Dict[str, Any]. JSON-serialized dictionary for the class.

Returns

HopsworksUdf: JSON-deserialized class object.

json#

HopsworksUdf.json()

Convert the class into its JSON-serialized form.

Returns

str: JSON-serialized object.

to_dict#

HopsworksUdf.to_dict()

Convert the class into a dictionary.

Returns

Dict: Dictionary that contains all data required to JSON-serialize the object.
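The three methods from_response_json, json, and to_dict form a serialization round trip. The sketch below shows that pattern on a stand-in class built with the standard library only. The class UdfMetadata and its dictionary keys are hypothetical; the real payload produced by HopsworksUdf contains different and additional fields.

```python
import json
from typing import Any, Dict, List


class UdfMetadata:
    """Hypothetical stand-in showing the round-trip pattern; not the hsfs class."""

    def __init__(self, name: str, return_types: List[str],
                 output_column_names: List[str]) -> None:
        self.name = name
        self.return_types = return_types
        self.output_column_names = output_column_names

    def to_dict(self) -> Dict[str, Any]:
        # Collect everything needed to rebuild the object (keys are assumed).
        return {
            "name": self.name,
            "return_types": self.return_types,
            "output_column_names": self.output_column_names,
        }

    def json(self) -> str:
        # Serialize the dictionary form to a JSON string.
        return json.dumps(self.to_dict())

    @classmethod
    def from_response_json(cls, json_dict: Dict[str, Any]) -> "UdfMetadata":
        # Rebuild the object from an already-parsed JSON dictionary.
        return cls(
            name=json_dict["name"],
            return_types=json_dict["return_types"],
            output_column_names=json_dict["output_column_names"],
        )


original = UdfMetadata("add_one", ["int"], ["add_one_amount"])
restored = UdfMetadata.from_response_json(json.loads(original.json()))
```

Keeping to_dict as the single source of truth means json and from_response_json cannot drift apart on field names.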

TransformationFeature#

TransformationFeature#

hsfs.hopsworks_udf.TransformationFeature(feature_name, statistic_argument_name)

Mapping of feature names to their corresponding statistics argument names in the code.

The statistic_argument_name for a feature name would be None if the feature does not need statistics.

Arguments

feature_name : str. Name of the feature.
statistic_argument_name : Optional[str]. Name of the statistics argument in the code for the feature given by feature_name, or None if the feature does not need statistics.
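The mapping described above, where a missing statistics argument is represented by None, can be sketched with a plain dataclass. The class FeatureMapping is a stand-in written for this example, not the hsfs TransformationFeature, and the feature and argument names are arbitrary.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FeatureMapping:
    """Hypothetical stand-in mirroring the two documented fields."""

    feature_name: str
    statistic_argument_name: Optional[str] = None


features: List[FeatureMapping] = [
    FeatureMapping("amount", "statistics_amount"),  # needs statistics
    FeatureMapping("country", None),                # does not need statistics
]

# Features with a statistics argument are the ones that require statistics.
statistics_features = [
    f.feature_name for f in features if f.statistic_argument_name is not None
]
statistics_required = bool(statistics_features)
```

Filtering on None in this way is one plausible route to values like those exposed by the statistics_features and statistics_required properties.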