
Validation#

[source]

ValidationResult#

hsfs.validation_result.ValidationResult(
    status,
    message,
    value,
    features,
    rule,
    href=None,
    expand=None,
    items=None,
    count=None,
    type=None,
)

Metadata object representing the validation result of a single rule of an expectation result of a Feature Group.


Properties#

[source]

features#

Feature of the validation result on which the rule was applied.


[source]

message#

Message describing the outcome of applying the rule against the feature.


[source]

rule#

Rule that was applied to the feature to produce this validation result.


[source]

status#


[source]

value#

The computed value of the feature according to the rule.
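As a minimal sketch of how the properties above might be consumed, the helper below groups validation results by their `status`. The name `group_by_status` is hypothetical and not part of hsfs. The sketch assumes only that each result exposes the documented `status`, `rule`, `features`, `message` and `value` properties, and it makes no assumption about which status values exist.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, List, Tuple


def group_by_status(
    results: Iterable[Any],
) -> Dict[Any, List[Tuple[Any, Any, Any, Any]]]:
    """Group ValidationResult-like objects by their status.

    Hypothetical helper: it relies only on the documented properties
    (status, rule, features, message, value) and treats them as opaque.
    """
    grouped = defaultdict(list)
    for result in results:
        # Keep one compact tuple per result under its status key.
        grouped[result.status].append(
            (result.rule, result.features, result.message, result.value)
        )
    return dict(grouped)
```

Grouping this way makes it easy to print or log all results that share a status without hard-coding the set of possible statuses.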


Methods#


Validate a dataframe#

[source]

validate#

FeatureGroup.validate(
    dataframe=None, log_activity=False, save_report=False, validation_options={}
)

Run validation based on the attached expectations.

Runs any expectations attached with Deequ, as well as any attached Great Expectations suites.

Arguments

  • dataframe Optional[Union[pandas.DataFrame, pyspark.sql.DataFrame]]: The pandas or PySpark dataframe to run the data validation expectations against.
  • log_activity Optional[bool]: Boolean to indicate whether to persist validation results along with the feature group. Defaults to False.
  • expectation_suite: An Expectation Suite that overrides the one attached to the feature group, if any. This is useful for testing new Expectation Suites. When a suite is provided here, the results are never persisted. Defaults to None.
  • validation_options Optional[Dict[Any, Any]]: Additional validation options as key-value pairs, defaults to {}.
    • key run_validation boolean value, set to False to skip validation temporarily on ingestion.
    • key save_report boolean value, set to False to skip upload of the validation report to Hopsworks.
    • key ge_validate_kwargs a dictionary containing kwargs for the validate method of Great Expectations.

Returns

FeatureGroupValidation, ValidationReport. The feature group validation metadata object, as well as the Validation Report produced by Great Expectations.
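The `validation_options` keys listed above can be assembled as in the following sketch. The helper names `build_validation_options` and `validate_dataframe` are hypothetical and not part of hsfs, and `feature_group` is assumed to be an existing FeatureGroup handle. Only the documented `validate` arguments and option keys are used.

```python
from typing import Any, Dict, Optional


def build_validation_options(
    run_validation: bool = True,
    save_report: bool = True,
    ge_validate_kwargs: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Assemble a validation_options dict from the documented keys.

    Hypothetical helper; the default values here are choices made for
    this sketch, not defaults confirmed by the library.
    """
    options: Dict[str, Any] = {
        "run_validation": run_validation,
        "save_report": save_report,
    }
    if ge_validate_kwargs is not None:
        # Copy so that later changes by the caller do not leak in.
        options["ge_validate_kwargs"] = dict(ge_validate_kwargs)
    return options


def validate_dataframe(feature_group: Any, dataframe: Any, **option_kwargs: Any):
    """Call FeatureGroup.validate with a freshly built options dict."""
    return feature_group.validate(
        dataframe=dataframe,
        validation_options=build_validation_options(**option_kwargs),
    )
```

For example, `validate_dataframe(fg, df, save_report=False)` would run validation on `df` without uploading the report, assuming `fg` is a feature group with expectations attached.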


Retrieval#

[source]

get_validations#

FeatureGroup.get_validations(validation_time=None, commit_time=None)

Get feature group data validation results based on the attached expectations.

Arguments

  • validation_time: The data validation time, when the data validation started.
  • commit_time: The commit time of a time travel enabled feature group.

Returns

FeatureGroupValidation. The feature group validation metadata object.
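A minimal sketch of retrieving validation results, assuming `feature_group` is an existing FeatureGroup handle. The helper name `fetch_validations` is hypothetical. The expected format of `validation_time` and `commit_time` is not specified here, so the sketch passes both values through untouched.

```python
from typing import Any, Dict


def fetch_validations(
    feature_group: Any,
    validation_time: Any = None,
    commit_time: Any = None,
):
    """Call FeatureGroup.get_validations, forwarding only supplied filters.

    Hypothetical helper: time values are treated as opaque and handed
    to the library unchanged.
    """
    filters: Dict[str, Any] = {}
    if validation_time is not None:
        filters["validation_time"] = validation_time
    if commit_time is not None:
        filters["commit_time"] = commit_time
    return feature_group.get_validations(**filters)
```

Calling `fetch_validations(fg)` with no filters is equivalent to calling `fg.get_validations()` with both arguments left at their default of None.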