Validation#
ValidationResult#
hsfs.validation_result.ValidationResult(
status,
message,
value,
features,
rule,
href=None,
expand=None,
items=None,
count=None,
type=None,
)
Metadata object representing the validation result of a single rule of an expectation result of a Feature Group.
Properties#
features#
Feature of the validation result on which the rule was applied.
message#
Message describing the outcome of applying the rule against the feature.
rule#
Rule that was applied to the feature of the validation result.
status#
value#
The computed value of the feature according to the rule.
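As a rough illustration of how these properties might be read together, the sketch below collects them into plain dictionaries for logging or display. The helper name summarize_results is hypothetical and not part of hsfs. It only assumes that each result exposes the five properties listed above.

```python
from typing import Any, Dict, Iterable, List


def summarize_results(results: Iterable[Any]) -> List[Dict[str, Any]]:
    """Collect the documented ValidationResult properties into plain dicts.

    `results` is assumed to be an iterable of objects exposing the
    `features`, `rule`, `status`, `value` and `message` properties.
    This helper is illustrative only and is not part of hsfs.
    """
    summary = []
    for result in results:
        summary.append(
            {
                "features": result.features,
                "rule": result.rule,
                "status": result.status,
                "value": result.value,
                "message": result.message,
            }
        )
    return summary
```

The values are passed through unchanged, so the sketch makes no assumption about which status strings or value types the backend returns.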
Methods#
Validate a dataframe#
validate#
FeatureGroup.validate(
dataframe=None, log_activity=False, save_report=False, validation_options={}
)
Run validation based on the attached expectations.
Runs any expectations attached with Deequ, as well as any attached Great Expectations suites.
Arguments
- dataframe Optional[Union[pandas.DataFrame, pyspark.sql.DataFrame]]: The pandas or PySpark dataframe to run the data validation expectations against.
- log_activity Optional[bool]: Whether to persist validation results along with the feature group. Defaults to False.
- expectation_suite: Optionally provide an Expectation Suite to override the one that is possibly attached to the feature group. This is useful for testing new Expectation Suites. When an extra suite is provided, the results will never be persisted. Defaults to None.
- validation_options Optional[Dict[Any, Any]]: Additional validation options as key-value pairs. Defaults to {}.
  - key run_validation: boolean value, set to False to skip validation temporarily on ingestion.
  - key save_report: boolean value, set to False to skip upload of the validation report to Hopsworks.
  - key ge_validate_kwargs: a dictionary containing kwargs for the validate method of Great Expectations.
Returns
FeatureGroupValidation, ValidationReport. The feature group validation metadata object, as well as the Validation Report produced by Great Expectations.
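A minimal sketch of a "dry run" call is shown below: it validates a dataframe without persisting results or uploading a report. The helpers build_validation_options and dry_run_validation are hypothetical names introduced for illustration. The helper's default values for the option keys are assumptions, and feature_group is assumed to be an existing FeatureGroup object obtained elsewhere.

```python
from typing import Any, Dict, Optional


def build_validation_options(
    run_validation: bool = True,
    save_report: bool = True,
    ge_validate_kwargs: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Assemble the documented validation_options keys into one dict.

    The defaults here are assumptions made for this sketch, not values
    confirmed by the library.
    """
    options: Dict[str, Any] = {
        "run_validation": run_validation,
        "save_report": save_report,
    }
    if ge_validate_kwargs is not None:
        # Copy so later changes by the caller do not affect the options.
        options["ge_validate_kwargs"] = dict(ge_validate_kwargs)
    return options


def dry_run_validation(feature_group: Any, dataframe: Any) -> Any:
    """Validate `dataframe` without logging activity or saving a report."""
    return feature_group.validate(
        dataframe=dataframe,
        log_activity=False,
        validation_options=build_validation_options(save_report=False),
    )
```

According to the Returns description above, the result should be the validation metadata object together with the Great Expectations report, so a caller would typically unpack it into two names.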
Retrieval#
get_validations#
FeatureGroup.get_validations(validation_time=None, commit_time=None)
Get feature group data validation results based on the attached expectations.
Arguments
- validation_time: The data validation time, when the data validation started.
- commit_time: The commit time of a time travel enabled feature group.
Returns
FeatureGroupValidation. The feature group validation metadata object.
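The retrieval call can be wrapped as in the sketch below, which forwards only the filters the caller actually supplies. The function name fetch_validations is hypothetical, and the accepted formats for validation_time and commit_time are not specified here, so the values are passed through unchanged.

```python
from typing import Any, Dict, Optional


def fetch_validations(
    feature_group: Any,
    validation_time: Optional[Any] = None,
    commit_time: Optional[Any] = None,
) -> Any:
    """Call get_validations with only the filters that were provided.

    `feature_group` is assumed to be an existing FeatureGroup object.
    The time arguments are forwarded as-is; their expected format is
    not assumed by this sketch.
    """
    kwargs: Dict[str, Any] = {}
    if validation_time is not None:
        kwargs["validation_time"] = validation_time
    if commit_time is not None:
        kwargs["commit_time"] = commit_time
    return feature_group.get_validations(**kwargs)
```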