Python Great Expectations conditional logic with PySpark


I am trying to test a few data-validation rules on my Spark DataFrame (running Great Expectations 0.18.9). I want to add conditional logic, such as verifying that colA is NULL when colB is also NULL. I am following the syntax documented here: https://docs.greatexpectations.io/docs/reference/learn/expectations/conditional_expectations/ This is what I am trying:

    from great_expectations.core.expectation_suite import ExpectationSuite
    from great_expectations.dataset import SparkDFDataset

    expectations_json = {
        "expectation_suite_name": "name",
        "expectations": [
            {
                "expectation_type": "expect_column_values_to_not_be_null",
                "kwargs": {
                    "column": "colA",
                    "row_condition": 'col("colB").isNull()',
                    "condition_parser": "great_expectations__experimental__",
                },
            }
        ],
    }

    geDF = SparkDFDataset(df)
    expectation_suite = ExpectationSuite(**expectations_json)
    dq_json_result = geDF.validate(expectation_suite)
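For illustration only (this is not the Great Expectations API; the function and sample rows below are hypothetical), the rule the suite encodes can be sketched in plain Python: among rows where colB is null, colA must not be null.

```python
def conditional_not_null_violations(rows, column, condition_column):
    """Rows that violate: `column` must be non-null wherever `condition_column` is null."""
    return [
        row for row in rows
        if row.get(condition_column) is None and row.get(column) is None
    ]

sample = [
    {"colA": 1, "colB": None},     # colB is null, colA present -> passes
    {"colA": None, "colB": None},  # colB is null, colA null -> violation
    {"colA": None, "colB": 2},     # colB not null -> condition does not apply
]

violations = conditional_not_null_violations(sample, "colA", "colB")
```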

I am running this against a Spark DataFrame. The code itself doesn't raise an error, but the returned dq_json_result contains the following exception:

"exception_info": {
        "raised_exception": true,
        "exception_message": "TypeError: SparkDFDataset.expect_column_values_to_not_be_null() got an unexpected keyword argument 'row_condition'"
}
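Note that the exception is reported per-result inside the returned dict rather than raised, which is why the call itself appears to succeed. A minimal, hypothetical sketch of surfacing such exceptions from a result shaped like the one above:

```python
# A dict shaped like the validation output above (hypothetical sample data).
dq_json_result = {
    "results": [
        {
            "exception_info": {
                "raised_exception": True,
                "exception_message": (
                    "TypeError: SparkDFDataset.expect_column_values_to_not_be_null() "
                    "got an unexpected keyword argument 'row_condition'"
                ),
            }
        }
    ]
}

# Collect the message of every expectation that raised internally.
errors = [
    r["exception_info"]["exception_message"]
    for r in dq_json_result["results"]
    if r["exception_info"].get("raised_exception")
]
```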

I would appreciate any leads.
