I'm using Amazon Deequ in Spark, and I have a DataFrame 'df' with two columns of type 'Long' (or another numeric type). For all rows, I simply want to check that:
value(column1) lies between value(column2) - 20% and value(column2) + 20%
I'm not sure what check to put here:
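import com.amazon.deequ.{VerificationResult, VerificationSuite}
import com.amazon.deequ.checks.{Check, CheckLevel}

val verificationResult: VerificationResult = VerificationSuite()
  .onData(df)
  .addCheck(
    Check(CheckLevel.Error, "Review Check")
      // .functionToCheckThis()  // placeholder: which check goes here?
  )
  .run()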
Check has a method satisfies, which takes a column condition (a Spark SQL expression given as a string) as its parameter. To check whether column1 is between -20% and +20% of column2, you can use an expression like:
abs(column1 - column2) < 0.20 * column2
or
column1 between 0.80 * column2 and 1.20 * column2
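As a minimal sketch of how the second expression could be plugged into the check from the question: the sample rows, the local SparkSession setup, the object name, and the constraint name "column1 within 20% of column2" are illustrative assumptions, not part of the original post. The default assertion of satisfies requires the condition to hold for every row.

```scala
import com.amazon.deequ.{VerificationResult, VerificationSuite}
import com.amazon.deequ.checks.{Check, CheckLevel, CheckStatus}
import org.apache.spark.sql.SparkSession

object RangeCheckSketch {
  def main(args: Array[String]): Unit = {
    // Local session for illustration only; in a real job, reuse the existing one.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("deequ-range-check")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample data; the third row is outside the 20% band on purpose.
    val df = Seq((100L, 110L), (95L, 100L), (50L, 100L))
      .toDF("column1", "column2")

    val verificationResult: VerificationResult = VerificationSuite()
      .onData(df)
      .addCheck(
        Check(CheckLevel.Error, "Review Check")
          // First argument: Spark SQL condition evaluated per row.
          // Second argument: a free-form constraint name (assumed here).
          .satisfies(
            "column1 between 0.80 * column2 and 1.20 * column2",
            "column1 within 20% of column2"
          )
      )
      .run()

    // The overall status is Success only if every check passed.
    if (verificationResult.status == CheckStatus.Success) {
      println("All rows are within 20% of column2")
    } else {
      println("Some rows fall outside the 20% band")
    }

    spark.stop()
  }
}
```

Note that both expressions assume column2 is non-negative; for a negative column2 the bounds 0.80 * column2 and 1.20 * column2 swap order, so the condition would never hold. If some tolerance for failing rows is acceptable, satisfies also accepts an assertion on the fraction of compliant rows, for example _ >= 0.95.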