I'm using Amazon Deequ in Spark, and I have a DataFrame 'df' with two columns of type 'Long' (or another numeric type). I simply want to check:
value(column1) lies between value(column2)-20% and value(column2)+20% for all rows
I'm not sure what check to put here:
val verificationResult: VerificationResult = VerificationSuite()
  .onData(df)
  .addCheck(
    Check(CheckLevel.Error, "Review Check")
      //.functionToCheckThis()
  )
  .run()
Check has a method satisfies which can take a column expression as its condition parameter. To check whether column1 is between -20% of column2 and +20% of column2, you can use an expression like |column1 - column2| < 0.20*column2 or column1 between 0.80*column2 and 1.20*column2.
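As a minimal sketch of how the satisfies approach might look end to end, assuming Deequ and Spark are on the classpath: the object name RangeCheckExample, the helper verify, the constraint name, and the sample rows are all made up for illustration. The example relies on satisfies being called with only a condition and a constraint name, in which case Deequ's default assertion should require the condition to hold for every row.

```scala
import com.amazon.deequ.{VerificationResult, VerificationSuite}
import com.amazon.deequ.checks.{Check, CheckLevel}
import org.apache.spark.sql.{DataFrame, SparkSession}

// Hypothetical wrapper object, not part of Deequ.
object RangeCheckExample {

  // Runs one check: every row must have column1 within 20% of column2.
  def verify(df: DataFrame): VerificationResult =
    VerificationSuite()
      .onData(df)
      .addCheck(
        Check(CheckLevel.Error, "Review Check")
          // First argument: a Spark SQL predicate evaluated per row.
          // Second argument: a free-form name for the constraint.
          // No assertion is passed, so the default (all rows must comply) applies.
          .satisfies(
            "column1 between 0.80 * column2 and 1.20 * column2",
            "column1 within 20% of column2"))
      .run()

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("deequ-range-check")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample data; the third row (50 vs 100) is outside the 20% band.
    val df = Seq((100L, 110L), (95L, 100L), (50L, 100L)).toDF("column1", "column2")

    val result = verify(df)
    println(s"Overall status: ${result.status}")

    // Print the outcome of each constraint, including any failure message.
    result.checkResults.values
      .flatMap(_.constraintResults)
      .foreach(r => println(s"${r.constraint}: ${r.status} ${r.message.getOrElse("")}"))

    spark.stop()
  }
}
```

One caveat on the choice of expression: the between form assumes column2 is non-negative, because for a negative column2 the lower bound 0.80*column2 is larger than the upper bound 1.20*column2. If negative values are possible, a condition such as abs(column1 - column2) <= 0.20 * abs(column2) is probably the safer variant.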