inequality test of two columns from same dataframe in pyspark

In Scala Spark we can filter rows where column A is not equal to column B of the same DataFrame with df.filter(col("A") =!= col("B")). How can we do the same in PySpark?

I have tried different options like df.filter(~(df["A"] == df["B"])) and the != operator, but got errors.

Take a look at this snippet:

df = spark.createDataFrame([(1, 2), (1, 1)], "id: int, val: int")
df.show()
+---+---+
| id|val|
+---+---+
|  1|  2|
|  1|  1|
+---+---+

from pyspark.sql.functions import col

# In PySpark the plain Python != operator builds the inequality Column
df.filter(col("id") != col("val")).show()
+---+---+
| id|val|
+---+---+
|  1|  2|
+---+---+