I couldn't find a way to build a dataframe that holds the difference of two dataframes based on a column. So basically:
dfA = ID, val
1, test
2, other test
dfB = ID, val
2, other test
I want to have a dfC that holds the difference dfA - dfB, based on column ID:
dfC = ID, val
1, test
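Merge the dataframes on ID. Note that the default merge is an inner join, which drops IDs missing from either side, so use a left (or outer) merge if rows of dfA without a match in dfB must survive.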
In the merged dataframe, name collisions are avoided using the suffixes _x and _y to denote the left and right source dataframes. So you'll (most likely) end up with val_x and val_y. Compare these columns however you want to, then use the result of the comparison as a mask to get the desired dfC in your question.
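The answer's original example did not survive, so here is a minimal sketch of the merge-and-mask idea rather than the answerer's exact code. It assumes a left merge (so IDs missing from dfB are kept with NaN in val_y) and assumes that "difference based on ID" means "rows of dfA whose ID does not appear in dfB"; the names merged and mask are illustrative.

```python
import pandas as pd

dfA = pd.DataFrame({"ID": [1, 2], "val": ["test", "other test"]})
dfB = pd.DataFrame({"ID": [2], "val": ["other test"]})

# Left merge keeps every row of dfA; the clashing "val" columns
# become val_x (from dfA) and val_y (from dfB).
merged = dfA.merge(dfB, on="ID", how="left")

# Assumption: an ID with no match in dfB shows up as NaN in val_y.
mask = merged["val_y"].isna()

# Apply the mask and restore the original column name.
dfC = (
    merged.loc[mask, ["ID", "val_x"]]
    .rename(columns={"val_x": "val"})
    .reset_index(drop=True)
)
print(dfC)
```

If rows with a matching ID but a different val should also count as part of the difference, a mask such as merged["val_x"] != merged["val_y"] could be used instead. The isna() check also assumes dfB's val column has no missing values of its own.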