I have DataFrame in Python Pandas like below:
data types:
ID - int
TYPE - object
TG_A - int
TG_B - int
ID TYPE TG_A TG_B 111 A 1 0 111 B 1 0 222 B 1 0 222 A 1 0 333 B 0 1 333 A 0 1
And I need to drop duplicates in above DataFrame, so as to:
- If value in ID in my DF is duplicated -> drop rows where TYPE = B and TG_A = 1 or TYPE = A and TG_B = 1
So, as a result I need something like below:
ID | TYPE | TG_A | TG_B
----|------|------|-----
111 | A | 1 | 0
222 | A | 1 | 0
333 | B | 0 | 1
How can I do that in Python Pandas ?
You can use two boolean masks and
groupby.idxmax
to get the first non matching value:Output: