I have a DataFrame like this:
import pandas as pd
df = pd.DataFrame({'id':[1,2,3,4,5,6,7,8,9,10], 'name': ['mary','mary','mary','tom','tom','john','sarah','tom','tom','tom'], 'age': [30,30,30,25,25,28,36,25,25,25]})
id name age
0 1 mary 30
1 2 mary 30
2 3 mary 30
3 4 tom 25
4 5 tom 25
5 6 john 28
6 7 sarah 36
7 8 tom 25
8 9 tom 25
9 10 tom 25
There are multiple duplicated rows on it (not considering 'id' column).
I want to remove only the tail duplicated rows keeping the first one (not considering 'id' column).
I expect to get this:
id name age
0 1 mary 30
1 2 mary 30
2 3 mary 30
3 4 tom 25
4 5 tom 25
5 6 john 28
6 7 sarah 36
7 8 tom 25
I couldn't find a way to resolve it with drop_duplicates.
Code
out
Intermediate