How to filter a pandas dataframe by unique column values

48 Views Asked by claghorn At 07 October 2023 at 17:31

I have a pandas data frame of email addresses, and I want to keep only the unique emails within each row. I tried the code below, but it does not work: it returns the original data frame unchanged. The code below builds the original data frame and shows what I tried:

import pandas as pd

df = pd.DataFrame({
    'z': [1, 2, 3, 4],
    'a': ['[email protected]', '[email protected]', '[email protected]', '[email protected]'],
    'b': ['[email protected]', '[email protected]', '[email protected]', '[email protected]'],
    'c': ['[email protected]', '[email protected]', '[email protected]', '[email protected]'],
})
df.to_csv('../output/try.csv', index=False)

df = pd.read_csv('../output/try.csv')
df2 = df.drop_duplicates(subset=['a', 'b', 'c'])
df2.to_csv('../output/try2.csv', index=False)

I've seen solutions that work when the columns hold numbers, but my columns hold strings, and for some reason it does not work with email strings. The call df2 = df.drop_duplicates(subset=['a', 'b', 'c']) leaves the data frame unchanged.

There is 1 best solution below

Shubham Sharma On 07 October 2023 at 17:49 BEST ANSWER

DataFrame.drop_duplicates checks for duplicate rows: it compares the values of the subset columns from one row to the next, along the index axis. Here you need to find duplicates within each row instead, so you have to apply Series.drop_duplicates to each row, along the column axis.

cols = ['a', 'b', 'c']
df[cols] = df[cols].apply(pd.Series.drop_duplicates, axis=1)

   z                 a                 b                 c
0  1  [email protected]  [email protected]  [email protected]
1  2     [email protected]               NaN               NaN
2  3      [email protected]               NaN               NaN
3  4  [email protected]  [email protected]               NaN
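
To try both behaviours on data you can run, here is a minimal sketch. The addresses are made up for illustration (they are not the ones from the question), and the names by_rows and within_rows are hypothetical. The .reindex(columns=cols) step is an added safeguard, not part of the answer above: it is assumed to help when one of the columns ends up with no surviving values at all and would otherwise be missing from the result.

```python
import pandas as pd

# Hypothetical sample data: these addresses are invented for the example.
df = pd.DataFrame({
    'z': [1, 2, 3],
    'a': ['ann@example.com', 'bob@example.com', 'cat@example.com'],
    'b': ['amy@example.com', 'bob@example.com', 'cat@example.com'],
    'c': ['ann@example.com', 'bob@example.com', 'dan@example.com'],
})
cols = ['a', 'b', 'c']

# Compares whole rows: no two rows share the same (a, b, c) combination,
# so nothing is dropped and the frame comes back unchanged.
by_rows = df.drop_duplicates(subset=cols)

# Compares values within each row: every repeat after the first
# occurrence is dropped and shows up as NaN once the rows are realigned.
within_rows = df.copy()
within_rows[cols] = (
    df[cols]
    .apply(pd.Series.drop_duplicates, axis=1)
    .reindex(columns=cols)  # keep all three columns, in the original order
)

print(by_rows)
print(within_rows)
```

Note that each surviving value stays in its original column, so a row such as cat, cat, dan keeps 'a' and 'c' and gets NaN in 'b' rather than shifting values to the left.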
