I have the following dataframe.
ID  path1  path2  path3
1   12     NaN    NaN
1   1      5      NaN
1   2      NaN    ''
1   2      4      111
2   123    NaN    NaN
3   11     25     NaN
3   1      NaN    NaN
3   21     34     NaN
3   NaN    NaN    NaN
I want to filter the rows within each ID based on how many NaN values they contain. The first priority is a row where path1, path2 and path3 all have values: keep it. Otherwise, check which rows have the most non-NaN values and keep those.
What is the best way to get the dataframe below as the result?
ID  path1  path2  path3
1   2      4      111
2   123    NaN    NaN
3   11     25     NaN
3   21     34     NaN
Group the dataframe by ID, then apply a function to each group that keeps all the rows with the maximum number of non-NaN values.
Here is a slightly modified version: the code above creates the mask and calls sum twice, which can be avoided by using a regular function rather than a lambda function:
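A minimal sketch of what that could look like, assuming pandas and the column names from the question. The helper name keep_most_complete is made up for illustration, and the '' in path3 is assumed to be an empty string, which counts as non-NaN. The groups are collected with pd.concat instead of groupby.apply, because the way apply treats the grouping column has changed between pandas versions.

```python
import numpy as np
import pandas as pd

# Sample data from the question; '' is assumed to be an empty string.
df = pd.DataFrame({
    "ID":    [1, 1, 1, 1, 2, 3, 3, 3, 3],
    "path1": [12, 1, 2, 2, 123, 11, 1, 21, np.nan],
    "path2": [np.nan, 5, np.nan, 4, np.nan, 25, np.nan, 34, np.nan],
    "path3": [np.nan, np.nan, "", 111, np.nan, np.nan, np.nan, np.nan, np.nan],
})

PATH_COLS = ["path1", "path2", "path3"]


def keep_most_complete(group):
    """Keep the rows of one ID group that have the most non-NaN path values."""
    # Count non-NaN values per row once, then reuse the counts for the mask.
    counts = group[PATH_COLS].notna().sum(axis=1)
    return group[counts == counts.max()]


# Apply the function to every ID group and stitch the pieces back together.
result = pd.concat(keep_most_complete(group) for _, group in df.groupby("ID"))
print(result)
```

With the sample data, the rows at index 3, 4, 5 and 7 remain, which matches the expected result in the question. Ties within an ID are all kept, so ID 3 contributes two rows.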