Replace NaN with 0, starting after the first non-NaN value per row


I have a pandas DataFrame in which I need to replace NaN values with 0, but only from the first non-NaN numeric value onward in each row. Leading NaN values in a row must stay NaN.

This is close to the following post, except that the operation has to run per row rather than per column: How to populate NaN by 0, starting after first non-nan value

Example df

    name  day1  day2  day3  day4
0  Alice   NaN   5.0   NaN   6.0
1    Bob   NaN   NaN   3.0   2.0
2  Carol   3.0   NaN   NaN   4.0

Desired output

    name  day1  day2  day3  day4
0  Alice   NaN   5.0   0.0   6.0
1    Bob   NaN   NaN   3.0   2.0
2  Carol   3.0   0.0   0.0   4.0

I tried to adapt the method from the linked question, as well as some workarounds based on that post, but unfortunately didn't manage to get it working.

Thanks!

Answer by mozway (best answer)

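You can craft a boolean mask with notna + cummax along the rows (axis=1). The mask is True from the first non-NaN value onward in each row, so you can fill every NaN with 0 and then use where to restore NaN wherever the mask is False: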

cols = list(df.filter(like='day'))
# or
# cols = ['day1', 'day2', 'day3', 'day4']

df[cols] = df[cols].fillna(0).where(df[cols].notna().cummax(axis=1))

Output:

    name  day1  day2  day3  day4
0  Alice   NaN   5.0   0.0   6.0
1    Bob   NaN   NaN   3.0   2.0
2  Carol   3.0   0.0   0.0   4.0

Intermediate:

# df[cols].notna().cummax(axis=1)

    day1   day2  day3  day4
0  False   True  True  True
1  False  False  True  True
2   True   True  True  True
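
For anyone who wants to reproduce this end to end, here is a minimal, self-contained sketch of the mask approach. The DataFrame is rebuilt from the example in the question; the variable names `seen` and `result` are only illustrative and not part of the original answer.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question
df = pd.DataFrame({
    "name": ["Alice", "Bob", "Carol"],
    "day1": [np.nan, np.nan, 3.0],
    "day2": [5.0, np.nan, np.nan],
    "day3": [np.nan, 3.0, np.nan],
    "day4": [6.0, 2.0, 4.0],
})
cols = ["day1", "day2", "day3", "day4"]

# True from the first non-NaN value onward, evaluated row by row
seen = df[cols].notna().cummax(axis=1)

# Fill every NaN with 0, then put NaN back where nothing was seen yet
result = df.copy()
result[cols] = df[cols].fillna(0).where(seen)

print(result)
```

Working on a copy keeps the original `df` intact for comparison; assigning back to `df[cols]` as in the answer works the same way.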

Or, adapting the linked approach to work on rows while ignoring the non-day column (which I don't like as much as the above):

df[cols] = df[cols].fillna(df[cols].mask(df[cols].ffill(axis=1).notna(), 0), axis=1)
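
As a quick sanity check, the two expressions can be compared on the example data. This is only a sketch on the sample frame from the question; `via_mask` and `via_ffill` are illustrative names, and agreement here does not prove equivalence on every possible input.

```python
import numpy as np
import pandas as pd

# Only the day columns from the question's example are needed here
days = pd.DataFrame({
    "day1": [np.nan, np.nan, 3.0],
    "day2": [5.0, np.nan, np.nan],
    "day3": [np.nan, 3.0, np.nan],
    "day4": [6.0, 2.0, 4.0],
})

# First approach: fill everything, then blank out the leading positions
via_mask = days.fillna(0).where(days.notna().cummax(axis=1))

# Second approach: build a frame that is 0 wherever a value has already
# been seen in the row (NaN elsewhere) and use it as the fill values
via_ffill = days.fillna(days.mask(days.ffill(axis=1).notna(), 0), axis=1)

# DataFrame.equals treats NaN in matching positions as equal
print(via_mask.equals(via_ffill))
```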