Fill pandas column forward iteratively, but without using iteration?


I have a pandas DataFrame with a column "Valid" whose value is set based on other elements in the frame (not shown). Additionally, I have a column "ExtendsValid" that extends validity one row further with the following rule:

If a valid row is directly followed by a row with ExtendsValid = 1, that row is also valid, even if the underlying valid condition doesn't apply. Validity keeps propagating forward as long as ExtendsValid is 1.

I have illustrated the desired result in the column "FinalValid" (it doesn't have to be a new column; filling "Valid" forward would also work). Note that rows 8 and 9 in the example also become valid. Also note that row 13 does NOT result in FinalValid, because a preceding valid row is required; that preceding row can be valid either directly or through extension.

So far if I had a problem like that I did a cumbersome multi-step process:

  1. Create a new column that is true when "Valid" or "ExtendsValid" is true
  2. Create a new column marking the start point of each "sub-series" (a consecutive run of ones)
  3. Number each sub-series
  4. Fill forward (fillna) within each sub-series using groupby
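For reference, the steps above can be sketched roughly as follows (the data is copied from the example table below; using `transform("first")` in place of the fillna step, so the details are illustrative rather than my exact code):

```python
import pandas as pd

df = pd.DataFrame({
    "Valid":        [0, 1, 0, 0, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0],
    "ExtendsValid": [0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 0],
})

# 1. Mark rows where either flag is set
either = df["Valid"] | df["ExtendsValid"]
# 2./3. Number each consecutive run: a new group starts whenever
#       `either` changes value compared to the previous row
group = either.ne(either.shift()).cumsum()
# 4. A run of ones counts only if it starts on a Valid row
starts_valid = df["Valid"].groupby(group).transform("first")
df["FinalValid"] = (either & starts_valid).astype(int)
```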

I can provide sample code, but I am really looking for a completely different, more efficient approach, which of course must also be non-iterating.

Any ideas would be welcome.

 #  Valid  ExtendsValid  FinalValid
 1      0             0           0
 2      1             0           1
 3      0             0           0
 4      0             0           0
 5      1             0           1
 6      0             0           0
 7      1             0           1
 8      0             1           1
 9      0             1           1
10      0             0           0
11      1             0           1
12      0             0           0
13      0             1           0
14      0             0           0

There are 2 answers below.

BEST ANSWER

IIUC, you want to forward-fill the 1s only where there is an uninterrupted run of 1s that starts on Valid and then continues on ExtendsValid.

For this you can use a groupby.cummin:

df['FinalValid'] = (
    (df['Valid'] | df['ExtendsValid'])
    .groupby(df['Valid'].cumsum())
    .cummin()
)

Output:

NB. I slightly modified the input on row 3 to better illustrate the logic.

     #  Valid  ExtendsValid  FinalValid
0    1      0             0           0
1    2      1             0           1
2    3      0             0           0
3    4      0             1           0
4    5      1             0           1
5    6      0             0           0
6    7      1             0           1
7    8      0             1           1
8    9      0             1           1
9   10      0             0           0
10  11      1             0           1
11  12      0             0           0
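For completeness, applying the same one-liner to the unmodified input from the question reproduces the desired FinalValid column:

```python
import pandas as pd

df = pd.DataFrame({
    "Valid":        [0, 1, 0, 0, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0],
    "ExtendsValid": [0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 0],
})

# Each group starts at a Valid row; within a group, cummin keeps
# the run of 1s alive until the first row that is neither Valid
# nor ExtendsValid, and stays 0 afterwards.
df["FinalValid"] = (
    (df["Valid"] | df["ExtendsValid"])
    .groupby(df["Valid"].cumsum())
    .cummin()
)
```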

Try:

state, data = 0, []
for v, e in zip(df.Valid, df.ExtendsValid):
    # A row is valid if Valid is set, or if it extends a valid run
    state = int(v or (e and state))
    data.append(state)

df["FinalValid_new"] = data
print(df)

Prints:

     #  Valid  ExtendsValid  FinalValid  FinalValid_new
0    1      0             0           0               0
1    2      1             0           1               1
2    3      0             0           0               0
3    4      0             0           0               0
4    5      1             0           1               1
5    6      0             0           0               0
6    7      1             0           1               1
7    8      0             1           1               1
8    9      0             1           1               1
9   10      0             0           0               0
10  11      1             0           1               1
11  12      0             0           0               0
12  13      0             1           0               0
13  14      0             0           0               0

Note: For more performance I recommend looking at the numba package.


EDIT: Example using numba:

import numpy as np
from numba import njit


@njit
def compute_valid(valid, extends_valid, out):
    state = 0
    for i in range(len(valid)):
        v, e = valid[i], extends_valid[i]
        # A row is valid if Valid is set, or if it extends a valid run
        state = 1 if v != 0 else (1 if (e != 0 and state == 1) else 0)
        out[i] = state


# Write into a fresh array rather than a view returned by
# Series.to_numpy(), which may be a (read-only) copy.
out = np.zeros(len(df), dtype=np.int64)
compute_valid(df.Valid.to_numpy(), df.ExtendsValid.to_numpy(), out)
df["FinalValid_new"] = out
print(df)