Python ffill and example

99 Views Asked by At

I have below dataframe, and as you notice EffectiveDate is beginning of quarter and sometime values in EffectiveDate are missing. I would like to have a continuous monthly dates in EffectiveDate column with all column filled from most recent EffectiveDate.

Example: For each Group = A, Since values for EffectiveDate = 2/1/2022 & 3/1/2022 are missing so need to fill all the column values from 1/1/2022, and so on..

Input DF:

import pandas as pd
    
data = {
        'Group': ['A'] * 24,
        'EffectiveDate': [
            '1/1/2022', '1/1/2022', '1/1/2022', '1/1/2022', '1/1/2022', '1/1/2022',
            '4/1/2022', '4/1/2022', '4/1/2022', '4/1/2022', '4/1/2022', '4/1/2022',
            '7/1/2022', '7/1/2022', '7/1/2022', '7/1/2022', '7/1/2022', '7/1/2022',
            '10/1/2022', '10/1/2022', '10/1/2022', '10/1/2022', '10/1/2022', '10/1/2022'
        ],
        'ForecastDate': [
            '1/1/2022', '2/1/2022', '3/1/2022', '4/1/2022', '5/1/2022', '6/1/2022',
            '4/1/2022', '5/1/2022', '6/1/2022', '7/1/2022', '8/1/2022', '9/1/2022',
            '7/1/2022', '8/1/2022', '9/1/2022', '10/1/2022', '11/1/2022', '12/1/2022',
            '10/1/2022', '11/1/2022', '12/1/2022', '1/1/2023', '2/1/2023', '3/1/2023'
        ],
        'SKU': ['ABC12'] * 24,
        'Source': ['fdhh'] * 24
    }
    
df = pd.DataFrame(data)

Output DF:


    | Group |EffectiveDate| ForecastDate|   SKU| Source |
    |-------|------------|------------|--------|--------|
    | A     | 1/1/2022   | 1/1/2022   | ABC12  | fdhh   |
    | A     | 1/1/2022   | 2/1/2022   | ABC12  | fdhh   |
    | A     | 1/1/2022   | 3/1/2022   | ABC12  | fdhh   |
    | A     | 1/1/2022   | 4/1/2022   | ABC12  | fdhh   |
    | A     | 1/1/2022   | 5/1/2022   | ABC12  | fdhh   |
    | A     | 1/1/2022   | 6/1/2022   | ABC12  | fdhh   |
    | A     | 2/1/2022   | 2/1/2022   | ABC12  | fdhh   |
    | A     | 2/1/2022   | 3/1/2022   | ABC12  | fdhh   |
    | A     | 2/1/2022   | 4/1/2022   | ABC12  | fdhh   |
    | A     | 2/1/2022   | 5/1/2022   | ABC12  | fdhh   |
    | A     | 2/1/2022   | 6/1/2022   | ABC12  | fdhh   |
    | A     | 3/1/2022   | 3/1/2022   | ABC12  | fdhh   |
    | A     | 3/1/2022   | 4/1/2022   | ABC12  | fdhh   |
    | A     | 3/1/2022   | 5/1/2022   | ABC12  | fdhh   |
    | A     | 3/1/2022   | 6/1/2022   | ABC12  | fdhh   |
    | A     | 4/1/2022   | 4/1/2022   | ABC12  | fdhh   |
    | A     | 4/1/2022   | 5/1/2022   | ABC12  | fdhh   |
    | A     | 4/1/2022   | 6/1/2022   | ABC12  | fdhh   |
    | A     | 4/1/2022   | 7/1/2022   | ABC12  | fdhh   |
    | A     | 4/1/2022   | 8/1/2022   | ABC12  | fdhh   |
    | A     | 4/1/2022   | 9/1/2022   | ABC12  | fdhh   |
    | A     | 5/1/2022   | 5/1/2022   | ABC12  | fdhh   |
    | A     | 5/1/2022   | 6/1/2022   | ABC12  | fdhh   |
    | A     | 5/1/2022   | 7/1/2022   | ABC12  | fdhh   |
    | A     | 5/1/2022   | 8/1/2022   | ABC12  | fdhh   |
    | A     | 5/1/2022   | 9/1/2022   | ABC12  | fdhh   |
    | A     | 6/1/2022   | 6/1/2022   | ABC12  | fdhh   |
    | A     | 6/1/2022   | 7/1/2022   | ABC12  | fdhh   |
    | A     | 6/1/2022   | 8/1/2022   | ABC12  | fdhh   |
    | A     | 6/1/2022   | 9/1/2022   | ABC12  | fdhh   |
    | A     | 7/1/2022   | 7/1/2022   | ABC12  | fdhh   |
    | A     | 7/1/2022   | 8/1/2022   | ABC12  | fdhh   |
    | A     | 7/1/2022   | 9/1/2022   | ABC12  | fdhh   |
    | A     | 7/1/2022   | 10/1/2022  | ABC12  | fdhh   |
    | A     | 7/1/2022   | 11/1/2022  | ABC12  | fdhh   |
    | A     | 7/1/2022   | 12/1/2022  | ABC12  | fdhh   |
    | A     | 8/1/2022   | 8/1/2022   | ABC12  | fdhh   |
    | A     | 8/1/2022   | 9/1/2022   | ABC12  | fdhh   |
    | A     | 8/1/2022   | 10/1/2022  | ABC12  | fdhh   |
    | A     | 8/1/2022   | 11/1/2022  | ABC12  | fdhh   |
    | A     | 8/1/2022   | 12/1/2022  | ABC12  | fdhh   |
    | A     | 9/1/2022   | 9/1/2022   | ABC12  | fdhh   |
    | A     | 9/1/2022   | 10/1/2022  | ABC12  | fdhh   |
    | A     | 9/1/2022   | 11/1/2022  | ABC12  | fdhh   |
    | A     | 9/1/2022   | 12/1/2022  | ABC12  | fdhh   |
    | A     | 10/1/2022  | 10/1/2022  | ABC12  | fdhh   |
    | A     | 10/1/2022  | 11/1/2022  | ABC12  | fdhh   |
    | A     | 10/1/2022  | 12/1/2022  | ABC12  | fdhh   |
    | A     | 10/1/2022  | 1/1/2023   | ABC12  | fdhh   |
    | A     | 10/1/2022  | 2/1/2023   | ABC12  | fdhh   |
    | A     | 10/1/2022  | 3/1/2023   | ABC12  | fdhh   |
    | A     | 11/1/2022  | 11/1/2022  | ABC12  | fdhh   |
    | A     | 11/1/2022  | 12/1/2022  | ABC12  | fdhh   |
    | A     | 11/1/2022  | 1/1/2023   | ABC12  | fdhh   |
    | A     | 11/1/2022  | 2/1/2023   | ABC12  | fdhh   |
    | A     | 11/1/2022  | 3/1/2023   | ABC12  | fdhh   |
    | A     | 12/1/2022  | 12/1/2022  | ABC12  | fdhh   |
    | A     | 12/1/2022  | 1/1/2023   | ABC12  | fdhh   |
    | A     | 12/1/2022  | 2/1/2023   | ABC12  | fdhh   |
    | A     | 12/1/2022  | 3/1/2023   | ABC12  | fdhh   |
1

There are 1 best solutions below

4
jezrael On BEST ANSWER

You can convert EffectiveDate to month periods by Serie.dt.to_period, rehsape by GroupBy.cumcount and DataFrame.pivot, so possible use DataFrame.reindex per groups. Then reshape back by DataFrame.stack, convert MultiIndex to columns and filter ForecastDate if greater or equal by EffectiveDate:

df['EffectiveDate'] = pd.to_datetime(df['EffectiveDate']).dt.to_period('m')
df['ForecastDate'] = pd.to_datetime(df['ForecastDate'])

out = (df.assign(g = df.groupby(['Group','EffectiveDate']).cumcount())
        .pivot(index=['Group','EffectiveDate'], columns='g')
        .reset_index(level=0)
        .groupby('Group')
        .apply(lambda x: x.reindex(pd.period_range(x.index.min(),
                                                   pd.Period(f'{x.index.max().year}-12')), 
                                   method='ffill'))
        .drop('Group', axis=1)
        .stack()
        .droplevel(2)
        .rename_axis(['Group','EffectiveDate'])
        .reset_index()
        .assign(EffectiveDate = lambda x: x['EffectiveDate'].dt.to_timestamp())
        .query('ForecastDate >= EffectiveDate'))

print (out.tail(15))
   Group EffectiveDate ForecastDate    SKU Source
54     A    2022-10-01   2022-10-01  ABC12   fdhh
55     A    2022-10-01   2022-11-01  ABC12   fdhh
56     A    2022-10-01   2022-12-01  ABC12   fdhh
57     A    2022-10-01   2023-01-01  ABC12   fdhh
58     A    2022-10-01   2023-02-01  ABC12   fdhh
59     A    2022-10-01   2023-03-01  ABC12   fdhh
61     A    2022-11-01   2022-11-01  ABC12   fdhh
62     A    2022-11-01   2022-12-01  ABC12   fdhh
63     A    2022-11-01   2023-01-01  ABC12   fdhh
64     A    2022-11-01   2023-02-01  ABC12   fdhh
65     A    2022-11-01   2023-03-01  ABC12   fdhh
68     A    2022-12-01   2022-12-01  ABC12   fdhh
69     A    2022-12-01   2023-01-01  ABC12   fdhh
70     A    2022-12-01   2023-02-01  ABC12   fdhh
71     A    2022-12-01   2023-03-01  ABC12   fdhh

EDIT: For dynamic filtering by Group column use helper dictionary mapping with Series.map in last step for filtering:

df['EffectiveDate'] = pd.to_datetime(df['EffectiveDate']).dt.to_period('m')
df['ForecastDate'] = pd.to_datetime(df['ForecastDate'])

mapping = {'A': '2022-10-01',
           'B':'2022-06-01'}

out = (df.assign(g = df.groupby(['Group','EffectiveDate']).cumcount())
        .pivot(index=['Group','EffectiveDate'], columns='g')
        .reset_index(level=0)
        .groupby('Group')
        .apply(lambda x: x.reindex(pd.period_range(x.index.min(),
                                                   pd.Period(f'{x.index.max().year}-12')), 
                                   method='ffill'))
        .drop('Group', axis=1)
        .stack()
        .droplevel(2)
        .rename_axis(['Group','EffectiveDate'])
        .reset_index()
        .assign(EffectiveDate = lambda x: x['EffectiveDate'].dt.to_timestamp())
        .loc[lambda x: (x['ForecastDate']  >= x['EffectiveDate']) &  
                       (x['EffectiveDate'] < x['Group'].map(mapping))]
            )

print(out.tail(10))
   Group EffectiveDate ForecastDate    SKU Source
41     A    2022-07-01   2022-12-01  ABC12   fdhh
43     A    2022-08-01   2022-08-01  ABC12   fdhh
44     A    2022-08-01   2022-09-01  ABC12   fdhh
45     A    2022-08-01   2022-10-01  ABC12   fdhh
46     A    2022-08-01   2022-11-01  ABC12   fdhh
47     A    2022-08-01   2022-12-01  ABC12   fdhh
50     A    2022-09-01   2022-09-01  ABC12   fdhh
51     A    2022-09-01   2022-10-01  ABC12   fdhh
52     A    2022-09-01   2022-11-01  ABC12   fdhh
53     A    2022-09-01   2022-12-01  ABC12   fdhh