Get next available date in Pandas filter by day

1.2k Views Asked by At

I have filtered the datetime64[ns] type in pandas dataframe to get data that falls on specific date of each month using the following line of code.

df[df['Date'].map(lambda x: x.day) == 1]

The output is as follows:

19.9    2013-07-01
34.8    2013-08-01
12.9    2013-10-01
12.6    2013-11-01

But if you notice the entry for 2013-09-01 is missing as it is not available in the original dataset. In such situation I want to get data for 2013-09-02. Ideally if a date falls on weekend (Saturday and Sunday or any missing date like holidays or data not available for specific date), I want to get the data for the next available date. Wondering if we can achieve using pandas or I need to manually iterate over perform this functionality.

2

There are 2 best solutions below

1
On BEST ANSWER

I think you need DatetimeIndex with asfreq and method='bfill' for back filling missing values:

df = df.set_index('Date').asfreq('d', method='bfill')

Then filter by DatetimeIndex.day:

df1 = df[df.index.day == 1]

Sample:

print (df)
    Val       Date
0  19.9 2013-07-01
1  34.8 2013-08-01
2  10.4 2013-09-02
3  12.9 2013-10-01
4  12.6 2013-11-01

print (df.dtypes)
Val            float64
Date    datetime64[ns]

df = df.set_index('Date').asfreq('d', method='bfill')
df1 = df[df.index.day == 1]
print (df1)
             Val
Date            
2013-07-01  19.9
2013-08-01  34.8
2013-09-01  10.4
2013-10-01  12.9
2013-11-01  12.6
0
On

You can also do so by setting the date as the index and searching for the next existing date to the first day of each month using index.get_loc() and set method to be bfill:

print(df)
             Val
Date            
2013-07-01  19.9
2013-08-01  34.8
2013-08-02  34.8
2013-09-02  10.4
2013-10-01  12.9
2013-11-01  12.6

df = df.set_index('Date')
df.iloc[[df.index.get_loc(datetime.datetime(date[0],date[1],1),
                   method='bfill') for date,_ in df.groupby(
                   [df.index.year,df.index.month])]]

             Val
Date            
2013-07-01  19.9
2013-08-01  34.8
2013-09-02  10.4
2013-10-01  12.9
2013-11-01  12.6