I have the following problem: my dataframes look something like this:
ID Date Value
1 2016-06-12 2
1 2016-06-13 2.5
1 2016-06-16 4
2 2016-06-12 3
2 2016-06-15 1.5
As you can see, some days are missing from my data. I would much rather have something like this:
ID Date Value
1 2016-06-12 2
1 2016-06-13 2.5
1 2016-06-14 NaN
1 2016-06-15 NaN
1 2016-06-16 4
2 2016-06-12 3
2 2016-06-13 NaN
2 2016-06-14 NaN
2 2016-06-15 1.5
In order to solve that I did the following:
df_new = df.groupby('ID').apply(lambda x: x.set_index('Date').resample('1D').first())
This solution works, but it takes about half an hour to process a large dataset. Is there a better solution?
The first idea is to create all possible combinations of ID and Date values and then merge with a left join.
Alternatively, use DataFrame.reindex, although its performance may be worse, depending on the data.
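Both ideas can be sketched roughly as follows. This is only a minimal illustration, not necessarily the code the answer originally had in mind. It assumes that Date is already a datetime column and that each (ID, Date) pair occurs at most once; the helper names per_id_daily_grid, fill_with_merge and fill_with_reindex are made up for this example. The grid is built per ID, from that ID's first date to its last date, so that the result matches the desired output in the question.

```python
import pandas as pd


def per_id_daily_grid(df):
    """Hypothetical helper: one row per (ID, day) between each ID's first and last Date."""
    bounds = df.groupby("ID")["Date"].agg(["min", "max"])
    frames = [
        pd.DataFrame({"ID": id_, "Date": pd.date_range(first, last, freq="D")})
        for id_, first, last in bounds.itertuples()
    ]
    return pd.concat(frames, ignore_index=True)


def fill_with_merge(df):
    # Left join: keep every (ID, day) of the grid, Value is NaN where no row matched.
    return per_id_daily_grid(df).merge(df, on=["ID", "Date"], how="left")


def fill_with_reindex(df):
    # Same grid, used as a MultiIndex; assumes (ID, Date) pairs in df are unique.
    full_index = pd.MultiIndex.from_frame(per_id_daily_grid(df))
    return df.set_index(["ID", "Date"]).reindex(full_index).reset_index()


# Sample data from the question.
df = pd.DataFrame({
    "ID": [1, 1, 1, 2, 2],
    "Date": pd.to_datetime(
        ["2016-06-12", "2016-06-13", "2016-06-16", "2016-06-12", "2016-06-15"]
    ),
    "Value": [2, 2.5, 4, 3, 1.5],
})

result = fill_with_merge(df)
print(result)
```

Which variant is faster probably depends on the number of IDs and the length of the date ranges, so it is worth timing both on a sample of the real data.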