How to select a range of consecutive dates of a dataframe with many users in pandas

I have a dataframe with 19M rows covering about 10K customers and their daily consumption over different date ranges. I have resampled this data into weekly consumption, and the resulting dataframe has 2M rows. For each customer, I want to find the ranges of consecutive dates and select the longest one. Any ideas? Thank you!

There is 1 solution below

It would be great if you could post some example code, so the replies will be more specific.

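You probably want to do something like earliest = df.groupby('Customer_ID')['Consumption_date'].min() to get the earliest consumption date per customer, and latest = df.groupby('Customer_ID')['Consumption_date'].max() for the latest consumption date. Selecting the date column before aggregating avoids computing min and max over every other column. Then take the difference time_span = latest - earliest to get the time span per customer. Note that this measures the overall span from first to last date and does not check for gaps in between.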
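As a minimal sketch of the approach described above, assuming the column names Customer_ID and Consumption_date (the sample rows and the Consumption column are made up for illustration):

```python
import pandas as pd

# Hypothetical sample data; the real dataframe and column names may differ.
df = pd.DataFrame({
    "Customer_ID": ["A", "A", "A", "B", "B"],
    "Consumption_date": pd.to_datetime(
        ["2021-01-03", "2021-01-10", "2021-01-31", "2021-02-07", "2021-02-14"]
    ),
    "Consumption": [5.0, 3.0, 4.0, 2.0, 6.0],
})

# Select the date column first so only that column is aggregated.
dates = df.groupby("Customer_ID")["Consumption_date"]

# Named aggregation gives one row per customer with both endpoints.
summary = dates.agg(earliest="min", latest="max")

# Overall span from first to last date (gaps in between are ignored).
summary["time_span"] = summary["latest"] - summary["earliest"]
print(summary)
```

Doing both aggregations in one agg call means the grouping is only computed once, which may matter with millions of rows.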

Knowing the actual dataframe and column names would help make this more specific.
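If gaps matter, one possible way to find the longest run of consecutive weeks per customer is to label runs using the difference between successive dates. This is only a sketch under assumptions: the data is already weekly with exactly 7 days between consecutive rows, and the names weekly, run_id, start, end and weeks are hypothetical.

```python
import pandas as pd

# Hypothetical weekly data; the real dataframe and column names may differ.
weekly = pd.DataFrame({
    "Customer_ID": ["A", "A", "A", "A", "A", "B", "B"],
    "Consumption_date": pd.to_datetime([
        "2021-01-03", "2021-01-10", "2021-01-17",  # A: 3 consecutive weeks
        "2021-02-07", "2021-02-14",                # A: 2 consecutive weeks
        "2021-02-07", "2021-02-14",                # B: 2 consecutive weeks
    ]),
})
weekly = weekly.sort_values(["Customer_ID", "Consumption_date"])

# Gap to the same customer's previous row (NaT on each customer's first row).
gap = weekly.groupby("Customer_ID")["Consumption_date"].diff()

# A new run starts whenever the gap is not exactly one week.
weekly["run_id"] = (gap != pd.Timedelta(weeks=1)).cumsum()

# Summarize each run: first date, last date, number of weekly rows.
runs = (
    weekly.groupby(["Customer_ID", "run_id"])["Consumption_date"]
    .agg(start="min", end="max", weeks="count")
    .reset_index()
)

# Keep each customer's longest run (the first one wins on ties).
longest = runs.loc[runs.groupby("Customer_ID")["weeks"].idxmax()]
print(longest)
```

Everything here is vectorized, so it should scale to a couple of million rows without a Python-level loop over customers.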