Starting from a DataFrame with a date and user column, I'd like to add a third count_past_5_days column to indicate the rolling count of occurrences of each row's user during the past 5 days:
date | user | count_past_5_days |
---|---|---|
2020-01-01 | abc | 1 |
2020-01-01 | def | 1 |
2020-01-02 | abc | 2 |
2020-01-03 | abc | 3 |
2020-01-04 | abc | 4 |
2020-01-04 | def | 2 |
2020-01-04 | ghi | 1 |
2020-01-05 | abc | 5 |
2020-01-06 | abc | 5 |
2020-01-07 | abc | 5 |
I've tried the following:
df.set_index('date').rolling('5D')['user'].count()
But this gets the total count over the past five rolling days across all users, not just for the specific user of the current row. How can I get this rolling count for each row's specific user only?
Try this: you can chain rolling on groupby.
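A minimal sketch of chaining rolling on groupby, assuming the date column is (or can be converted to) a datetime dtype. The helper column n and the intermediate names ordered and rolled are made up for illustration. Summing a column of ones is used here instead of count() on the string user column, since rolling operations on non-numeric columns can behave differently across pandas versions.

```python
import pandas as pd

# Sample data taken from the table in the question.
df = pd.DataFrame({
    "date": pd.to_datetime([
        "2020-01-01", "2020-01-01", "2020-01-02", "2020-01-03", "2020-01-04",
        "2020-01-04", "2020-01-04", "2020-01-05", "2020-01-06", "2020-01-07",
    ]),
    "user": ["abc", "def", "abc", "abc", "abc", "def", "ghi", "abc", "abc", "abc"],
})

# Sort by user, then date, so the rows line up with the grouped result,
# which comes back ordered group by group.
ordered = df.sort_values(["user", "date"])

# Helper column of ones ("n" is a hypothetical name): a rolling sum over it
# counts the rows inside each user's trailing 5-day window.
rolled = (
    ordered.assign(n=1)
    .set_index("date")      # time-based windows such as "5D" need a datetime index
    .groupby("user")["n"]
    .rolling("5D")
    .sum()
)

# Assign positionally (both are in user/date order), then restore the original row order.
ordered["count_past_5_days"] = rolled.to_numpy().astype(int)
df = ordered.sort_index()
print(df)
```

On the sample data this should reproduce the count_past_5_days column from the table in the question. Assigning through to_numpy() avoids relying on the shape of the MultiIndex that groupby plus rolling returns.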