Here is my data:
import numpy as np
import pandas as pd

times = pd.date_range(start=pd.Timestamp.now(), end=pd.Timestamp.now() + pd.Timedelta(minutes=1), periods=61)
data = np.arange(61)
df = pd.DataFrame({'times': times, 'data': data})
Output:
                           times  data
0  2024-03-20 10:38:44.100877000     0
1  2024-03-20 10:38:45.100877416     1
2  2024-03-20 10:38:46.100877833     2
3  2024-03-20 10:38:47.100878250     3
4  2024-03-20 10:38:48.100878666     4
..                           ...   ...
56 2024-03-20 10:39:40.100900333    56
57 2024-03-20 10:39:41.100900750    57
58 2024-03-20 10:39:42.100901166    58
59 2024-03-20 10:39:43.100901583    59
60 2024-03-20 10:39:44.100902000    60
If I want to group this with a rolling window of say 2 seconds I can do this:
df_windows = df.rolling(on='times', window=pd.Timedelta(seconds=2))
for window in df_windows:
    print(window)
Then I get this:
                            data
times
2024-03-20 10:48:09.273265     0
                               data
times
2024-03-20 10:48:09.273265000     0
2024-03-20 10:48:10.273265333     1
                               data
times
2024-03-20 10:48:10.273265333     1
2024-03-20 10:48:11.273265666     2
                               data
times
2024-03-20 10:48:11.273265666     2
2024-03-20 10:48:12.273266000     3
...
Cool. But if I don't want a window computed for every single row, pandas seems to lack a way to do that. A step parameter was added to rolling (https://github.com/pandas-dev/pandas/issues/15354), but it doesn't work in this case:
df_windows = df.rolling(on='times', window=pd.Timedelta(seconds=2), step=2)
NotImplementedError: step is not supported with frequency windows
It also wouldn't make much sense here: an integer step of 2 is not meaningful for a time-based window. The step should be a pd.Timedelta object, but the step argument has to be an integer.
So it seems the rolling function cannot do what I want. What workaround is there in pandas? I would like one that works with irregular data, i.e. one that does not rely on my timestamps being at some regular frequency. I can do something with groupby to get time groups, but I don't see a way to get overlapping windows using groupby.
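To make the goal concrete, here is a rough sketch of the behavior I'm after, written by hand rather than with any built-in pandas feature. The helper name time_stepped_windows and its parameters are made up for illustration. It assumes the frame is non-empty, sorted by the time column, and timezone-naive, and it uses right-closed windows (end - window, end] to mirror what rolling does by default:

```python
import numpy as np
import pandas as pd


def time_stepped_windows(df, on, window, step):
    """Yield (window_end, sub-frame) pairs for windows (end - window, end].

    Hypothetical helper, not a pandas API. Assumes `df` is non-empty,
    sorted by column `on`, and that `on` holds timezone-naive timestamps.
    The window end advances by `step` (a Timedelta), not row by row.
    """
    t = df[on].to_numpy()
    # Right edges: the first timestamp, then every `step` up to the last one.
    ends = pd.date_range(start=df[on].iloc[0], end=df[on].iloc[-1], freq=step)
    # Binary search for the row positions that bound each window.
    lo = np.searchsorted(t, (ends - window).to_numpy(), side="right")  # first row with t > end - window
    hi = np.searchsorted(t, ends.to_numpy(), side="right")  # one past the last row with t <= end
    for end, i, j in zip(ends, lo, hi):
        yield end, df.iloc[i:j]


# Irregularly spaced demo data (offsets in milliseconds from a fixed start).
demo_times = pd.Timestamp("2024-03-20 10:00:00") + pd.to_timedelta(
    [0, 500, 1700, 3000, 4200, 6000], unit="ms"
)
demo = pd.DataFrame({"times": demo_times, "data": np.arange(6)})

# 3-second windows that advance every 2 seconds, so consecutive windows overlap.
for end, win in time_stepped_windows(
    demo, on="times", window=pd.Timedelta(seconds=3), step=pd.Timedelta(seconds=2)
):
    print(end, win["data"].tolist())
```

This does what I want on irregular timestamps, since the window edges come from the clock rather than from the rows, but it bypasses rolling entirely, so I'd still prefer something built in if it exists.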