Alternating averaging times every row

65 Views Asked by At

This is probably a silly question that was already answered, but I'm struggling to find the answer I need. I'm looking to average a large dataset with 1-second times, switching between averaging every 10 minutes then every 20 minutes, then back to 10 minutes consecutively. I can use the pandas 'resample' function to get the full dataset as solely 10-minute or 20-minute averages, but I don't know how to approach alternating between the two.

Here's a sample of my data:

df= pd.DataFrame({'time': time, 'velocity': vel})

                     time           velocity
0     2023-08-29 00:00:10           0.133775
1     2023-08-29 00:00:11           0.152900
2     2023-08-29 00:00:12           0.114675
3     2023-08-29 00:00:13          -0.019125
4     2023-08-29 00:00:14          -0.076450
                  ...                ...
75537 2023-08-29 23:59:55          -0.057325
75538 2023-08-29 23:59:56          -0.095550
75539 2023-08-29 23:59:57          -0.038225
75540 2023-08-29 23:59:58          -0.038225
75541 2023-08-29 23:59:59          -0.057325

I would like to get this, averaging by 10 minutes, the 20, then 10, then 20 and so on, preferably using df.resample:

                     time           velocity
0     2023-08-29 00:00:00           0.003069
1     2023-08-29 00:00:10           0.000212
2     2023-08-29 00:00:30          -0.013170
3     2023-08-29 00:00:40          -0.015599
4     2023-08-29 00:01:00           0.010070

Any and all help is appreciated, and if this is a duplicate question, please direct me to that question because I'm not finding the examples I need. Thank you!

1

There are 1 best solutions below

0
On BEST ANSWER

Example

we need reproducible example to solve your problem. I will make a slightly different example for reproduciblilty, try to understand the logic based on it.

import pandas as pd
import numpy as np
time = pd.date_range('2023-01-01', periods=30, freq='15s')
df = pd.DataFrame({'time':time, 'velocity':np.random.randint(0, 10, 30)})

df.head(5)

    time                velocity
0   2023-01-01 00:00:00 7
1   2023-01-01 00:00:15 9
2   2023-01-01 00:00:30 8
3   2023-01-01 00:00:45 9
4   2023-01-01 00:01:00 0

Code

I will try to calculate the average for 1min and 2min. First, let's resample it to 1min for that.

tmp = df.resample('T', on='time')['velocity'].mean().reset_index()

tmp

    time                velocity
0   2023-01-01 00:00:00 8.25
1   2023-01-01 00:01:00 3.00
2   2023-01-01 00:02:00 1.75
3   2023-01-01 00:03:00 4.00
4   2023-01-01 00:04:00 5.25
5   2023-01-01 00:05:00 3.75
6   2023-01-01 00:06:00 6.00
7   2023-01-01 00:07:00 5.00

make tmp -> 1min & 2min

grp = (tmp.index % 3).isin([0, 1]).cumsum()
out = tmp.groupby(grp).agg({'time':'first', 'velocity':'mean'})

out:

    time                velocity
1   2023-01-01 00:00:00 8.250
2   2023-01-01 00:01:00 2.375
3   2023-01-01 00:03:00 4.000
4   2023-01-01 00:04:00 4.500
5   2023-01-01 00:06:00 6.000
6   2023-01-01 00:07:00 5.000

Rewrite this code for 10min and 20min.