I have the following sample dataframe:
| ID | date | data | avg |
|---|---|---|---|
| 1 | 18/3/2024 | 45.72991477 | 9.845398369 |
| 1 | 19/3/2024 | 16.66879054 | 12.22665416 |
| 1 | 20/3/2024 | 0.956178897 | 12.29306183 |
| 1 | 21/3/2024 | 0.008082319 | 12.27617302 |
| 1 | 22/3/2024 | 18.4076123 | 14.88212601 |
| 1 | 23/3/2024 | 8.3212202 | 12.87506446 |
| 1 | 24/3/2024 | 2.950587398 | 13.29176949 |
| 1 | 25/3/2024 | 79.01593876 | |
| 2 | 18/3/2024 | 5.234445299 | 1.555186844 |
| 2 | 19/3/2024 | 13.36603398 | 3.441331288 |
| 2 | 20/3/2024 | 2.76175807 | 3.835868155 |
| 2 | 21/3/2024 | 0.007854825 | 3.710561654 |
| 2 | 22/3/2024 | 32.75987619 | 8.361661724 |
| 2 | 23/3/2024 | 0.363040191 | 7.823976139 |
| 2 | 24/3/2024 | 3.461841086 | 8.279264234 |
| 2 | 25/3/2024 | 6.28783374 | |
The avg column is essentially a 7-day moving average. I currently populate it with the following code:

```python
index["avg"] = index.groupby("ID")["data"].transform(lambda x: x.rolling(7).mean())
```
I understand that every time this code runs, it recomputes the whole avg column, while my data grows incrementally on a daily basis. Is there a way to compute only the latest date's avg for each ID, using that ID's last 7 days of data (including the newly added row)? Note that the rolling average is per ID. Would this approach be more efficient than my original approach of recomputing the whole avg column on every run?
I think calculating the rolling mean for only the last 7 days will definitely be faster than recomputing it for the entire dataframe. Here is the code I tried with some dummy data.
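A minimal sketch of that incremental idea (the dummy data and names like `new_rows` are hypothetical; it assumes exactly one row per ID per day, so a 7-row tail equals a 7-day window): do the full rolling computation once, then when new rows arrive, fill avg only for the latest date by averaging each ID's last 7 rows.

```python
import pandas as pd

# Dummy data in the question's shape: two IDs, one row per day
dates = pd.date_range("2024-03-18", periods=8)
df = pd.DataFrame({
    "ID": [1] * 8 + [2] * 8,
    "date": list(dates) * 2,
    "data": [float(v) for v in range(16)],
})

# One-off full recompute (the original approach)
df["avg"] = df.groupby("ID")["data"].transform(lambda x: x.rolling(7).mean())

# Later, one new day arrives for each ID; append it with avg unset
new_rows = pd.DataFrame({
    "ID": [1, 2],
    "date": [pd.Timestamp("2024-03-26")] * 2,
    "data": [20.0, 30.0],
})
df = pd.concat([df, new_rows], ignore_index=True)

# Incremental update: average each ID's last 7 rows (which include
# the new row) and write the result only into the latest-date rows
mask = df["date"].eq(df["date"].max())
last7_mean = df.groupby("ID")["data"].apply(lambda s: s.tail(7).mean())
df.loc[mask, "avg"] = df.loc[mask, "ID"].map(last7_mean)
```

This touches only the two newly appended rows instead of recomputing every group, which is where the saving comes from on a large frame; note that masking by the latest date (rather than by `avg.isna()`) avoids accidentally overwriting the NaNs that `rolling(7)` legitimately leaves in each ID's first six rows.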