I am not looking for particular code, but rather for a general theory of what could be done about this problem.
We have a stream of incoming data. The entries are quite large, so it is infeasible to store more than a few of them, say M, in memory. The goal is to estimate the average of the last K entries seen in the stream, with K >> M. We do have some statistical information about the data, such as its mean and variance, which we can realistically compute online as the data arrives. The incoming data is the result of several unknown stochastic processes.
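
To make the "computed online" part concrete, here is a minimal sketch (in Python, with names of my own choosing) of how the running mean and variance can be maintained with Welford's update as entries stream in:

```python
class RunningStats:
    """Running mean/variance over the whole stream (Welford's update)."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # sum of squared deviations from the current mean

    def update(self, x: float) -> None:
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    @property
    def variance(self) -> float:
        return self.m2 / self.n if self.n > 1 else 0.0
```

Note that this gives statistics over the entire stream so far, not over the last K entries, which is exactly the gap I am asking about.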
How could we approximate the mean of the last K elements in such a setting?
Of course, if M were close to K or greater than K, the problem could be solved trivially by storing the past M values and computing the mean from them. But with a large gap between the two values (e.g., M = 4, K = 400), this approximation quickly degrades in quality. Compressing the data points to reduce the memory they occupy does not gain enough (M' <= 3M, where M' is the number of compressed entries we can store). A linear/quadratic/cubic approximation of the past K entries (based on just the last value we have seen) results in an error roughly as bad as just using the past M entries.
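
For concreteness, this is the baseline I mean (a minimal Python sketch; the function name and parameters are mine): keep only the last M entries in a ring buffer and report their mean as a stand-in for the last-K mean. With M = 4 and K = 400 the estimator only ever sees 1% of the window, which is why it degrades.

```python
from collections import deque

def last_m_mean(stream, m=4):
    """Baseline: estimate the mean of the last K entries using only the last M.

    `stream` is any iterable of numbers; `m` is the memory budget in entries.
    Yields the current estimate after each new entry.
    """
    window = deque(maxlen=m)  # ring buffer holding only the last M entries
    for x in stream:
        window.append(x)
        yield sum(window) / len(window)  # stands in for the true last-K mean
```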