Consider an infinite stream of numbers. I want to calculate the mean and standard deviation of the stream for the latest N numbers in O(1) memory complexity (because N is larger than the memory that I have). On top of that, the stream is distributed and the average and standard deviation could only be calculated in a distributed manner as well (there's no single point that the whole stream passes through it).
I understand this is impossible. But how about approximate solutions?
To calculate the mean and standard deviation of a sequence of numbers, All you need to collect is:
It's easy to collect these values for your whole stream in constant space. If the stream is partitioned into substreams, then you can even collect them independently for each substream, and then add them together.
Your difficulty is that you only want a sliding window, so you have to subtract the sums for all the values older than N samples ago. This takes O(N) memory if you need to update every sample, because you need to remember all the sums to subtract.
If N is very large, however, then usually you don't need updates every sample. If you only need updates every N/1000 samples, then you only need to remember the count and sums for 1000 blocks of N/1000 samples, and that is constant space.