I have a large number of images and want to calculate the variance (of each channel) across all of them. I'm having trouble finding an efficient (or even correct) algorithm for that.
I read about Welford's online algorithm, but it is way too slow because it does not vectorize across a single image or a batch of images.
So I'm wondering how to speed it up by using vectorization or by making use of inbuilt variance algorithms.
These are the two functions needed to update/combine the means and variances of two batches. Both functions can be used with vectors (the 3 color channels), and the mean and variance can be acquired from inbuilt methods like `batch.var()`.
Equations taken from: https://notmatthancock.github.io/2017/03/23/simple-batch-stat-updates.html
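For illustration, here is a minimal sketch of what such combine functions could look like, assuming NumPy, channels-last batches of shape `(B, H, W, C)`, and population variance (`ddof=0`, the default of `ndarray.var()`). The names `combine_means`, `combine_vars` and `channel_stats` are hypothetical and chosen only for this example; `m` and `n` are the numbers of samples (pixels per channel) already seen and in the new batch.

```python
import numpy as np

def combine_means(mu_a, mu_b, m, n):
    # Weighted average of two means computed over m and n samples.
    return (m * mu_a + n * mu_b) / (m + n)

def combine_vars(mu_a, mu_b, var_a, var_b, m, n):
    # Population variance of the union of two sample sets:
    # weighted variances plus a correction for the gap between the means.
    return (
        (m / (m + n)) * var_a
        + (n / (m + n)) * var_b
        + (m * n / (m + n) ** 2) * (mu_a - mu_b) ** 2
    )

def channel_stats(batches):
    # Running per-channel statistics; scalars broadcast against (C,) vectors.
    count, mean, var = 0, 0.0, 0.0
    for batch in batches:
        # Flatten everything except the channel axis: (B*H*W, C).
        pixels = batch.reshape(-1, batch.shape[-1]).astype(np.float64)
        n = pixels.shape[0]
        b_mean = pixels.mean(axis=0)   # vectorized per-channel mean
        b_var = pixels.var(axis=0)     # vectorized per-channel variance
        # Update the variance first: it needs the old running mean.
        var = combine_vars(mean, b_mean, var, b_var, count, n)
        mean = combine_means(mean, b_mean, count, n)
        count += n
    return mean, var
```

Each batch is reduced with vectorized NumPy calls, so the Python-level loop runs once per batch rather than once per pixel. If the unbiased (sample) variance is wanted, the final result can be rescaled by `count / (count - 1)`.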
As you can see, one can simplify them a bit by reusing some calculations like `m+n`, but I'm keeping them in this version, which is IMO easier to understand.