Why does LayerNorm use a biased standard deviation estimator?


The LayerNorm computation in the original paper, "Layer Normalization", uses a biased estimator of the standard deviation (see equation 3 below). Why does it use a biased estimator rather than an unbiased one?

[Image: equation 3 from the Layer Normalization paper]
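To make the two estimators concrete, here is a minimal sketch in plain Python. It assumes the biased version divides the summed squared deviations by H (the number of hidden units) and the corrected version divides by H - 1. The function name `layer_norm`, the `ddof` parameter, and the input values are made up for illustration, and the learned gain and bias are left out.

```python
import math

def layer_norm(x, ddof=0, eps=0.0):
    """Normalize one vector of activations (hypothetical helper).

    ddof=0 divides the squared deviations by H (biased estimator);
    ddof=1 divides by H - 1 (Bessel-corrected estimator).
    """
    h = len(x)
    mean = sum(x) / h
    var = sum((v - mean) ** 2 for v in x) / (h - ddof)
    return [(v - mean) / math.sqrt(var + eps) for v in x]

activations = [1.0, 2.0, 4.0, 7.0]  # made-up values, H = 4

biased = layer_norm(activations, ddof=0)
unbiased = layer_norm(activations, ddof=1)

# With eps = 0, the two outputs differ only by the constant sqrt((H - 1) / H).
h = len(activations)
scale = math.sqrt((h - 1) / h)
for b, u in zip(biased, unbiased):
    assert math.isclose(u, b * scale)
```

In this sketch the choice of estimator only rescales the normalized vector by a constant that depends on H. That is an observation about the arithmetic, not a statement of the authors' reasoning.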
