I want to use the running means and standard deviations from training rather than batch statistics, since my model seems to diverge when I use batch statistics (as outlined in "When should one call .eval() and .train() when doing MAML with the PyTorch higher library?"). How does one do that?
I am asking because the running means in my model are all zero, even though no training has been done yet:
Out[1]: BatchNorm2d(32, eps=0.001, momentum=0.95, affine=True, track_running_stats=True)
args.base_model.model.features.norm1.running_mean
Out[2]:
tensor([0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0.])
Are these not saved in a checkpoint after training? Should they have been saved?
The docs suggest they should have been (https://pytorch.org/docs/stable/_modules/torch/nn/modules/batchnorm.html#BatchNorm2d, https://pytorch.org/docs/stable/generated/torch.nn.BatchNorm2d.html):
Also by default, during training this layer keeps running estimates of its computed mean and variance, which are then used for normalization during evaluation.
But my running means are zero vectors... :/ ?
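For reference, here is a minimal sketch of the behaviour I understand the docs to describe. It is not my actual model: the layer size, the toy input, and the variable names are made up for illustration, and only `eps` and `momentum` match my layer. As far as I can tell, a freshly constructed layer starts with a zero `running_mean`, a forward pass in `.train()` mode updates it, `.eval()` mode leaves it alone, and the buffers are part of `state_dict()`.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in for the layer in the question (4 channels instead of 32).
bn = nn.BatchNorm2d(4, eps=1e-3, momentum=0.95)

# A freshly constructed layer has running_mean = 0 and running_var = 1.
initial_mean = bn.running_mean.clone()

# Toy input whose per-channel mean is clearly non-zero.
x = torch.randn(8, 4, 5, 5) * 2.0 + 3.0

# In train mode the forward pass normalizes with batch statistics
# and updates the running estimates as a side effect.
bn.train()
_ = bn(x)
mean_after_train = bn.running_mean.clone()

# In eval mode the forward pass uses the running estimates
# and does not modify them.
bn.eval()
_ = bn(x)
mean_after_eval = bn.running_mean.clone()

# The running estimates are buffers, so they are part of the state dict
# and are restored by load_state_dict.
state = bn.state_dict()
has_buffers = "running_mean" in state and "running_var" in state

restored = nn.BatchNorm2d(4, eps=1e-3, momentum=0.95)
restored.load_state_dict(state)
restored_mean = restored.running_mean.clone()

print("initial:", initial_mean)
print("after train forward:", mean_after_train)
print("after eval forward:", mean_after_eval)
print("buffers in state_dict:", has_buffers)
```

If that is right, all-zero running means would point to either a model that never ran a forward pass in train mode, or a checkpoint that was not loaded into this module.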