import torch

# Initialize an m x n 2-D array, where each entry is some neural network
phi = [[None] * n for _ in range(m)]
for i in range(m):
    for j in range(n):
        phi[i][j] = NeuralNetwork()

# Let k, i be arbitrary indices
p1 = torch.nn.utils.parameters_to_vector(phi[k][i - 1].parameters())
# pseudocode: what I want is the flattened mean of the parameters over phi[:][i - 1]
p2 = torch.nn.utils.parameters_to_vector(mean of phi[:][i - 1])
I basically want to compute the mean squared error between the parameters of phi[k][i - 1] and the average of the parameters over the entire column phi[:][i - 1], i.e. ((p1 - p2)**2).sum().
I tried the following:
tmp = [x.parameters() for x in self.phi[:][i - 1]]
mean_params = torch.mean(torch.stack(tmp), dim=0)
p2 = torch.nn.utils.parameters_to_vector(mean_params)
But this doesn't work, because tmp is a list of generator objects. More specifically, my problem is how to compute the mean of the parameters from those generator objects.
First, we can define a function that computes the average parameter vector for a list of models. To avoid holding a copy of every model's parameters in memory at the same time, we probably want to compute this as a running sum. For example:
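A minimal sketch, assuming each entry of phi is a torch.nn.Module (the helper name average_parameters_vector is mine):

import torch
from torch.nn.utils import parameters_to_vector

def average_parameters_vector(models):
    # Keep a running sum of each model's flattened parameter vector,
    # so only one extra full-size vector is alive at any time.
    total = None
    for model in models:
        vec = parameters_to_vector(model.parameters())
        total = vec if total is None else total + vec
    return total / len(models)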
Then you can just create p1 and p2 and compute the mean squared error between them.
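Using the helper above and the phi[:][i - 1] indexing from the question:

p1 = parameters_to_vector(phi[k][i - 1].parameters())
p2 = average_parameters_vector(phi[:][i - 1])
error = ((p1 - p2) ** 2).sum()  # as defined in the question; use .mean() for a true MSE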
If you really want a one-line solution, which is possibly also the fastest, you could stack all the parameters of the models in phi[:][i - 1] into a single tensor and then mean-reduce it.
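Roughly, something like:

# materializes a full copy of every model's parameters before reducing
p2 = torch.stack([parameters_to_vector(m.parameters()) for m in phi[:][i - 1]]).mean(dim=0)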
But as mentioned earlier, this will significantly increase memory usage, especially if your models have millions of parameters, as is often the case. On the other extreme, if you are very concerned about memory usage, you could instead compute the average of each individual parameter tensor at a time.
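A sketch of that per-parameter version, assuming all models share the same architecture so their parameters() line up:

models = phi[:][i - 1]
avg_params = []
for group in zip(*(m.parameters() for m in models)):
    # group holds the corresponding parameter tensor from each model,
    # so the temporaries here are only the size of a single parameter tensor
    avg_params.append(sum(group) / len(models))
p2 = torch.cat([t.flatten() for t in avg_params])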