Backpropagation through composition of neural networks where one has fixed parameters


I am currently writing some code that involves two neural networks, call them f and g. In this structure, I feed batch data x into f, and f's output is fed into g; I am trying to train f through some loss function L computed on g's output. For this problem, g has already been pretrained and is much larger than f. I do not want to perform any updates on g, I simply want to train f. The issue I am having is that performing backpropagation to update the parameters of f is taking much longer than expected: it is on the same order as what it took to train g. I am now thinking that since f's output is fed into g, backpropagation needs to run through both networks even though g won't be updated. Is this how backpropagation works, and could it be the source of my slow code? And if so, are there any workarounds so I don't need to compute the backpropagation through the large network g?
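
For concreteness, here is a minimal PyTorch-style sketch of the kind of setup I mean (the layer sizes, optimizer, and loss are placeholders; only the f-into-frozen-g structure matters):

```python
import torch
import torch.nn as nn

# Hypothetical stand-ins for the two networks: f is small and trainable,
# g is large, pretrained, and frozen.
f = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 128))
g = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 10))

# Freeze g so the optimizer never updates it and no parameter gradients
# are stored for it.
for p in g.parameters():
    p.requires_grad = False

optimizer = torch.optim.Adam(f.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

x = torch.randn(16, 32)                 # one batch of data
target = torch.randint(0, 10, (16,))

out = g(f(x))                           # f's output is fed into the frozen g
loss = loss_fn(out, target)
loss.backward()                         # the backward pass still traverses g to reach f
optimizer.step()
```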


1 Answer

Answer by mkissel:

In backpropagation, the errors are propagated backwards from the outputs to the neurons of the network in order to compute the gradients. In your arrangement, the frozen network g is closer to the outputs than the network f that you intend to train. Hence, you need to propagate the errors through g in order to compute the gradients for f. So as far as I know, you cannot avoid these computations in your setting if you are using the backpropagation algorithm for training.
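
To see why, write y_f = f(x; θ_f) for the output of f and y_g = g(y_f; θ_g) for the output of the frozen g (these symbols are my own shorthand, not taken from the question). The chain rule gives:

```latex
% Gradient of the loss L with respect to f's parameters:
\frac{\partial L}{\partial \theta_f}
  = \frac{\partial L}{\partial y_g}
    \cdot \frac{\partial y_g}{\partial y_f}
    \cdot \frac{\partial y_f}{\partial \theta_f}
% The middle factor, \partial y_g / \partial y_f, is exactly the backward
% pass through the large frozen network g; it cannot be skipped, even
% though \partial L / \partial \theta_g is never needed.
```

The only part you can skip is computing and storing the gradients with respect to g's own parameters (for example by marking them as not requiring gradients), which saves some memory and a little time; the backward computations through g's layers are still required to reach f.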

If this is a problem for you, then you could try another, non-gradient-based training algorithm. Possible methods for this are genetic or evolutionary algorithms; a toy sketch follows below.
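
As an illustration only (not a tuned implementation), here is a toy (1+1)-style evolutionary strategy in PyTorch. The names f, g, x, target, and loss_fn are assumed from the question's setup. No backward pass is built at all, so the large frozen g is only ever run forward:

```python
import torch

def evaluate(f, g, x, target, loss_fn):
    """Forward-only evaluation of the loss; no autograd graph is built."""
    with torch.no_grad():
        return loss_fn(g(f(x)), target).item()

def evolve(f, g, x, target, loss_fn, sigma=0.01, steps=100):
    """Greedy (1+1) evolutionary strategy on f's parameters."""
    best = evaluate(f, g, x, target, loss_fn)
    for _ in range(steps):
        # Perturb f's parameters with Gaussian noise.
        noise = [torch.randn_like(p) * sigma for p in f.parameters()]
        with torch.no_grad():
            for p, n in zip(f.parameters(), noise):
                p.add_(n)
        candidate = evaluate(f, g, x, target, loss_fn)
        if candidate < best:
            best = candidate                 # keep the mutation
        else:
            with torch.no_grad():            # revert the mutation
                for p, n in zip(f.parameters(), noise):
                    p.sub_(n)
    return best
```

Keep in mind that such methods still need a forward pass through the large g for every candidate they evaluate, and they typically need many more evaluations than gradient-based training, so whether this is actually faster depends on the sizes of f and g.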