Dropout mask in backpropagation


I have a question about implementing forward and backward propagation with dropout.

I understand the logic behind applying dropout during forward propagation. However, during backpropagation, why do we have to multiply the gradients of the masked layers by the same dropout masks that were used during forward propagation?

For example, in a neural network with 3 layers where dropout masks are applied to layers 1 and 2, the final output A3 is computed from these masked layers. During backpropagation, we use this dropout-affected A3 to start computing the various gradients.
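
To make the setup concrete, here is roughly the forward pass I have in mind (a simplified, self-contained sketch using inverted dropout; the shapes, keep_prob value, and activation choices are just placeholders, not my actual code):

import numpy as np

np.random.seed(0)
m = 5                                        # number of examples (placeholder)
X = np.random.randn(4, m)
Y = np.random.randint(0, 2, (1, m))
W1, b1 = np.random.randn(6, 4), np.zeros((6, 1))
W2, b2 = np.random.randn(6, 6), np.zeros((6, 1))
W3, b3 = np.random.randn(1, 6), np.zeros((1, 1))
keep_prob = 0.8

# Layer 1 with dropout
Z1 = np.dot(W1, X) + b1
A1 = np.maximum(0, Z1)                       # ReLU
D1 = np.random.rand(*A1.shape) < keep_prob   # dropout mask for layer 1
A1 = A1 * D1 / keep_prob                     # shut off nodes and rescale (inverted dropout)

# Layer 2 with dropout
Z2 = np.dot(W2, A1) + b2
A2 = np.maximum(0, Z2)
D2 = np.random.rand(*A2.shape) < keep_prob   # dropout mask for layer 2
A2 = A2 * D2 / keep_prob

# Output layer (no dropout)
Z3 = np.dot(W3, A2) + b3
A3 = 1 / (1 + np.exp(-Z3))                   # sigmoid; this is the "dropout-affected" A3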

dZ3 = A3 - Y                      # A3 was already computed from the dropout-masked activations
dW3 = 1. / m * np.dot(dZ3, A2.T)
dA2 = np.dot(W3.T, dZ3)
dA2 = dA2 * D2                    # Why do I need this line?

I expected dZ3, which was computed from the dropout-affected A3, to have already shut off the relevant nodes when propagated back through W3.T.
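
In other words, I would have expected a hypothetical check like the following to pass without ever multiplying by D2 (continuing the sketch above):

dZ3 = A3 - Y
dA2 = np.dot(W3.T, dZ3)
print(np.allclose(dA2[D2 == 0], 0))   # I expected this to be True even without multiplying dA2 by D2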

