ReLU is not differentiable at 0, but PyTorch's implementation has to handle that case somehow. Is the derivative at 0 set to 0 by default, or something else?
I tried setting the weights and bias to zero (so that the input to ReLU is zero) and checked the gradients during backpropagation: the gradient of the weights is 0, except for the last conv layer in the residual block, where it is not zero.
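A minimal sketch of the kind of check I mean (the conv shapes and sizes here are only illustrative, not my actual model):

```python
import torch
import torch.nn as nn

# Zero out a conv layer so the input to the following ReLU is exactly 0
conv = nn.Conv2d(3, 3, kernel_size=3, padding=1)
nn.init.zeros_(conv.weight)
nn.init.zeros_(conv.bias)

x = torch.randn(1, 3, 8, 8)
out = torch.relu(conv(x))
out.sum().backward()

# If ReLU'(0) == 0, the upstream gradient is killed and the weight gradient is all zeros
print(conv.weight.grad.abs().max())  # tensor(0.)
```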
PyTorch implements the derivative of ReLU at x = 0 by outputting zero, according to this article. The article also goes on to elaborate on the differences in outcomes between using ReLU'(0) = 0 and ReLU'(0) = 1, and notes that the effect is stronger when the numbers are in lower precision.
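You can verify this behavior directly (a minimal sketch, just passing a zero input through torch.relu):

```python
import torch

# Gradient of ReLU evaluated exactly at x = 0
x = torch.tensor(0.0, requires_grad=True)
y = torch.relu(x)
y.backward()
print(x.grad)  # tensor(0.) -> PyTorch uses ReLU'(0) = 0
```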