I've been playing around with automatic gradients in TensorFlow and I had a question. If we are using an optimizer, say Adam, when is the momentum algorithm applied to the gradient? Is it applied when we call tape.gradient(loss, model.trainable_variables) or when we call model.optimizer.apply_gradients(zip(dtf_network, model.trainable_variables))?
Thanks!
tape.gradient computes the gradients straightforwardly, without reference to an optimizer. Since momentum is part of the optimizer, the tape does not include it. AFAIK momentum is usually implemented by adding extra variables in the optimizer that store the running averages. All of this is handled in optimizer.apply_gradients.
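
To make the split concrete, here is a minimal sketch of a custom training step (the toy model, data, and loss are made up for illustration): tape.gradient hands back the plain gradients, and apply_gradients is where Adam's moment estimates are updated and used.

```python
import tensorflow as tf

# Toy model, optimizer, and data -- purely illustrative.
model = tf.keras.Sequential([tf.keras.layers.Dense(1)])
optimizer = tf.keras.optimizers.Adam(learning_rate=1e-3)
loss_fn = tf.keras.losses.MeanSquaredError()

x = tf.random.normal((8, 4))
y = tf.random.normal((8, 1))

with tf.GradientTape() as tape:
    pred = model(x, training=True)
    loss = loss_fn(y, pred)

# Raw dL/dw gradients: no momentum, no running averages,
# no learning rate applied yet.
grads = tape.gradient(loss, model.trainable_variables)

# Here the optimizer updates its internal slot variables (the first-
# and second-moment running averages) and uses them, together with
# the learning rate, to update the weights.
optimizer.apply_gradients(zip(grads, model.trainable_variables))
```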