Get name of slots for momentum optimizer in tensorflow


I want to get the slot names of the Momentum optimizer in TensorFlow using get_slot_names, as explained here on the TensorFlow webpage. I am using the following line in my code to get them:

slots=tf.train.MomentumOptimizer(learning_rate=1e-3, momentum=0.9,).get_slot_names()

I ran my graph, but when I print slots it returns only an empty list. Any help will be appreciated.

By the way, my network works fine for minimizing the loss and everything else. I also tried other optimizers, but they have the same problem.

I am using tf 1.3 on Ubuntu 14. Thanks,


There are 2 best solutions below

3
On

The only problem in the code is that the slot variables are created during the minimize call (actually in apply_gradients). Since you call get_slot_names() before that, the list is empty.

So instead of

slots=tf.train.MomentumOptimizer(learning_rate=1e-3, momentum=0.9,).get_slot_names() 

you should do

opt = tf.train.MomentumOptimizer(learning_rate=1e-3, momentum=0.9,)

train_op = opt.minimize(your_loss) # or anything else

slots = opt.get_slot_names() # this will work

The reason is simple: many slots are variable-specific. For example, methods like Adam create slots for each optimized variable, and before .minimize is called the optimizer does not know which variables it will be optimizing.

In particular, MomentumOptimizer keeps accumulated gradients for each variable. Consequently, they cannot be created before minimize is called. These accumulated gradients are stored in the "momentum" slot (quite a bad choice of name, but this is where they live in TF).
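To make the ordering issue concrete, here is a small pure-Python sketch of the idea. ToyMomentum and its dictionaries are hypothetical names made up for illustration, not TensorFlow's actual implementation; the update rule follows the documented momentum formula (accumulation = momentum * accumulation + gradient, then variable -= learning_rate * accumulation). The point is only that the slot dictionary stays empty until the optimizer is handed some variables.

```python
# Toy illustration only: ToyMomentum is a hypothetical class, not TensorFlow code.
class ToyMomentum:
    def __init__(self, learning_rate, momentum):
        self.learning_rate = learning_rate
        self.momentum = momentum
        # slot name -> {variable name -> slot value}; empty until variables are seen
        self._slots = {}

    def get_slot_names(self):
        return sorted(self._slots)

    def get_slot(self, var_name, slot_name):
        return self._slots.get(slot_name, {}).get(var_name)

    def apply_gradients(self, grads, variables):
        # Slots are created here, the first time the optimizer sees each variable.
        accumulators = self._slots.setdefault("momentum", {})
        for name, grad in grads.items():
            acc = self.momentum * accumulators.get(name, 0.0) + grad
            accumulators[name] = acc
            variables[name] -= self.learning_rate * acc


opt = ToyMomentum(learning_rate=0.5, momentum=0.5)
names_before = opt.get_slot_names()  # queried before any variable is known

variables = {"weight": 1.0}
opt.apply_gradients({"weight": 2.0}, variables)
names_after = opt.get_slot_names()  # queried after the first update

print(names_before)  # []
print(names_after)   # ['momentum']
print(opt.get_slot("weight", "momentum"))  # 2.0
```

Asking for the slot names on a freshly constructed optimizer, as in the question, corresponds to names_before here.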

3
On

Slot names are optimizer-specific. If you want the slot for each trainable variable, you can use the get_slot method with the correct slot name. The slot name created (by default) by MomentumOptimizer is momentum. Below is a simple example to illustrate these points.

import numpy as np
import tensorflow as tf

x_input = np.linspace(0, 30, 200)
y_input = 4 * x_input + 6
W = tf.Variable(0.0, name="weight")
b = tf.Variable(0.0, name="bias")
X = tf.placeholder(tf.float32, name='InputX')
Y = tf.placeholder(tf.float32, name='InputY')
Y_pred = X * W + b

loss = tf.reduce_mean(tf.square(Y_pred - Y))

# define the optimizer
optimizer = tf.train.MomentumOptimizer(learning_rate=0.001, momentum=.9)

# training op
train_op = optimizer.minimize(loss)

# print the optimizer slot name
print(optimizer.get_slot_names())

# Results:['momentum']

# print the slot created for each trainable variable, using the slot name from the result above
for v in tf.trainable_variables():
    print(optimizer.get_slot(v, 'momentum'))

# Results: <tf.Variable 'weight/Momentum:0' shape=() dtype=float32_ref>
#          <tf.Variable 'bias/Momentum:0' shape=() dtype=float32_ref>