tf.gradients() returns a list of [None]

Sorry if this sounds like a repeat. I have been through all of the related questions and found no suitable solutions to my problem's context.

I am trying to build a generative model that outputs probabilities for each tracked day of COVID to input into an SEIR-based epidemiology model.

The generation works. However, I cannot figure out how to train the model. I have to write a custom loss function that runs the day-by-day parameters through a step function for the epidemiology model, which will populate a dataset of "confirmed" and "removed" for each day. I then compare that data to the recorded "confirmed" and "removed" from the Johns Hopkins University (JHU) COVID dataset on GitHub.

I use Mean Absolute Error to calculate a loss between the "confirmed" and "removed" values produced from the generated probabilities and the actual values from the JHU dataset. The issue I am running into is that when I call tape.gradient(), it returns a list of None values. I am stuck here and any assistance would be greatly appreciated.

Here is the code I am using:

Training Step

# Define function to train the model based on one input
loss_fn = MeanAbsoluteError()
optimizer = Adam(learning_rate=0.005)

@tf.function
def train_step(x, y):

  y_pred = np.zeros((3, latent_dim))

  N = tf.constant(int(7_000_000_000), dtype=tf.float64)
  E0 = tf.Variable(int(1000), trainable=False, dtype=tf.float64)
  I0 = tf.Variable(covid_df.iloc[0]["Confirmed"], trainable=False, dtype=tf.float64)
  R0 = tf.Variable(covid_df.iloc[0]["Removed"], trainable=False, dtype=tf.float64)
  S0 = tf.Variable(N - E0 - I0 - R0, trainable=False, dtype=tf.float64)
  u0 = tf.Variable(0, trainable=False, dtype=tf.float64)

  SuEIRs = tf.stack([S0,u0,E0,I0,R0])

  with tf.GradientTape() as tape:
    logits = generator(tf.reshape(x, (batch_size, 4, latent_dim)), training=True)

    betas = logits[0][0]
    sigmas = logits[0][1]
    mus = logits[0][2]
    gammas = logits[0][3]

    for t in range(latent_dim):
      SuEIR_diffs = SuEIR_step(SuEIRs, t, N, betas, sigmas, mus, gammas)

      SuEIRs = SuEIRs + SuEIR_diffs

      confirmed = SuEIRs[3]
      removed = SuEIRs[4]

      # update y_pred
      y_pred[0,t] = float(t+1)
      y_pred[1,t] = confirmed.numpy()
      y_pred[2,t] = removed.numpy()

    # Convert predictions
    y_pred = tf.convert_to_tensor(y_pred)

    # Calculate loss
    loss_value = loss_fn(y[1], y_pred[1]) + loss_fn(y[2], y_pred[2])

  # Calculate the gradient
  grads = tape.gradient(loss_value, generator.trainable_weights)

  print(grads) ##==>> outputs [None, None, None, None]

  # Apply gradients to model
  optimizer.apply_gradients(zip(grads, generator.trainable_weights))
  return loss_value

Training Loop

import time

epochs = 2
for epoch in range(epochs):
  print("\nStart of epoch %d" % (epoch,))
  start_time = time.time()

  # Iterate over the batches of the dataset.
  for step in range(sample_size):
    loss_value = train_step(x_input[step], y_true)

    # Log every 5 batches.
    if step % 5 == 0:
      print(
        "Training loss (for one batch) at step %d: %.4f"
        % (step, float(loss_value))
      )
    print("Time taken: %.2fs" % (time.time() - start_time))

Error output

ValueError: No gradients provided for any variable: ['dense/kernel:0', 'dense/bias:0', 'dense_1/kernel:0', 'dense_1/bias:0'].

loss_value and generator.trainable_weights are populated as expected.

EDIT: Updated code to reflect the suggestions of Myrl Marmarelis and the architecture of TensorFlow's custom training loop guide. I am still having the same issue: the gradients are a list of None values.

1 Answer

Try changing your calls to np.array(...) before calculating the loss (especially on y_pred) to tf.convert_to_tensor(...). You need to build a proper symbolic graph by keeping everything as tf.Tensors. In fact, make sure you are not converting anything to a non-Tensor anywhere along the chain of computation between the model parameters and the loss.
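To illustrate the point, here is a minimal sketch of how a round trip through NumPy disconnects the loss from the variable, while staying in TensorFlow ops keeps the gradient. The variable w and the toy loss are hypothetical stand-ins, not the generator or the SEIR step from the question.

```python
import numpy as np
import tensorflow as tf

w = tf.Variable(2.0)  # hypothetical stand-in for a trainable model weight

# Broken: the value leaves TensorFlow via .numpy(), so the tape cannot
# trace the loss back to w.
with tf.GradientTape() as tape:
    y = w * 3.0
    y_detached = tf.convert_to_tensor(np.array([y.numpy()]))
    loss_detached = tf.reduce_sum(y_detached)
grad_detached = tape.gradient(loss_detached, w)

# Working: every step between w and the loss is a TensorFlow op.
with tf.GradientTape() as tape:
    y = w * 3.0
    y_kept = tf.stack([y])
    loss_kept = tf.reduce_sum(y_kept)
grad_kept = tape.gradient(loss_kept, w)

print(grad_detached)  # None
print(grad_kept)      # a tensor holding d(3w)/dw
```

In the question's code, the assignments into the np.zeros array via confirmed.numpy() and removed.numpy() play the same role as the detached branch here.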

I would also suggest wrapping your training procedure in a @tf.function so that TensorFlow can compile it into a static graph.
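As a rough sketch of what such a training step could look like, the per-day values can be collected in a Python list of tensors and combined with tf.stack instead of being written into a NumPy array. Note that inside a @tf.function tensors are symbolic, so .numpy() is not available there anyway. Everything below (the small Dense model, num_days, and the toy update rule) is assumed for illustration and is not the asker's generator or SuEIR_step.

```python
import tensorflow as tf

num_days = 5  # hypothetical number of simulated days

# Hypothetical stand-in for the generator: maps one input to a per-day rate.
model = tf.keras.Sequential(
    [tf.keras.layers.Dense(num_days, activation="sigmoid")]
)
model.build((None, 3))
optimizer = tf.keras.optimizers.Adam(learning_rate=0.005)

@tf.function
def train_step(x, y_true):
    with tf.GradientTape() as tape:
        rates = model(x, training=True)[0]    # shape (num_days,)
        state = tf.constant(1.0)
        history = []
        for t in range(num_days):             # Python loop, unrolled at trace time
            state = state + rates[t] * state  # toy update, NOT the SEIR step
            history.append(state)             # keep tensors, never .numpy()
        y_pred = tf.stack(history)            # shape (num_days,)
        loss = tf.reduce_mean(tf.abs(y_true - y_pred))  # mean absolute error
    grads = tape.gradient(loss, model.trainable_weights)
    optimizer.apply_gradients(zip(grads, model.trainable_weights))
    return loss, grads

x = tf.ones((1, 3))
y_true = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0])
loss, grads = train_step(x, y_true)
print(all(g is not None for g in grads))
```

Returning grads from the step is only for inspection here; a real training step would normally return just the loss.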