Tuning batch size using Keras Tuner with a custom loss function that could be incorrect

I have a custom loss function that could be incorrect. It treats target values of -1 as missing and excludes them from the mean:

import tensorflow as tf

@tf.keras.saving.register_keras_serializable('my_package')
def nan_mse(y_true2, y_pred2):
  # Zero out the squared error wherever the target equals the
  # missing-value sentinel -1.
  squared_difference = tf.where(tf.math.equal(y_true2, tf.constant(-1.)),
                                tf.zeros_like(y_true2),
                                tf.math.square(tf.math.subtract(y_true2, y_pred2)))
  # Average only over the non-missing entries.
  nonnan = tf.math.not_equal(y_true2, tf.constant(-1.))
  num_nonnan = tf.math.reduce_sum(tf.cast(nonnan, tf.float32))
  return tf.math.reduce_sum(squared_difference) / num_nonnan

I also defined a NumPy version of this masked MSE for evaluation:

import numpy as np

def nan_mse2(actual, predicted):
  # Replace the -1 sentinel with NaN and average the squared errors
  # over the remaining entries.
  actual2 = np.where(actual == -1, np.nan, actual)
  mse1 = np.nanmean(np.square(actual2 - predicted))
  return mse1
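
As a quick sanity check with made-up numbers (not my real data), the two functions agree on a single batch:

y_true_demo = np.array([[1., -1.], [2., 3.]], dtype=np.float32)
y_pred_demo = np.array([[0., 5.], [2., 1.]], dtype=np.float32)

print(nan_mse(tf.constant(y_true_demo), tf.constant(y_pred_demo)).numpy())  # ~1.6667
print(nan_mse2(y_true_demo, y_pred_demo))                                   # ~1.6667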

Then I tuned the batch size using the Keras Hyperband tuner (a rough sketch of my tuner setup is included further below). When I evaluated the best model returned by the tuner, I made two computations for the validation set, which has 40 samples:

loss_val2 = (nan_mse2(y_val[0:8,0,:], y_val_pred[0:8,:])*8/5 +\
            nan_mse2(y_val[8:16,0,:], y_val_pred[8:16,:])*8/5 +\
            nan_mse2(y_val[16:24,0,:], y_val_pred[16:24,:])*8/5 +\
            nan_mse2(y_val[24:32,0,:], y_val_pred[24:32,:])*8/5 +\
            nan_mse2(y_val[32:40,0,:], y_val_pred[32:40,:])*8/5)

The result is closer to `loss_val = best_model.evaluate(X_val, y_val, batch_size=8, steps=5)`.

loss_val2 = (nan_mse2(y_val[0:16,0,:], y_val_pred[0:16,:])*16*16/40 +\
            nan_mse2(y_val[16:32,0,:], y_val_pred[16:32,:])*16*16/40 +\
            nan_mse2(y_val[32:40,0,:], y_val_pred[32:40,:])*8*8/40)

The result is closer to `loss_val = best_model.evaluate(X_val, y_val, batch_size=16, steps=3)`.

When I rearrange the first computation:

loss_val2 = (nan_mse2(y_val[0:8,0,:], y_val_pred[0:8,:])*8/5 +\
            nan_mse2(y_val[8:16,0,:], y_val_pred[8:16,:])*8/5 +\
            nan_mse2(y_val[16:24,0,:], y_val_pred[16:24,:])*8/5 +\
            nan_mse2(y_val[24:32,0,:], y_val_pred[24:32,:])*8/5 +\
            nan_mse2(y_val[32:40,0,:], y_val_pred[32:40,:])*8/5)

          = (nan_mse2(y_val[0:16,0,:], y_val_pred[0:16,:])*16/5 +\
            nan_mse2(y_val[16:32,0,:], y_val_pred[16:32,:])*16/5 +\
            nan_mse2(y_val[32:40,0,:], y_val_pred[32:40,:])*8*8/40)
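
For reference, here is roughly how I set up the tuner to search over the batch size. This is a minimal sketch: the architecture, the candidate batch sizes, and the Hyperband settings are placeholders (and X_train/y_train stand in for my training arrays), not my actual code.

import keras_tuner as kt

class MyHyperModel(kt.HyperModel):
  def build(self, hp):
    # Placeholder architecture; my real model is different.
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(32, activation='relu'),
        tf.keras.layers.Dense(10),  # placeholder output width
    ])
    model.compile(optimizer='adam', loss=nan_mse)
    return model

  def fit(self, hp, model, *args, **kwargs):
    # The batch size is tuned by overriding HyperModel.fit().
    return model.fit(*args,
                     batch_size=hp.Choice('batch_size', [8, 16, 32]),
                     **kwargs)

tuner = kt.Hyperband(MyHyperModel(), objective='val_loss', max_epochs=10)
tuner.search(X_train, y_train, validation_data=(X_val, y_val))
best_model = tuner.get_best_models(num_models=1)[0]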

Can my custom loss function be used to tune the batch size? If not, how can I correct it?

Thank you!

UPDATE: According to the Keras losses documentation, I might need to add axis=-1 to tf.math.reduce_sum, so that the loss is calculated per sample; Keras then averages these per-sample values over the batch. The corrected custom loss function is:

def nan_mse(y_true2, y_pred2):
  squared_difference = tf.where(tf.math.equal(y_true2, tf.constant(-1.)),
                                tf.zeros_like(y_true2),
                                tf.math.square(tf.math.subtract(y_true2, y_pred2)))
  nonnan = tf.math.not_equal(y_true2, tf.constant(-1.))
  # Reducing over the last axis yields one masked MSE per sample
  # instead of a single scalar for the whole batch.
  num_nonnan = tf.math.reduce_sum(tf.cast(nonnan, tf.float32), axis=-1)
  return tf.math.divide(tf.math.reduce_sum(squared_difference, axis=-1), num_nonnan)
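
As a quick check with made-up numbers, averaging the per-sample values returned by this version matches the per-sample average computed with nan_mse2:

y_true_demo = np.array([[1., -1., 3.], [0., 2., -1.]], dtype=np.float32)
y_pred_demo = np.array([[1.5, 9., 2.], [0., 1., 7.]], dtype=np.float32)

per_sample = nan_mse(tf.constant(y_true_demo), tf.constant(y_pred_demo)).numpy()
print(per_sample)         # [0.625 0.5] -- one masked MSE per sample
print(per_sample.mean())  # 0.5625 -- what Keras reports for this batch
print(np.mean([nan_mse2(y_true_demo[i:i+1], y_pred_demo[i:i+1]) for i in range(2)]))  # 0.5625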

After running a test hyperparameter search, I executed:

loss_val = best_model.evaluate(X_val, y_val, batch_size=16, steps=3)
loss_val_2 = best_model.evaluate(X_val, y_val, batch_size=8, steps=5)
loss_val_3 = best_model.evaluate(X_val, y_val)

y_val_pred = best_model.predict(X_val)  # predictions for the manual per-sample check
loss_val3 = 0.
for i in range(40):
  loss_val3 += nan_mse2(y_val[i:i+1,0,:], y_val_pred[i:i+1,:])
loss_val3 = loss_val3/40
print(loss_val)
print(loss_val_2)
print(loss_val_3)
print(loss_val3)

The results are:

0.01044310349971056
0.01044310349971056
0.01044310349971056
0.010443102978751995

The loss now appears to be the average of the per-sample MSEs, regardless of batch size, so the problem seems to be corrected!
