I have a custom loss function that may be incorrect:
```python
import tensorflow as tf

@tf.keras.saving.register_keras_serializable('my_package')
def nan_mse(y_true2, y_pred2):
    # treat targets equal to -1 as missing and exclude them from the MSE
    squared_difference = tf.where(tf.math.equal(y_true2, tf.constant(-1.)),
                                  tf.zeros_like(y_true2),
                                  tf.math.square(tf.math.subtract(y_true2, y_pred2)))
    nonnan = tf.math.not_equal(y_true2, tf.constant(-1.))
    num_nonnan = tf.math.reduce_sum(tf.cast(nonnan, tf.float32))
    return tf.math.reduce_sum(squared_difference) / num_nonnan
```
I defined another MSE function for evaluation:
```python
import numpy as np

def nan_mse2(actual, predicted):
    # replace the -1 sentinel with NaN so nanmean ignores masked entries
    actual2 = np.where(actual == -1, np.nan, actual)
    mse1 = np.nanmean(np.square(actual2 - predicted))
    return mse1
```
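As a quick sanity check of the masking, here is `nan_mse2` on toy data (the arrays below are made up for illustration):

```python
import numpy as np

def nan_mse2(actual, predicted):
    actual2 = np.where(actual == -1, np.nan, actual)
    return np.nanmean(np.square(actual2 - predicted))

actual = np.array([[1.0, -1.0, 3.0],
                   [2.0,  2.0, -1.0]])
predicted = np.array([[1.5, 0.0, 2.0],
                      [2.0, 1.0, 9.0]])
# the two -1 entries are masked; the 4 valid squared errors are
# 0.25, 1.0, 0.0, 1.0 -> mean = 0.5625
print(nan_mse2(actual, predicted))  # 0.5625
```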
Then I tuned the batch size with the Keras Hyperband tuner. When I evaluate the best model found by the tuner, I compute the validation loss in two ways (the validation set has 40 samples):
```python
loss_val2 = (nan_mse2(y_val[0:8,0,:], y_val_pred[0:8,:])*8/5 +
             nan_mse2(y_val[8:16,0,:], y_val_pred[8:16,:])*8/5 +
             nan_mse2(y_val[16:24,0,:], y_val_pred[16:24,:])*8/5 +
             nan_mse2(y_val[24:32,0,:], y_val_pred[24:32,:])*8/5 +
             nan_mse2(y_val[32:40,0,:], y_val_pred[32:40,:])*8/5)
```
The result is closer to `loss_val = best_model.evaluate(X_val, y_val, batch_size=8, steps=5)`.
```python
loss_val2 = (nan_mse2(y_val[0:16,0,:], y_val_pred[0:16,:])*16*16/40 +
             nan_mse2(y_val[16:32,0,:], y_val_pred[16:32,:])*16*16/40 +
             nan_mse2(y_val[32:40,0,:], y_val_pred[32:40,:])*8*8/40)
```
The result is closer to `loss_val = best_model.evaluate(X_val, y_val, batch_size=16, steps=3)`.
When I rearrange the first computation:
```python
loss_val2 = (nan_mse2(y_val[0:8,0,:], y_val_pred[0:8,:])*8/5 +
             nan_mse2(y_val[8:16,0,:], y_val_pred[8:16,:])*8/5 +
             nan_mse2(y_val[16:24,0,:], y_val_pred[16:24,:])*8/5 +
             nan_mse2(y_val[24:32,0,:], y_val_pred[24:32,:])*8/5 +
             nan_mse2(y_val[32:40,0,:], y_val_pred[32:40,:])*8/5)
          = (nan_mse2(y_val[0:16,0,:], y_val_pred[0:16,:])*16/5 +
             nan_mse2(y_val[16:32,0,:], y_val_pred[16:32,:])*16/5 +
             nan_mse2(y_val[32:40,0,:], y_val_pred[32:40,:])*8*8/40)
```
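The gap between the two computations comes from `nan_mse` averaging over the whole batch: when batches contain different numbers of valid (non-masked) entries, the mean of per-batch means differs from the mean over all entries, so the evaluated loss depends on the batch size. A minimal NumPy illustration with made-up data:

```python
import numpy as np

def nan_mse2(actual, predicted):
    actual2 = np.where(actual == -1, np.nan, actual)
    return np.nanmean(np.square(actual2 - predicted))

# toy data: row 0 has one valid entry, row 1 has two
actual = np.array([[1.0, -1.0],
                   [2.0,  3.0]])
pred   = np.array([[2.0,  0.0],
                   [2.0,  3.0]])

# one global mean over all 3 valid entries: (1 + 0 + 0) / 3
overall = nan_mse2(actual, pred)

# mean of two per-batch means: (1/1 + 0/2) / 2
per_batch = (nan_mse2(actual[0:1], pred[0:1]) +
             nan_mse2(actual[1:2], pred[1:2])) / 2

print(overall, per_batch)  # 0.333... vs 0.5
```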
Could my custom loss function be used to tune the batch size? How could I correct the custom loss function?
Thank you!
According to the Keras losses documentation, I might need to add `axis=-1` to `tf.math.reduce_sum`, so that the loss computes an MSE for each sample. The corrected custom loss function is:
```python
import tensorflow as tf

def nan_mse(y_true2, y_pred2):
    squared_difference = tf.where(tf.math.equal(y_true2, tf.constant(-1.)),
                                  tf.zeros_like(y_true2),
                                  tf.math.square(tf.math.subtract(y_true2, y_pred2)))
    nonnan = tf.math.not_equal(y_true2, tf.constant(-1.))
    # reduce over the last axis only, giving one masked MSE per sample;
    # Keras then averages these per-sample values over the batch
    num_nonnan = tf.math.reduce_sum(tf.cast(nonnan, tf.float32), axis=-1)
    return tf.math.divide(tf.math.reduce_sum(squared_difference, axis=-1), num_nonnan)
```
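On toy data, a NumPy mirror of the corrected reduction shows what the loss now returns per sample (the arrays are made up for illustration):

```python
import numpy as np

y_true = np.array([[1.0, -1.0, 3.0],
                   [2.0,  2.0, -1.0]])
y_pred = np.array([[1.5, 0.0, 2.0],
                   [2.0, 1.0, 9.0]])

# mirror of the corrected loss: per-sample sum of masked squared errors,
# divided by the per-sample count of valid (non -1) targets
sq = np.where(y_true == -1.0, 0.0, np.square(y_true - y_pred))
num_valid = np.sum(y_true != -1.0, axis=-1)
per_sample = np.sum(sq, axis=-1) / num_valid

print(per_sample)         # one masked MSE per sample: [0.625 0.5]
print(per_sample.mean())  # batch average of per-sample MSEs: 0.5625
```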
After running a test hyperparameter search, I executed:
```python
loss_val = best_model.evaluate(X_val, y_val, batch_size=16, steps=3)
loss_val_2 = best_model.evaluate(X_val, y_val, batch_size=8, steps=5)
loss_val_3 = best_model.evaluate(X_val, y_val)

# average of per-sample MSEs, computed one sample at a time
loss_val3 = 0.
for i in range(40):
    loss_val3 += nan_mse2(y_val[i:i+1,0,:], y_val_pred[i:i+1,:])
loss_val3 = loss_val3/40

print(loss_val)
print(loss_val_2)
print(loss_val_3)
print(loss_val3)
```
The results are:

```
0.01044310349971056
0.01044310349971056
0.01044310349971056
0.010443102978751995
```
The loss is now the average of the per-sample MSEs, independent of the batch size, so the problem seems to be corrected!