Why does TensorFlow skip the val_binary_accuracy check when I use a large batch size?


I train my TensorFlow model on many TFRecord files, and I build data_train and data_val as follows:

data       = get_dataset(trainFile, btsize*10)
data_train = data.take(btsize*8)
data_val   = data.skip(btsize*8)

I also set up a cp_callback to save the checkpoint with the highest val_binary_accuracy:

      cp_callback = tf.keras.callbacks.ModelCheckpoint(
          filepath=checkpoint_path,
          verbose=1,
          save_weights_only=False,
          monitor='val_binary_accuracy',
          mode='max',
          save_best_only=True
          )

      history=model.fit(data_train, epochs=nEpc, batch_size=btsize,
              shuffle=True,
              #class_weight=class_weight,
              callbacks=[cp_callback],
              validation_data=data_val,
              #validation_split=0.1,
              verbose=1)

Then I found that when I set btsize >= 256, TensorFlow does not output any validation results. Instead, it prints this WARNING message:

WARNING:tensorflow:Can save best model only with val_binary_accuracy available, skipping.

With btsize <= 128 everything works as expected. Why does this happen?
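
One possible cause, which I cannot confirm without seeing get_dataset: take(n) and skip(n) count dataset elements, and skip(n) returns an empty dataset once n reaches the number of elements. If the dataset does not grow in step with btsize, skip(btsize*8) could leave data_val empty for large btsize, so no val_binary_accuracy would be logged. The sketch below is a plain-Python model of that counting rule, not TensorFlow itself. The helper names and the element count of 1500 are hypothetical and chosen only for illustration.

```python
from itertools import islice


def take(elements, count):
    """Model of Dataset.take: keep at most `count` leading elements."""
    return list(islice(elements, count))


def skip(elements, count):
    """Model of Dataset.skip: drop the first `count` elements, keep the rest."""
    return list(islice(elements, count, None))


def split_sizes(num_elements, btsize):
    """Return (train size, validation size) for the take/skip split in the question."""
    elements = range(num_elements)
    train = take(elements, btsize * 8)
    val = skip(elements, btsize * 8)
    return len(train), len(val)


# Hypothetical assumption: the dataset yields 1500 elements regardless of btsize.
NUM_ELEMENTS = 1500

for btsize in (128, 256):
    n_train, n_val = split_sizes(NUM_ELEMENTS, btsize)
    print(f"btsize={btsize}: train={n_train}, val={n_val}")
```

Under that assumption, btsize=128 leaves 476 elements for validation, while btsize=256 leaves none. In the real pipeline, counting the elements with something like sum(1 for _ in data_val) before calling fit should show whether the validation set is actually empty. Note also that if get_dataset already batches the data, take and skip count batches rather than individual examples.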
