Validation accuracy is getting lower than the training accuracy


This model is still training, and the validation accuracy is staying lower than the training accuracy. Does this indicate overfitting? How can I overcome it? I have used the MobileNet model. Can I fix it by reducing the learning rate?

Epoch 10/50
6539/6539 [==============================] - 3386s 518ms/step - loss: 0.8198 - accuracy: 0.7470 - top3_acc: 0.9199 - top5_acc: 0.9645 - val_loss: 1.0399 - val_accuracy: 0.6940 - val_top3_acc: 0.8842 - val_top5_acc: 0.9406
Epoch 11/50
6539/6539 [==============================] - 3377s 516ms/step - loss: 0.7939 - accuracy: 0.7558 - top3_acc: 0.9248 - top5_acc: 0.9669 - val_loss: 1.0379 - val_accuracy: 0.6953 - val_top3_acc: 0.8844 - val_top5_acc: 0.9411
Epoch 12/50
6539/6539 [==============================] - 3386s 518ms/step - loss: 0.7593 - accuracy: 0.7644 - top3_acc: 0.9304 - top5_acc: 0.9702 - val_loss: 1.0454 - val_accuracy: 0.6953 - val_top3_acc: 0.8831 - val_top5_acc: 0.9410
Epoch 13/50
6539/6539 [==============================] - 3394s 519ms/step - loss: 0.7365 - accuracy: 0.7735 - top3_acc: 0.9340 - top5_acc: 0.9713 - val_loss: 1.0476 - val_accuracy: 0.6938 - val_top3_acc: 0.8856 - val_top5_acc: 0.9411
Epoch 14/50
6539/6539 [==============================] - 3386s 518ms/step - loss: 0.7049 - accuracy: 0.7824 - top3_acc: 0.9387 - top5_acc: 0.9739 - val_loss: 1.0561 - val_accuracy: 0.6935 - val_top3_acc: 0.8841 - val_top5_acc: 0.9398
Epoch 15/50
6539/6539 [==============================] - 3390s 518ms/step - loss: 0.6801 - accuracy: 0.7901 - top3_acc: 0.9421 - top5_acc: 0.9755 - val_loss: 1.0673 - val_accuracy: 0.6923 - val_top3_acc: 0.8828 - val_top5_acc: 0.9391
Epoch 16/50
6539/6539 [==============================] - 3635s 556ms/step - loss: 0.6516 - accuracy: 0.7991 - top3_acc: 0.9462 - top5_acc: 0.9772 - val_loss: 1.0747 - val_accuracy: 0.6905 - val_top3_acc: 0.8825 - val_top5_acc: 0.9388
Epoch 17/50
6539/6539 [==============================] - 4070s 622ms/step - loss: 0.6200 - accuracy: 0.8082 - top3_acc: 0.9502 - top5_acc: 0.9805 - val_loss: 1.0859 - val_accuracy: 0.6883 - val_top3_acc: 0.8814 - val_top5_acc: 0.9373
Epoch 18/50
6539/6539 [==============================] - 4092s 626ms/step - loss: 0.5896 - accuracy: 0.8182 - top3_acc: 0.9550 - top5_acc: 0.9822 - val_loss: 1.1029 - val_accuracy: 0.6849 - val_top3_acc: 0.8788 - val_top5_acc: 0.9367
Epoch 19/50
6539/6539 [==============================] - 4087s 625ms/step - loss: 0.5595 - accuracy: 0.8291 - top3_acc: 0.9589 - top5_acc: 0.9834 - val_loss: 1.1147 - val_accuracy: 0.6872 - val_top3_acc: 0.8797 - val_top5_acc: 0.9367
Epoch 20/50
6539/6539 [==============================] - 4015s 614ms/step - loss: 0.5361 - accuracy: 0.8367 - top3_acc: 0.9617 - top5_acc: 0.9852 - val_loss: 1.1325 - val_accuracy: 0.6833 - val_top3_acc: 0.8773 - val_top5_acc: 0.9361
Epoch 21/50
6539/6539 [==============================] - 4093s 626ms/step - loss: 0.5023 - accuracy: 0.8472 - top3_acc: 0.9661 - top5_acc: 0.9870 - val_loss: 1.1484 - val_accuracy: 0.6844 - val_top3_acc: 0.8773 - val_top5_acc: 0.9363
Epoch 22/50
6539/6539 [==============================] - 4094s 626ms/step - loss: 0.4691 - accuracy: 0.8570 - top3_acc: 0.9703 - top5_acc: 0.9892 - val_loss: 1.1730 - val_accuracy: 0.6802 - val_top3_acc: 0.8765 - val_top5_acc: 0.9337
Epoch 23/50
6539/6539 [==============================] - 4091s 626ms/step - loss: 0.4387 - accuracy: 0.8676 - top3_acc: 0.9737 - top5_acc: 0.9904 - val_loss: 1.1986 - val_accuracy: 0.6774 - val_top3_acc: 0.8735 - val_top5_acc: 0.9320
Epoch 24/50
6539/6539 [==============================] - 4033s 617ms/step - loss: 0.4122 - accuracy: 0.8752 - top3_acc: 0.9764 - top5_acc: 0.9915 - val_loss: 1.2157 - val_accuracy: 0.6782 - val_top3_acc: 0.8755 - val_top5_acc: 0.9322
Epoch 25/50
6539/6539 [==============================] - 4105s 628ms/step - loss: 0.3838 - accuracy: 0.8861 - top3_acc: 0.9794 - top5_acc: 0.9927 - val_loss: 1.2419 - val_accuracy: 0.6746 - val_top3_acc: 0.8711 - val_top5_acc: 0.9309
Epoch 26/50
6539/6539 [==============================] - 4098s 627ms/step - loss: 0.3551 - accuracy: 0.8964 - top3_acc: 0.9824 - top5_acc: 0.9938 - val_loss: 1.2719 - val_accuracy: 0.6741 - val_top3_acc: 0.8722 - val_top5_acc: 0.9294
Epoch 27/50
6539/6539 [==============================] - 4101s 627ms/step - loss: 0.3266 - accuracy: 0.9051 - top3_acc: 0.9846 - top5_acc: 0.9950 - val_loss: 1.2877 - val_accuracy: 0.6723 - val_top3_acc: 0.8709 - val_top5_acc: 0.9288
Epoch 28/50
6539/6539 [==============================] - 4007s 613ms/step - loss: 0.3022 - accuracy: 0.9147 - top3_acc: 0.9866 - top5_acc: 0.9955 - val_loss: 1.3156 - val_accuracy: 0.6687 - val_top3_acc: 0.8667 - val_top5_acc: 0.9266
Epoch 29/50
6539/6539 [==============================] - 3410s 521ms/step - loss: 0.2797 - accuracy: 0.9208 - top3_acc: 0.9886 - top5_acc: 0.9962 - val_loss: 1.3409 - val_accuracy: 0.6712 - val_top3_acc: 0.8682 - val_top5_acc: 0.9270
Epoch 30/50
6539/6539 [==============================] - 3398s 520ms/step - loss: 0.2555 - accuracy: 0.9292 - top3_acc: 0.9907 - top5_acc: 0.9969 - val_loss: 1.3703 - val_accuracy: 0.6684 - val_top3_acc: 0.8661 - val_top5_acc: 0.9252
Epoch 31/50
6539/6539 [==============================] - 3401s 520ms/step - loss: 0.2365 - accuracy: 0.9358 - top3_acc: 0.9926 - top5_acc: 0.9975 - val_loss: 1.3945 - val_accuracy: 0.6660 - val_top3_acc: 0.8659 - val_top5_acc: 0.9270
Epoch 32/50
6539/6539 [==============================] - 3387s 518ms/step - loss: 0.2174 - accuracy: 0.9414 - top3_acc: 0.9934 - top5_acc: 0.9979 - val_loss: 1.4218 - val_accuracy: 0.6687 - val_top3_acc: 0.8650 - val_top5_acc: 0.9229
Epoch 33/50
6539/6539 [==============================] - 3397s 519ms/step - loss: 0.1986 - accuracy: 0.9478 - top3_acc: 0.9948 - top5_acc: 0.9983 - val_loss: 1.4513 - val_accuracy: 0.6641 - val_top3_acc: 0.8620 - val_top5_acc: 0.9217
Epoch 34/50
6539/6539 [==============================] - 3394s 519ms/step - loss: 0.1814 - accuracy: 0.9533 - top3_acc: 0.9956 - top5_acc: 0.9986 - val_loss: 1.4752 - val_accuracy: 0.6656 - val_top3_acc: 0.8612 - val_top5_acc: 0.9207

This is my code. I used the DeepFashion dataset with 209,222 training images, and the SGD optimizer with learning_rate=0.001.

import tensorflow as tf
from tensorflow.keras.layers import Reshape, BatchNormalization, Dense
from tensorflow.keras.models import Model

# Load MobileNet pre-trained on ImageNet.
mobile = tf.keras.applications.mobilenet.MobileNet(weights='imagenet')

# Take the 7x7x1024 feature map feeding the sixth-from-last layer.
x = mobile.layers[-6].input

# Insert a multi-head self-attention block over the feature map.
# (MultiHeadsAttModel is a custom helper defined elsewhere in my project.)
x = Reshape([7*7, 1024])(x)
att = MultiHeadsAttModel(l=7*7, d=1024, dv=64, dout=1024, nv=16)
x = att([x, x, x])
x = Reshape([7, 7, 1024])(x)
x = BatchNormalization()(x)

# Reattach MobileNet's original head on top of the attention block.
x = mobile.get_layer('global_average_pooling2d')(x)
x = mobile.get_layer('reshape_1')(x)
x = mobile.get_layer('dropout')(x)
x = mobile.get_layer('conv_preds')(x)
x = mobile.get_layer('reshape_2')(x)

# New 50-class softmax classifier.
output = Dense(units=50, activation='softmax')(x)

model = Model(inputs=mobile.input, outputs=output)
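
A sketch of the matching compile step, reconstructed from the optimizer settings and the metric names in the log above (the exact top-k metric construction is an assumption):

from tensorflow.keras.metrics import TopKCategoricalAccuracy
from tensorflow.keras.optimizers import SGD

model.compile(optimizer=SGD(learning_rate=0.001),
              loss='categorical_crossentropy',
              metrics=['accuracy',
                       # names chosen to match the log columns
                       TopKCategoricalAccuracy(k=3, name='top3_acc'),
                       TopKCategoricalAccuracy(k=5, name='top5_acc')])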

Accepted answer:

Your validation loss keeps increasing while the training loss gets smaller with each epoch. This is a classic case of overfitting.
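
One immediate, architecture-independent mitigation (not discussed further below, so take it as an extra suggestion) is early stopping: halt training once val_loss stops improving, which in your log happens around epoch 11. A minimal sketch using the standard Keras callback:

from tensorflow.keras.callbacks import EarlyStopping

early_stop = EarlyStopping(monitor='val_loss',        # watch the validation loss
                           patience=3,                # tolerate 3 epochs without improvement
                           restore_best_weights=True) # roll back to the best epoch

# Then pass it to training: model.fit(..., callbacks=[early_stop])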

I am not familiar with the MobileNet model, but it would help if you shared the architecture or a link to its details.

I can blindly suggest adding dropout layers (https://www.tensorflow.org/api_docs/python/tf/keras/layers/Dropout) to regularize your model (I am guessing you do not have dropout in the model). Honestly, I cannot see how changing the learning rate would help to overcome overfitting, so I do not advise that.
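
A minimal sketch of that suggestion, inserted just before the final classifier of the model above (the 0.5 rate is an arbitrary starting point, not a tuned value):

from tensorflow.keras.layers import Dropout, Dense

x = Dropout(0.5)(x)  # randomly zeroes 50% of activations during training only
output = Dense(units=50, activation='softmax')(x)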

Since you did not share much information about the dataset, I am not sure how big and diverse it is. In any case, if the dataset is relatively small, augmenting it gives you a better chance of reducing the overfitting.
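
A minimal augmentation sketch using Keras' ImageDataGenerator; the specific transform ranges are illustrative guesses, not tuned values:

from tensorflow.keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(rotation_range=15,      # small random rotations
                                   width_shift_range=0.1,  # random horizontal shifts
                                   height_shift_range=0.1, # random vertical shifts
                                   zoom_range=0.1,         # random zoom in/out
                                   horizontal_flip=True)   # left-right mirroring

# train_generator = train_datagen.flow_from_directory(...)
# model.fit(train_generator, ...)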