Keras. EfficientNetV2 doesn't converge while EfficientNet does


Using transfer learning with EfficientNet (B4) for image classification yielded decent results. Running the same setup with V2 gets stuck and the model does not learn.

Any idea what should be done to solve it?

Thanks

This converges just fine starting from epoch 1:

import tensorflow as tf

# EfficientNet (V1) B4 backbone without the classification head
efficientnetB4 = tf.keras.applications.EfficientNetB4(
    input_shape=(224, 224, 3),
    include_top=False,
    weights='imagenet',
    pooling=None
)

This gets stuck with no accuracy improvement for several epochs:

efficientnetV2S = tf.keras.applications.EfficientNetV2S(
    input_shape=(224, 224, 3),
    include_top=False,
    weights='imagenet',
    pooling=None
) 

There are 3 best solutions below


It appears that reducing the initial learning rate from 1e-3 to 1e-4 solves the problem. Training starts converging from epoch 1.
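A minimal sketch of what the lower learning rate could look like in code. The question does not show the head, the optimizer, or the loss, so the `Dense` head, `num_classes`, the use of Adam, the loss, and the helper name `build_model` are all assumptions made for illustration:

```python
import tensorflow as tf


def build_model(num_classes, learning_rate=1e-4, weights="imagenet"):
    """Hypothetical helper: EfficientNetV2S backbone plus a small head."""
    backbone = tf.keras.applications.EfficientNetV2S(
        input_shape=(224, 224, 3),
        include_top=False,
        weights=weights,
        pooling="avg",  # assumption: global average pooling before the head
    )

    # Assumed classification head; the question does not show one.
    outputs = tf.keras.layers.Dense(num_classes, activation="softmax")(
        backbone.output
    )
    model = tf.keras.Model(backbone.input, outputs)

    # The change described in this answer: start at 1e-4 instead of 1e-3.
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=learning_rate),
        loss="sparse_categorical_crossentropy",  # assumes integer labels
        metrics=["accuracy"],
    )
    return model
```

Calling `build_model(num_classes=10)` would give a compiled model ready for `model.fit(...)`; the value 10 is only a placeholder for your own number of classes.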


I can't comment yet, so I just wanted to add to Omid's answer: at least for EfficientNet (not V2), the input is preprocessed automatically, so the images have to be in the range [0, 255]. The images also get rescaled automatically. For EfficientNetV2, preprocessing can be disabled with include_preprocessing=False; the input images should then be in the range [-1, 1] and also scaled appropriately. Also, EfficientNetB4 exists as a variant of V1, but not of V2.

Edit: I was wrong about the image rescaling. EfficientNet only scales the pixel values from [0, 255] to its appropriate range and normalizes them. You have to set the image resolution manually. For EfficientNet (V1), you can find a table of resolutions at https://keras.io/examples/vision/image_classification_efficientnet_fine_tuning/
As of now, I can't find such a table for EfficientNetV2. In the paper (https://arxiv.org/pdf/2104.00298.pdf) the authors mention a maximum image size of 480 and train with different image sizes to compare accuracy. There is also a section about training with progressively larger images as a form of regularization, with resolutions ranging from 128-300 (EfficientNetV2-S) to 128-380 (EfficientNetV2-M/L).
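To illustrate the two input ranges mentioned in this answer, here is a small sketch of mapping raw [0, 255] pixels to [-1, 1], which would only be needed when include_preprocessing=False is used. The helper name `to_signed_unit_range` is hypothetical, and the simple `x / 127.5 - 1` mapping is an assumption; the constants Keras uses in its built-in rescaling layer may differ slightly:

```python
import numpy as np


def to_signed_unit_range(images):
    """Hypothetical helper: map pixel values from [0, 255] to [-1, 1]."""
    images = np.asarray(images, dtype=np.float32)
    # 0 maps to -1, 127.5 maps to 0, 255 maps to 1.
    return images / 127.5 - 1.0


batch = np.array([[0.0, 127.5, 255.0]], dtype=np.float32)
print(to_signed_unit_range(batch))
```

With the default include_preprocessing=True this step should be skipped, because the model then expects raw [0, 255] values and rescales them itself.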


First of all, I couldn't find a B4 variant of EfficientNetV2; use B3 or another variant. Also, you definitely picked the wrong input shape: (224, 224, 3) is appropriate for EfficientNetV2B0, not B4! Look up the recommended input shape for the EfficientNetV2 variant you use. In addition, check your data. Every record (every pixel, if you are working with images) must be normalized; check the Keras and TensorFlow documentation about it. I think the network needs your pixels normalized to between -1 and +1 instead of 0 and 255. I think the easiest way to get this normalization is include_preprocessing=True, which does it automatically as part of EfficientNetV2. Check this out: https://www.tensorflow.org/api_docs/python/tf/keras/applications/efficientnet_v2/EfficientNetV2B3 I hope you can solve it with these two tips. But your explanation is really brief, so maybe you have some other mistakes ...
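One way to act on the "check your data" advice is to look at the value range of a batch before training. The sketch below is only a heuristic, and the helper name `describe_pixel_range` is hypothetical; it guesses which input convention a batch follows from its minimum and maximum:

```python
import numpy as np


def describe_pixel_range(batch):
    """Hypothetical helper: guess the input convention of an image batch."""
    batch = np.asarray(batch, dtype=np.float32)
    lo, hi = float(batch.min()), float(batch.max())
    if lo >= 0.0 and 1.0 < hi <= 255.0:
        return "[0, 255]"  # raw pixels, fits include_preprocessing=True
    if lo < 0.0 and lo >= -1.0 and hi <= 1.0:
        return "[-1, 1]"   # fits include_preprocessing=False
    if lo >= 0.0 and hi <= 1.0:
        return "[0, 1]"    # e.g. divided by 255; fits neither setting
    return "unknown"


print(describe_pixel_range([[0, 128, 255]]))
print(describe_pixel_range([[-1.0, 0.0, 1.0]]))
```

A very dark raw image could be misreported as "[0, 1]", so it is safer to run the check on a whole batch than on a single image.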