What happens if my dropout is too high? What dropout rate should I use on my 2048-neuron dense layer (with very little data)?


I am pretty new to this and am writing my bachelor thesis in Keras. I have a big CNN, built similarly to VGG but a bit different, because my images have a higher resolution and I pool a little more. I added a 2048-unit dense layer on top. What dropout rate should I use? I want to go with a high rate, since I have very little data (see below) and I added many neurons. But what happens when it is too high?

I am asking because I have limited time and the network takes about 3 days to train. If anyone has answers or tips of any kind, I'd be very grateful. Any other recommendations or suggestions on what to change, or what has worked for you, are also very welcome.

Thanks in advance! Here's how I build my model:

# Imports assume the Keras bundled with TensorFlow 2.x
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout

# input_shape (height, width, channels) is defined elsewhere in the script
model = Sequential()

# Two single-convolution blocks, each followed by 2x2 max pooling
model.add(Conv2D(64, (3, 3), strides=1, activation='swish', input_shape=input_shape, trainable=True))
model.add(MaxPooling2D((2, 2), name='pool0'))
model.add(Conv2D(64, (3, 3), strides=1, activation='swish', trainable=True))
model.add(MaxPooling2D((2, 2), name='pool1'))

# VGG-style block with 128 filters
model.add(Conv2D(128, (3, 3), strides=1, activation='swish', trainable=True))
model.add(Conv2D(128, (3, 3), strides=1, activation='swish', trainable=True))
model.add(MaxPooling2D((2, 2), name='pool2'))

# VGG-style block with 256 filters
model.add(Conv2D(256, (3, 3), strides=1, activation='swish', trainable=True))
model.add(Conv2D(256, (3, 3), strides=1, activation='swish', trainable=True))
model.add(Conv2D(256, (3, 3), strides=1, activation='swish', trainable=True))
model.add(MaxPooling2D((2, 2), name='pool3'))

# Two VGG-style blocks with 512 filters
model.add(Conv2D(512, (3, 3), strides=1, activation='swish', trainable=True))
model.add(Conv2D(512, (3, 3), strides=1, activation='swish', trainable=True))
model.add(Conv2D(512, (3, 3), strides=1, activation='swish', trainable=True))
model.add(MaxPooling2D((2, 2), name='pool4'))

model.add(Conv2D(512, (3, 3), strides=1, activation='swish', trainable=True))
model.add(Conv2D(512, (3, 3), strides=1, activation='swish', trainable=True))
model.add(Conv2D(512, (3, 3), strides=1, activation='swish', trainable=True))
model.add(MaxPooling2D((2, 2), name='pool5'))

# Classifier head: 2048-unit dense layer, dropout, 17-class softmax
model.add(Flatten())
model.add(Dense(2048, activation='swish', name='vgg_int'))
model.add(Dropout(0.65))
model.add(Dense(17, activation='softmax'))

I should also add that I have very little data to train on, which is why I want the high dropout. I have around 100 images per class, sometimes only 60, sometimes 200:


Found 1807 images belonging to 17 classes.
Found 170 images belonging to 17 classes.

I am confident it can get above 90% on the validation set, but I don't really know the best way to get there. What happens if I go to 90% dropout? I am currently running 60%, but with a smaller model that has only 1024 neurons in the top layer:

Epoch 19/50
226/226 [==============================] - 4966s 22s/step - loss: 0.5661 - accuracy: 0.8307 - val_loss: 0.5752 - val_accuracy: 0.8412
Epoch 20/50
226/226 [==============================] - 4157s 18s/step - loss: 0.5511 - accuracy: 0.8329 - val_loss: 0.5042 - val_accuracy: 0.8647

I am running with batch_size = 8 and optimizer=optimizers.Adam(learning_rate=0.0000015).

Again, thanks a lot!

Best answer:

Dropout is used to prevent the model from overfitting, so I can understand why you want a high rate with such a small dataset. But a rate that is too high is detrimental: most of the layer's units are zeroed on every training step (at 0.9, only about 10% of your 2048 units stay active), so the model underfits and learns slowly. Since you have a validation set, use it to check whether your model is actually overfitting. A large gap between training accuracy and validation accuracy is the sign to look for, and you can stop training when it appears. I recommend you start with a dropout rate of 0.5 and increase it gradually only if that gap shows the model is still overfitting.
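
To make the effect of the rate concrete, here is a minimal sketch in plain Python (not Keras code) of what training-time dropout does to a 2048-unit layer at rates 0.5, 0.65 and 0.9. It assumes the "inverted dropout" scheme that the Keras Dropout layer documents, where surviving units are scaled by 1 / (1 - rate). The helper names inverted_dropout and summarize, and the all-ones activations, are hypothetical and only there for illustration.

```python
import random


def inverted_dropout(activations, rate, rng):
    """Apply training-time inverted dropout to a list of activations.

    Each unit is zeroed with probability `rate`; survivors are scaled by
    1 / (1 - rate) so the expected sum of the layer stays the same.
    """
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    scale = 1.0 / (1.0 - rate)
    return [a * scale if rng.random() >= rate else 0.0 for a in activations]


def summarize(rate, units=2048, seed=0):
    """Return (surviving units, scale factor, mean output) for one pass."""
    rng = random.Random(seed)
    activations = [1.0] * units  # hypothetical all-ones layer output
    dropped = inverted_dropout(activations, rate, rng)
    kept = sum(1 for v in dropped if v != 0.0)
    return kept, 1.0 / (1.0 - rate), sum(dropped) / units


if __name__ == "__main__":
    for rate in (0.5, 0.65, 0.9):
        kept, scale, mean = summarize(rate)
        print(f"rate={rate}: {kept}/2048 units kept, "
              f"scale={scale:.2f}, mean output={mean:.3f}")
```

At a rate of 0.9, only roughly a tenth of the units carry signal on any given step, and each survivor is multiplied by 10. The mean output is preserved on average, but it comes from far fewer units, so the signal reaching the softmax layer is much noisier, which is one way to see why very high rates tend to slow learning down.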