I am having a problem implementing dropout as a regularization method in my dense NN model. It appears that adding a dropout value above 0 just scales down the predicted value, in a way makes me think something is not being accounted for correctly after individual weights are being set to zero. I'm sure I am implementing something incorrectly, but I can't seem to figure out what.
The code to build this model was taken directly from a tensorflow page (https://www.tensorflow.org/tutorials/keras/overfit_and_underfit), but occurs no matter what architecture I use to build the model.
model = tf.keras.Sequential([
layers.Dense(512, activation='relu', input_shape=[len(X_train[0])]),
layers.Dropout(0.5),
layers.Dense(512, activation='relu'),
layers.Dropout(0.5),
layers.Dense(512, activation='relu'),
layers.Dropout(0.5),
layers.Dense(512, activation='relu'),
layers.Dropout(0.5),
layers.Dense(1)
])
Any help would be much appreciated!
It's perfectly normal to decrease accuracy in
training set
when adding Dropout. You usually do this as a trade-off to increase accuracy in unseen data (test set) and thus, generalization properties.However, try to decrease Dropout rate to
0.10
or0.20
. You will get better results. Also, unless you are dealing with hundreds of millions of examples, try to decrease the neurons from your neural net, like from 512 to 128. With a complex neural net the backpropagation gradients won't reach an optimum level. With a neural net that is too simple, the gradients will saturate and won't learn, either.Other point, you may want to apply
pd.get_dummies
to your output (Y) and increase last layer toDense(2)
and normalize input data.