Very low loss but low accuracy in a Keras RNN


More of a theoretical question than anything. If I have near-zero cross-entropy loss in a binary classification model whose last layer is a softmax and whose first layer is an LSTM, does it make sense that accuracy tops out at 54% on the training set? I would have assumed the model would overfit the data: with such a low loss, I would expect an extremely overfit function with high accuracy.

I have also tried different learning rates (0.01, 0.001, 0.0001), all with exactly the same result. I also added a second LSTM layer under the first to increase model complexity and overfit on purpose, but that didn't change anything either.

What theoretical concept am I missing?
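For context on why this combination looks contradictory: categorical cross-entropy heavily penalizes confident wrong predictions, so a near-zero *average* loss should be impossible when 46% of predictions are wrong. A minimal numpy sketch of the loss (my own illustration, separate from the model below; the clipping epsilon is illustrative):

```python
import numpy as np

# Categorical cross-entropy for a single prediction, with the usual
# clipping so log(0) never occurs (epsilon value is illustrative).
def xent(y_true, y_pred, eps=1e-7):
    y_pred = np.clip(y_pred, eps, 1 - eps)
    return float(-np.sum(y_true * np.log(y_pred)))

print(xent(np.array([0.0, 1.0]), np.array([0.01, 0.99])))  # confident & right: ~0.01
print(xent(np.array([0.0, 1.0]), np.array([0.99, 0.01])))  # confident & wrong: ~4.6
```

So if even a modest fraction of timesteps were confidently misclassified, the average loss would be far above zero.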

import numpy as np
import keras
from keras.models import Sequential
from keras.layers import LSTM, Dense, Dropout, Activation
from keras.callbacks import ModelCheckpoint

model = Sequential()
model.add(LSTM(64, input_shape=(100000, 26), return_sequences=True, activation='relu'))
model.add(Dropout(0.3))
model.add(LSTM(32, return_sequences=True, activation='relu'))
model.add(Dense(2))
model.add(Activation('softmax'))  # applied per timestep, since return_sequences=True
opt1 = keras.optimizers.RMSprop(lr=0.01, rho=0.9, epsilon=1e-08, decay=0.0)
model.compile(loss='categorical_crossentropy', optimizer=opt1, metrics=['accuracy'])

filepath = "model_check-{epoch:02d}-{loss:.4f}.hdf5"
checkpoint = ModelCheckpoint(filepath, monitor='loss', verbose=1,
                             save_best_only=True, mode='min')
callbacks_list = [checkpoint]

model.fit(df_matrix2, y_matrix2, epochs=5, batch_size=2, verbose=2,
          callbacks=callbacks_list)

And here is how I set up the matrices at the beginning. There is some redundancy because I was testing other things earlier.

df_matrix = df_model.as_matrix()                     # pandas DataFrame -> numpy array
df_matrix = np.reshape(df_matrix, (-1, 588425, 26))  # one long sequence of 588425 timesteps
y_matrix = y.as_matrix()
y_matrix = np.reshape(y_matrix, (-1, 588425, 1))
df_matrix2 = df_matrix[:, 0:100000, :]               # keep only the first 100000 timesteps
df_matrix2 = np.reshape(df_matrix2, (-1, 100000, 26))
y_matrix2 = y_matrix[:, 0:100000, :]
y_matrix2 = np.reshape(y_matrix2, (-1, 100000, 1))
y_matrix2 = keras.utils.np_utils.to_categorical(y_matrix2, 2)  # 0/1 -> one-hot pairs
y_matrix2 = np.reshape(y_matrix2, (-1, 100000, 2))
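As a sanity check on those reshapes, here is the same pipeline on toy data (my own sketch; sizes shrunk from 100000 timesteps to 10, random values instead of stock data). Note that if `df_model` has exactly 588425 rows, the leading `-1` comes out as 1, i.e. the whole dataset becomes a batch containing a single sequence:

```python
import numpy as np

# Toy stand-in for the reshaping above (all sizes hypothetical)
n_steps, n_feat = 10, 26
df_matrix = np.random.rand(n_steps, n_feat)
df_matrix = np.reshape(df_matrix, (-1, n_steps, n_feat))

y = np.random.randint(0, 2, size=(n_steps, 1))
y_onehot = np.eye(2)[y.ravel()]                  # same effect as to_categorical(y, 2)
y_onehot = np.reshape(y_onehot, (-1, n_steps, 2))

print(df_matrix.shape)   # (1, 10, 26) -- a single sequence in the batch
print(y_onehot.shape)    # (1, 10, 2)  -- one one-hot label per timestep
```

With only one sequence in the batch dimension, `batch_size=2` in `fit` never actually sees two independent samples.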

This is stock data; the label is just a 0 or 1 depending on whether the price is higher or lower 60 minutes later, so there is a lot of randomness to begin with. I just assumed the LSTM would overfit and I'd get a high training accuracy.

Epoch 1/5
194s - loss: 0.0571 - acc: 0.5436
Epoch 2/5
193s - loss: 1.1921e-07 - acc: 0.5440
Epoch 3/5
192s - loss: 1.1921e-07 - acc: 0.5440
Epoch 4/5

Those are my losses and accuracy.
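One detail worth noting (my own observation, not from the original post): the plateau value 1.1921e-07 is exactly float32 machine epsilon, the smallest loss representable once the clipped predicted probability of the labeled class saturates at 1.0. A loss pinned at that floor alongside 54% accuracy suggests the loss has bottomed out numerically rather than the model having genuinely fit the labels:

```python
import numpy as np

# 1.1921e-07 is float32 machine epsilon: the reported loss has hit the
# representable floor, not a meaningful minimum.
print(np.finfo(np.float32).eps)            # 1.1920929e-07

# -log(p) at the largest float32 probability below 1.0 gives the same floor
p_max = np.float32(1.0) - np.finfo(np.float32).eps
print(-np.log(np.float64(p_max)))          # ~1.192e-07
```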


There is 1 best solution below


Note that 50% accuracy on a balanced binary problem means no accuracy at all: it is as good as a coin toss. Generally speaking, an accuracy around 54% means the predictions are essentially random. It is very hard for a neural network to land exactly on coin-toss performance, which is where the extra 4% comes from. At least in Keras with the TensorFlow backend, that is how binary prediction tends to behave.

Getting this result is a clear indicator that your model is not working. One of the most common reasons is a problem with your data (either the features or the target variable). It will be easier for people to help you if you post your model, the code you use to transform your data, and the head of your data.