I have 15,000 datapoints and 5 classes. When I run this part of my code:
model.eval()
inputs, labels = next(iter(test_dataloader))
inputs, labels = inputs.to(device), labels.to(device)
outputs = model(inputs)
print("Predicted classes", outputs.argmax(-1))
print("Actual classes", labels)
I get the following output:
Predicted classes tensor([2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
2, 2, 2, 2, 2, 2, 2, 2])
Actual classes tensor([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0])
Each tensor has 32 items because my batch size is 32.
The labels 0, 1, 2, 3 and 4 correspond to 'healthy', 'mild npdr', 'moderate npdr', 'severe npdr' and 'pdr', respectively. I have 5382 data points in the healthy class, 2443 in mild, 5292 in moderate, 1049 in severe and 708 in pdr.
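To check that this isn't just one unlucky batch, I've also been tallying the predicted classes over the whole test set with a quick snippet along these lines (a rough sketch, not the exact code from my repo; it assumes `model`, `test_dataloader` and `device` are set up as above):

```python
import torch

# Count how often each of the 5 classes is predicted across the entire test set.
model.eval()
counts = torch.zeros(5, dtype=torch.long)
with torch.no_grad():
    for inputs, _ in test_dataloader:
        preds = model(inputs.to(device)).argmax(-1)
        counts += torch.bincount(preds.cpu(), minlength=5)
print("Predictions per class:", counts)
```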
I've also found that after a couple of epochs, the test accuracy gets stuck at exactly the same value:
>>> Epoch 1 train loss: 1.4067111531252503 train accuracy: 0.35078578031767377
>>> Epoch 1 test loss: 1.3799789259510655 test accuracy: 0.35126050420168065
Validation loss decreased (inf --> 1.379979). Saving model ...
epoch=1, learning rate=0.0010
>>> Epoch 2 train loss: 1.3790244847856543 train accuracy: 0.35750903437263637
>>> Epoch 2 test loss: 1.3944347340573546 test accuracy: 0.36168067226890754
EarlyStopping counter: 1 out of 3
epoch=2, learning rate=0.0010
>>> Epoch 3 train loss: 1.3741713842397094 train accuracy: 0.3529708378855366
>>> Epoch 3 test loss: 1.372046325796394 test accuracy: 0.36168067226890754
Validation loss decreased (1.379979 --> 1.372046). Saving model ...
epoch=3, learning rate=0.0010
>>> Epoch 4 train loss: 1.371311823206563 train accuracy: 0.3564165055887049
>>> Epoch 4 test loss: 1.367056119826532 test accuracy: 0.35126050420168065
Validation loss decreased (1.372046 --> 1.367056). Saving model ...
epoch=4, learning rate=0.0010
>>> Epoch 5 train loss: 1.369389453241902 train accuracy: 0.3569207496428271
>>> Epoch 5 test loss: 1.373937726020813 test accuracy: 0.36168067226890754
EarlyStopping counter: 1 out of 3
epoch=5, learning rate=0.0010
>>> Epoch 6 train loss: 1.3699962490348405 train accuracy: 0.36162702748130093
>>> Epoch 6 test loss: 1.3644213702089043 test accuracy: 0.36168067226890754
Validation loss decreased (1.367056 --> 1.364421). Saving model ...
epoch=6, learning rate=0.0010
>>> Epoch 7 train loss: 1.369478802847606 train accuracy: 0.3548197327506513
>>> Epoch 7 test loss: 1.3675817212750834 test accuracy: 0.36168067226890754
EarlyStopping counter: 1 out of 3
epoch=7, learning rate=0.0010
>>> Epoch 8 train loss: 1.3686860168492923 train accuracy: 0.35876964450794185
>>> Epoch 8 test loss: 1.3695218537443428 test accuracy: 0.36168067226890754
EarlyStopping counter: 2 out of 3
Epoch 00015: reducing learning rate of group 0 to 2.0000e-04.
epoch=8, learning rate=0.0002
>>> Epoch 9 train loss: 1.366326274730826 train accuracy: 0.3552399361290865
>>> Epoch 9 test loss: 1.3646431353784376 test accuracy: 0.36168067226890754
EarlyStopping counter: 3 out of 3
Early stopping
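For context, my training loop wires the scheduler and early stopping together roughly like this (a heavily simplified paraphrase, not the real code; `train_one_epoch`, `evaluate` and the `EarlyStopping` helper are stand-ins for what is actually in main.py, and the optimizer choice here is just an assumption):

```python
import torch

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)           # lr matches the log above
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer,
                                                       factor=0.2)  # 1e-3 -> 2e-4 as in the log
early_stopping = EarlyStopping(patience=3)  # my helper class; saves the best checkpoint

for epoch in range(1, num_epochs + 1):
    train_loss, train_acc = train_one_epoch(model, train_dataloader)  # placeholder helpers
    test_loss, test_acc = evaluate(model, test_dataloader)
    scheduler.step(test_loss)         # ReduceLROnPlateau steps on the validation loss
    early_stopping(test_loss, model)  # increments the counter when the loss doesn't improve
    if early_stopping.early_stop:
        print("Early stopping")
        break
```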
Furthermore, when I print my outputs:
print(outputs)
I get a matrix of shape (batch_size, num_classes) = (32, 5), and every row contains exactly the same numbers:
tensor([[ 0.4999, -0.3836, 0.4566, -1.7039, -1.3557],
[ 0.4999, -0.3836, 0.4566, -1.7039, -1.3557],
[ 0.4999, -0.3836, 0.4566, -1.7039, -1.3557],
[ 0.4999, -0.3836, 0.4566, -1.7039, -1.3557],
[ 0.4999, -0.3836, 0.4566, -1.7039, -1.3557],
[ 0.4999, -0.3836, 0.4566, -1.7039, -1.3557],
[ 0.4999, -0.3836, 0.4566, -1.7039, -1.3557],
[ 0.4999, -0.3836, 0.4566, -1.7039, -1.3557],
[ 0.4999, -0.3836, 0.4566, -1.7039, -1.3557],
[ 0.4999, -0.3836, 0.4566, -1.7039, -1.3557],
[ 0.4999, -0.3836, 0.4566, -1.7039, -1.3557],
[ 0.4999, -0.3836, 0.4566, -1.7039, -1.3557],
[ 0.4999, -0.3836, 0.4566, -1.7039, -1.3557],
[ 0.4999, -0.3836, 0.4566, -1.7039, -1.3557],
[ 0.4999, -0.3836, 0.4566, -1.7039, -1.3557],
[ 0.4999, -0.3836, 0.4566, -1.7039, -1.3557],
[ 0.4999, -0.3836, 0.4566, -1.7039, -1.3557],
[ 0.4999, -0.3836, 0.4566, -1.7039, -1.3557],
[ 0.4999, -0.3836, 0.4566, -1.7039, -1.3557],
[ 0.4999, -0.3836, 0.4566, -1.7039, -1.3557],
[ 0.4999, -0.3836, 0.4566, -1.7039, -1.3557],
[ 0.4999, -0.3836, 0.4566, -1.7039, -1.3557],
[ 0.4999, -0.3836, 0.4566, -1.7039, -1.3557],
[ 0.4999, -0.3836, 0.4566, -1.7039, -1.3557],
[ 0.4999, -0.3836, 0.4566, -1.7039, -1.3557],
[ 0.4999, -0.3836, 0.4566, -1.7039, -1.3557],
[ 0.4999, -0.3836, 0.4566, -1.7039, -1.3557],
[ 0.4999, -0.3836, 0.4566, -1.7039, -1.3557],
[ 0.4999, -0.3836, 0.4566, -1.7039, -1.3557],
[ 0.4999, -0.3836, 0.4566, -1.7039, -1.3557],
[ 0.4999, -0.3836, 0.4566, -1.7039, -1.3557],
[ 0.4999, -0.3836, 0.4566, -1.7039, -1.3557]],
grad_fn=<AddmmBackward0>)
When I print the outputs on a small subset of the data, the values in the matrix do vary somewhat. But after training for a couple of epochs, once the accuracy gets stuck, every row of the output matrix ends up with the same numbers.
This problem has been there since the start of the project. I thought it was an underfitting issue, which is why I added EarlyStopping and a learning rate scheduler, but that hasn't made much of a difference. My current suspicion is a bug in how I feed my inputs and labels, but I'm not certain yet.
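To narrow that down, I've been running a quick sanity check on a single test batch, since identical output rows could also mean the inputs themselves end up identical (or nearly constant) after my transforms. Something along these lines (a rough sketch, not code from the repo):

```python
# Sanity-check one test batch: are the transformed images actually different
# from each other, and are the labels varied?
inputs, labels = next(iter(test_dataloader))
print("batch shape / mean / std:", inputs.shape, inputs.mean().item(), inputs.std().item())
print("max abs difference between first two images:",
      (inputs[0] - inputs[1]).abs().max().item())
print("label counts in this batch:", labels.unique(return_counts=True))
```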
**The code for this can be found in my GitHub repository:** https://github.com/HydraulicSponge/VisionTransformer/blob/main/main.py
Any help with debugging would be appreciated. Please also let me know if there is anything wrong with my code in general, mostly the training/eval phases, the EarlyStopping and LR scheduler logic, and the ViT, Attention, PositionalEncoding and Patch Embedding classes.
I may also just be calculating the loss and accuracy incorrectly; roughly what I do is sketched below, but please check the actual code too.
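The per-epoch numbers are computed more or less like this (a simplified sketch of my approach, assuming cross-entropy on the raw logits; the function name and exact structure here are made up, the real version is in main.py):

```python
import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()

def evaluate(model, dataloader, device):
    """Return (mean loss, accuracy) over a dataloader -- sketch, not repo code."""
    model.eval()
    total_loss, correct, total = 0.0, 0, 0
    with torch.no_grad():
        for inputs, labels in dataloader:
            inputs, labels = inputs.to(device), labels.to(device)
            outputs = model(inputs)  # raw logits, shape (batch_size, 5)
            total_loss += criterion(outputs, labels).item() * labels.size(0)
            correct += (outputs.argmax(-1) == labels).sum().item()
            total += labels.size(0)
    return total_loss / total, correct / total
```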
By the way, this is how my folders are structured:
- root_dir
  - data
    - training_data
      - pdr
      - severe npdr
      - mild npdr
      - moderate npdr
      - healthy
    - testing_data
      - pdr
      - severe npdr
      - mild npdr
      - moderate npdr
      - healthy
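Since the class labels come straight from these folder names, one thing I still want to rule out is the folder-to-index mapping being different from the 0-4 order I listed above. Assuming an ImageFolder-style dataset (my actual loader is in the repo), this is the kind of check I have in mind:

```python
from torchvision import datasets

# Print the folder-name -> class-index mapping the dataset actually uses
# (paths follow my layout above; adjust as needed).
print(datasets.ImageFolder("root_dir/data/training_data").class_to_idx)
print(datasets.ImageFolder("root_dir/data/testing_data").class_to_idx)
```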
This is the dataset I used for training:
https://www.kaggle.com/datasets/amanneo/diabetic-retinopathy-resized-arranged
I made sure to delete a lot of the "healthy" class images, as there were too many datapoints in the class in comparison to the other classes.
For testing, I used this dataset:
https://www.kaggle.com/competitions/aptos2019-blindness-detection/data