I am training a model for medical image segmentation. The loss function is cross-entropy, and I use Dice and IoU as the primary evaluation metrics.
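For concreteness, here is a minimal NumPy sketch of how Dice and IoU can be computed for a single binary mask. The function name `dice_and_iou` and the `eps` smoothing term are illustrative assumptions, not part of any particular library, and a real pipeline may average per image or per class differently.

```python
import numpy as np


def dice_and_iou(pred, target, eps=1e-7):
    """Return (dice, iou) for two binary masks of the same shape.

    `eps` is an assumed smoothing term that avoids division by zero
    when both masks are empty.
    """
    pred = np.asarray(pred).astype(bool)
    target = np.asarray(target).astype(bool)

    intersection = np.logical_and(pred, target).sum()
    pred_sum = pred.sum()
    target_sum = target.sum()
    union = pred_sum + target_sum - intersection

    # Dice = 2|A ∩ B| / (|A| + |B|), IoU = |A ∩ B| / |A ∪ B|
    dice = (2.0 * intersection + eps) / (pred_sum + target_sum + eps)
    iou = (intersection + eps) / (union + eps)
    return float(dice), float(iou)
```

Both metrics depend only on the overlap between the thresholded prediction and the ground truth, whereas cross-entropy also depends on how confident each per-pixel probability is.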
I am unsure which metric should be monitored for early stopping.
Most of the answers I found online recommend accuracy or loss. In my experience, however, the model selected by validation Dice performed much better than the one selected by validation loss.
Here are some validation metrics. Although the validation loss curve seems to indicate that the model overfitted, the model selected by early stopping on Dice actually performed better than the one selected by early stopping on loss.
Results on the test set:
| Early stopping on | Accuracy | Sensitivity | Specificity | IoU | Dice |
|---|---|---|---|---|---|
| loss | 0.916 | 0.782 | 0.956 | 0.678 | 0.784 |
| dice | 0.920 | 0.804 | 0.968 | 0.719 | 0.814 |
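
The two selection rules compared above can be sketched as follows. This is a framework-agnostic sketch, not my actual training code: the `EarlyStopping` class, the `select_epoch` helper, the `patience` value, and the two validation curves are all hypothetical and made up purely for illustration.

```python
class EarlyStopping:
    """Track one monitored value and signal when it stops improving.

    mode="min" suits a loss, mode="max" suits Dice or IoU.
    """

    def __init__(self, patience=10, mode="max", min_delta=0.0):
        if mode not in ("min", "max"):
            raise ValueError("mode must be 'min' or 'max'")
        self.patience = patience
        self.mode = mode
        self.min_delta = min_delta
        self.best = None
        self.best_epoch = None
        self.bad_epochs = 0

    def step(self, value, epoch):
        """Record this epoch's value; return True if training should stop."""
        if self.best is None:
            improved = True
        elif self.mode == "max":
            improved = value > self.best + self.min_delta
        else:
            improved = value < self.best - self.min_delta

        if improved:
            # In a real loop, the checkpoint would be saved here.
            self.best = value
            self.best_epoch = epoch
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


def select_epoch(values, mode, patience):
    """Return the epoch whose checkpoint early stopping would keep."""
    stopper = EarlyStopping(patience=patience, mode=mode)
    for epoch, value in enumerate(values):
        if stopper.step(value, epoch):
            break
    return stopper.best_epoch


# Made-up curves (NOT my real results): the loss starts rising after
# epoch 2 while Dice keeps improving until epoch 4.
val_loss = [0.50, 0.40, 0.35, 0.37, 0.40, 0.44]
val_dice = [0.60, 0.70, 0.74, 0.76, 0.78, 0.77]

print(select_epoch(val_loss, mode="min", patience=2))
print(select_epoch(val_dice, mode="max", patience=2))
```

With curves shaped like these, the two rules keep different checkpoints, which is the situation I am asking about.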