I need to confirm my understanding of the threshold-moving process for finding the lowest misclassification cost (binary classification) on an imbalanced dataset.
- Split data into train and test.
- Fit the model on train data set.
- Obtain the predicted probabilities for the train data.
- Perform threshold moving to find the threshold that gives the lowest misclassification cost, and compute the confusion matrix.
- With the selected threshold, predict classes from the test data probabilities and compute the test cost.
- Repeat steps 1 to 5 for each of the n folds and compute the average test cost.
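
A minimal sketch of the threshold search and test-cost steps for a single split, in plain Python. It assumes the model is already fitted and its predicted probabilities are available as lists. The cost values `COST_FP` and `COST_FN`, the 0.01 to 0.99 candidate grid, and the toy data are all hypothetical placeholders, not part of my actual setup.

```python
# Hypothetical misclassification costs; replace with the real ones.
COST_FP = 1.0  # predicted positive, true class is negative
COST_FN = 5.0  # predicted negative, true class is positive


def total_cost(y_true, y_prob, threshold, cost_fp=COST_FP, cost_fn=COST_FN):
    """Total cost when a sample is labelled positive if y_prob >= threshold."""
    fp = sum(1 for y, p in zip(y_true, y_prob) if y == 0 and p >= threshold)
    fn = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p < threshold)
    return cost_fp * fp + cost_fn * fn


def best_threshold(y_true, y_prob, candidates=None):
    """Return (threshold, cost) with the lowest cost among the candidates.

    Ties are resolved in favour of the first (smallest) candidate.
    """
    if candidates is None:
        candidates = [i / 100 for i in range(1, 100)]  # assumed grid
    scored = ((t, total_cost(y_true, y_prob, t)) for t in candidates)
    return min(scored, key=lambda pair: pair[1])


# Toy labels and predicted probabilities (made up for illustration).
y_train = [0, 0, 0, 0, 0, 0, 1, 1]
p_train = [0.05, 0.10, 0.20, 0.30, 0.35, 0.60, 0.40, 0.80]
y_test = [0, 0, 0, 1]
p_test = [0.10, 0.45, 0.30, 0.50]

# Choose the threshold on the train probabilities only.
threshold, train_cost = best_threshold(y_train, p_train)

# Apply the chosen threshold, unchanged, to the test probabilities.
test_cost = total_cost(y_test, p_test, threshold)

print(f"threshold={threshold:.2f} train_cost={train_cost} test_cost={test_cost}")
```

The point of the sketch is that the threshold is selected using train probabilities only, and the test data is touched once, to compute the cost at that fixed threshold.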
Can somebody please confirm that this is the right way to do threshold moving?
Thanks!
Edit: When I cross-validated with 5 folds, I noticed that the threshold giving the lowest cost is not the same for all folds. How should I proceed? I am computing the average cost across the 5 folds, but how do I interpret the different thresholds?
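
To make the situation in the edit concrete, here is a small sketch that selects a threshold per fold and then summarizes both the test costs and the spread of the selected thresholds. The fold data, the costs (1 for a false positive, 5 for a false negative), and the candidate grid are hypothetical, and it uses three toy folds instead of five to stay short. It only shows how the per-fold numbers can be laid side by side; it does not settle which single threshold to use.

```python
from statistics import mean, pstdev


def cost(y_true, y_prob, t, cost_fp=1.0, cost_fn=5.0):
    """Misclassification cost at threshold t (costs are assumed values)."""
    fp = sum(1 for y, p in zip(y_true, y_prob) if y == 0 and p >= t)
    fn = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p < t)
    return cost_fp * fp + cost_fn * fn


def pick_threshold(y_true, y_prob):
    """Lowest-cost threshold on an assumed 0.01..0.99 grid (first one on ties)."""
    grid = [i / 100 for i in range(1, 100)]
    return min(grid, key=lambda t: cost(y_true, y_prob, t))


# Hypothetical folds: (y_train, p_train, y_test, p_test) per fold.
folds = [
    ([0, 0, 0, 1], [0.10, 0.20, 0.30, 0.70], [0, 1], [0.20, 0.60]),
    ([0, 0, 0, 1], [0.10, 0.40, 0.50, 0.60], [0, 1], [0.30, 0.80]),
    ([0, 0, 0, 1], [0.20, 0.20, 0.30, 0.90], [0, 1], [0.50, 0.70]),
]

thresholds, test_costs = [], []
for y_tr, p_tr, y_te, p_te in folds:
    t = pick_threshold(y_tr, p_tr)           # selected on the train part only
    thresholds.append(t)
    test_costs.append(cost(y_te, p_te, t))   # evaluated on the held-out part

mean_cost = mean(test_costs)
threshold_spread = pstdev(thresholds)

print("per-fold thresholds:", thresholds)
print("per-fold test costs:", test_costs)
print("mean test cost:", mean_cost)
print("threshold std dev:", threshold_spread)
```

In this toy data the selected thresholds differ between folds even though the procedure is identical, which is the same behaviour I see with my 5 folds.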