How can I write R code so that the threshold of a predictive model is automatically set to the value that gives a particular sensitivity (a fixed proportion) on every run of the model?
For example, given the following scenarios:
- At threshold of 0.2; True positive = 20, False negative = 60 i.e. sensitivity of 0.25
- At threshold of 0.35; True positive = 60, False negative = 20 i.e. sensitivity of 0.8
How do I write R code that always picks the threshold giving a sensitivity of 0.8, i.e. scenario 2 above? For context, I'm using the caret modelling framework.
These links on threshold optimization did not help much:
http://topepo.github.io/caret/using-your-own-model-in-train.html#Illustration5
(1)
Say you have data with values and true labels; here, 5 are false and 5 are true.
At a threshold of 9, the values 9 and 10 are detected as true positives, so sensitivity = 40%. At a threshold of 6 (or anything between 5 and 6), the values 6, 7, 9 and 10 are detected, so sensitivity = 80%.
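The data behind this example are not shown, so here is a minimal sketch with assumed values that match the description: the values 1 to 10, with 4, 6, 7, 9 and 10 labelled true. The names values, labels and sens are my own, and a value counts as detected when it is at or above the threshold.

```r
# Assumed toy data: 5 false (0) and 5 true (1) labels
values <- 1:10
labels <- c(0, 0, 0, 1, 0, 1, 1, 0, 1, 1)

# Sensitivity = share of the true cases that are at or above the threshold
sens <- function(threshold) mean(values[labels == 1] >= threshold)

sens(9)    # only 9 and 10 are detected
sens(6)    # 6, 7, 9 and 10 are detected
sens(5.5)  # same detections as at threshold 6
```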
To see the ROC curve, you can use the pROC package
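A rough sketch of that step, assuming the same toy data as described (values 1 to 10, with 4, 6, 7, 9 and 10 labelled true; these values are my assumption). The levels and direction arguments are set explicitly so pROC does not have to guess which class is the positive one.

```r
library(pROC)

# Assumed toy data: 5 false (0) and 5 true (1) labels
values <- 1:10
labels <- c(0, 0, 0, 1, 0, 1, 1, 0, 1, 1)

# levels = c(control, case); direction "<" means controls score lower than cases
roc.demo <- roc(response = labels, predictor = values,
                levels = c(0, 1), direction = "<", quiet = TRUE)

plot(roc.demo)  # draws the ROC curve
```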
If you want percentages, pass percent = TRUE to roc(). Sensitivities are then reported on a 0 to 100 scale, so use 80 instead of 0.8 when selecting the threshold.
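A small sketch of the percentage variant, again on assumed toy data (values 1 to 10, with 4, 6, 7, 9 and 10 labelled true). The object name roc.pct is hypothetical, and the comparison uses a small tolerance rather than == to stay safe with floating-point values.

```r
library(pROC)

# Assumed toy data: 5 false (0) and 5 true (1) labels
values <- 1:10
labels <- c(0, 0, 0, 1, 0, 1, 1, 0, 1, 1)

# percent = TRUE puts sensitivities and specificities on a 0-100 scale
roc.pct <- roc(response = labels, predictor = values,
               levels = c(0, 1), direction = "<",
               percent = TRUE, quiet = TRUE)

# Thresholds whose sensitivity is 80 (percent)
thr.pct <- roc.pct$thresholds[abs(roc.pct$sensitivities - 80) < 1e-8]
thr.pct
```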
You can get the thresholds from the roc object
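Something along these lines should work, assuming the toy data described earlier (values 1 to 10, with 4, 6, 7, 9 and 10 labelled true; the data are my assumption). The thresholds, sensitivities and specificities fields of the roc object are parallel vectors, so one can index the other.

```r
library(pROC)

# Assumed toy data: 5 false (0) and 5 true (1) labels
values <- 1:10
labels <- c(0, 0, 0, 1, 0, 1, 1, 0, 1, 1)

roc.demo <- roc(response = labels, predictor = values,
                levels = c(0, 1), direction = "<", quiet = TRUE)

# Thresholds whose sensitivity equals 0.8 (within a small tolerance)
thr.exact <- roc.demo$thresholds[abs(roc.demo$sensitivities - 0.8) < 1e-8]
thr.exact

# Full table, useful for choosing among candidates by specificity
data.frame(threshold   = roc.demo$thresholds,
           sensitivity = roc.demo$sensitivities,
           specificity = roc.demo$specificities)
```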
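For this example the result is 4.5 and 5.5; both thresholds give a sensitivity of 0.8.
If no threshold hits 0.8 exactly, you can select a range instead, e.g. roc.demo$sensitivities > 0.79 & roc.demo$sensitivities < 0.81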
(2)
Alternatively, if you just want a threshold and don't care about the specificity, you can apply the quantile function to the values of the true cases.
probs = 0.20 corresponds to 80% sensitivity, because 20% of the true cases fall below that quantile and the remaining 80% are at or above it.
Any threshold between 4 and 6 is what you are looking for. Change probs as needed.
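A minimal sketch of the quantile approach, on assumed toy data (values 1 to 10, with 4, 6, 7, 9 and 10 labelled true). The helper threshold_for_sensitivity is a hypothetical name of my own, not a caret or pROC function.

```r
# Assumed toy data: 5 false (0) and 5 true (1) labels
values <- 1:10
labels <- c(0, 0, 0, 1, 0, 1, 1, 0, 1, 1)

# Hypothetical helper: the (1 - target) quantile of the true cases' scores
threshold_for_sensitivity <- function(scores, truth, target) {
  quantile(scores[truth == 1], probs = 1 - target, names = FALSE)
}

thr <- threshold_for_sensitivity(values, labels, target = 0.8)
thr

# Check: share of true cases at or above the chosen threshold
achieved <- mean(values[labels == 1] >= thr)
achieved
```

With a caret model, the scores would be the predicted probabilities of the positive class, for instance from predict(fit, newdata, type = "prob"), where fit and newdata are placeholders. Because quantile interpolates between observed values, the achieved sensitivity can differ slightly from the target when there are ties or few positive cases, so it is worth checking as in the last lines.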
Hopefully, it helps.