Is it correct that cohen_kappa_score outputs 0.0 when only 2% of the labels are not in agreement?
from sklearn.metrics import cohen_kappa_score

y1 = 100 * [1]
y2 = 100 * [1]
y2[0] = 0
y2[1] = 0

cohen_kappa_score(y1, y2)
# 0.0
Or did I miss something?
The calculation is correct. This is an unfortunate downside of this agreement metric: if at least one rater assigns a single class 100% of the time, the chance-expected agreement equals the observed agreement, so kappa comes out as zero no matter how high the raw agreement is. If you have a few minutes, I encourage you to try calculating it yourself based on the example on Wikipedia.
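To make that concrete, here is a minimal sketch of the hand calculation for the example in the question (p_o is the observed agreement, p_e is the chance-expected agreement derived from each rater's marginal class frequencies):

from sklearn.metrics import cohen_kappa_score

# Rater 1 labels everything as class 1; rater 2 disagrees on 2 of 100 samples.
y1 = 100 * [1]
y2 = 98 * [1] + 2 * [0]

# Observed agreement: fraction of samples where the raters agree.
p_o = sum(a == b for a, b in zip(y1, y2)) / len(y1)        # 0.98

# Chance-expected agreement: for each class, the product of the two raters'
# marginal probabilities of choosing that class.
classes = set(y1) | set(y2)
p_e = sum(
    (y1.count(c) / len(y1)) * (y2.count(c) / len(y2))
    for c in classes
)                                                          # 1.0 * 0.98 + 0.0 * 0.02 = 0.98

# kappa = (p_o - p_e) / (1 - p_e) = (0.98 - 0.98) / 0.02 = 0.0
kappa = (p_o - p_e) / (1 - p_e)
print(kappa)                                               # 0.0
print(cohen_kappa_score(y1, y2))                           # 0.0

Because rater 1's marginals put all the probability on class 1, p_e collapses to rater 2's frequency of class 1, which is exactly p_o, so the numerator is zero.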
This behaviour is also discussed in the literature: one paper summarizes it in its abstract, and its full text describes the problem more fully with an example.
Another useful reference is "Interrater reliability: the kappa statistic", which advocates reporting both percent agreement and Cohen's kappa for a fuller picture of agreement.
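As a rough illustration of that advice, assuming simple percent agreement is measured with accuracy_score, you could report both numbers side by side:

from sklearn.metrics import accuracy_score, cohen_kappa_score

y1 = 100 * [1]
y2 = 98 * [1] + 2 * [0]

# Reporting both gives a fuller picture: raw agreement is high,
# but it is no better than chance given rater 1's marginals.
print("percent agreement:", accuracy_score(y1, y2))        # 0.98
print("Cohen's kappa:", cohen_kappa_score(y1, y2))         # 0.0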