Cohen's Kappa for 11 categories, 2 raters in R

I am using R. I have a dataset of 30 observations. Each observation is scored 1 or 0 on 11 different categories, so each observation has 11 codes, one per category. Each observation has been scored by two different raters. Right now, the rows are observations and the columns are category scores for rater 1 (columns 1-11) and rater 2 (columns 12-22).

Is the most efficient way to calculate Cohen's Kappa to build a separate data frame for each category, holding the 30 observations with rater 1's and rater 2's scores for that category? That would mean 11 data frames, one per category, each containing both raters' scores. It seems inefficient, but so far it is the only approach I have.

    rater1_cat1  rater1_cat2  rater1_cat3  ...  rater2_cat1  rater2_cat2  rater2_cat3  ...  rater2_cat11
1             0            0            1  ...            0            1            0  ...             1
2             0            0            1  ...            0            1            0  ...             1
3             0            0            1  ...            0            1            0  ...             1
4             0            0            1  ...            0            1            0  ...             1
5             0            0            1  ...            0            1            0  ...             1
...
30

That is what the data look like. What I want is to calculate Cohen's Kappa between the two raters for each category in R. Thanks for any help!

I tried building a new data frame for each rater and each category. That gave me a data frame I can use to calculate Cohen's Kappa, but I am not satisfied with it, since this approach seems impractical for very large datasets.

There is 1 best solution below

You can vectorize the calculation over the whole dataset:

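# Proportion of 1s in each column (one value per rater/category pair)
cm <- colMeans(data)

data.frame(
  cat = paste0("cat", 1:11),
  kappa = unname(
    # 1 - (observed disagreement) / (disagreement expected by chance)
    1 - colMeans(data[,1:11] != data[,12:22])/(cm[1:11] + cm[12:22] - 2*cm[1:11]*cm[12:22])
  )
)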
#>      cat        kappa
#> 1   cat1 -0.071428571
#> 2   cat2 -0.412556054
#> 3   cat3  0.189189189
#> 4   cat4 -0.081081081
#> 5   cat5  0.054054054
#> 6   cat6 -0.126760563
#> 7   cat7 -0.200000000
#> 8   cat8  0.004149378
#> 9   cat9  0.109589041
#> 10 cat10 -0.034482759
#> 11 cat11 -0.071428571
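
For anyone wondering why that expression is Cohen's Kappa, here is a short derivation for the binary (0/1) case. The symbols are my own notation, not from the question: for one category, $p_1$ and $p_2$ are the proportions of 1s given by rater 1 and rater 2 (the `cm` values), and $d$ is the proportion of observations on which the two raters disagree (the `colMeans(... != ...)` term).

```latex
\begin{align*}
p_o &= 1 - d \\
p_e &= p_1 p_2 + (1 - p_1)(1 - p_2) \\
1 - p_e &= p_1 + p_2 - 2 p_1 p_2 \\
\kappa &= \frac{p_o - p_e}{1 - p_e}
        = 1 - \frac{1 - p_o}{1 - p_e}
        = 1 - \frac{d}{p_1 + p_2 - 2 p_1 p_2}
\end{align*}
```

Note that the denominator is zero when both raters give the same constant score to every observation in a category (all 0s or all 1s), in which case kappa is undefined and the code returns `NaN`.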

Data:

set.seed(1233573815)
data <- matrix(sample(0:1, 30*22, replace = TRUE), nrow = 30, ncol = 22,
               dimnames = list(NULL, paste0("rater", rep(1:2, each = 11), "_cat", rep(1:11, 2))))
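
As a sanity check on the vectorized formula in this answer, here is a minimal base-R sketch that computes kappa one category at a time from the textbook definition and compares it with the vectorized result. The helper `cohen_kappa_binary`, the matrix name `ratings`, and the seed are hypothetical names chosen for illustration. It assumes the same layout as in the question: 0/1 scores, rater 1 in the first 11 columns and rater 2 in the next 11.

```r
# Hypothetical helper: Cohen's kappa for two binary (0/1) rating vectors,
# computed from observed agreement and chance agreement.
cohen_kappa_binary <- function(r1, r2) {
  p_o <- mean(r1 == r2)                  # observed agreement
  p1 <- mean(r1)                         # proportion of 1s, rater 1
  p2 <- mean(r2)                         # proportion of 1s, rater 2
  p_e <- p1 * p2 + (1 - p1) * (1 - p2)   # agreement expected by chance
  (p_o - p_e) / (1 - p_e)
}

# Illustrative random data (assumption: 30 observations, 11 categories, 2 raters)
set.seed(42)
n_cat <- 11
ratings <- matrix(sample(0:1, 30 * 2 * n_cat, replace = TRUE), nrow = 30)

r1_cols <- seq_len(n_cat)            # rater 1 columns: 1..11
r2_cols <- n_cat + seq_len(n_cat)    # rater 2 columns: 12..22

# Per-category kappa, pairing column i of rater 1 with column i of rater 2
kappa_loop <- mapply(
  function(i, j) cohen_kappa_binary(ratings[, i], ratings[, j]),
  r1_cols, r2_cols
)

# Same quantity via the vectorized expression
cm <- colMeans(ratings)
kappa_vec <- unname(
  1 - colMeans(ratings[, r1_cols] != ratings[, r2_cols]) /
    (cm[r1_cols] + cm[r2_cols] - 2 * cm[r1_cols] * cm[r2_cols])
)

# Compare the two results up to floating-point tolerance
all.equal(kappa_loop, kappa_vec)
```

If the two computations agree, `all.equal` returns `TRUE`. The loop version is easier to read and to adapt to a different column layout, while the vectorized version avoids building anything per category.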