Deriving a contingency truth table from two columns with yes and no values in R


I've been trying to reshape my data in R into something like the result in "How to Find False Positive Prediction Count using R Script", but I'm finding it difficult because that question lacks a minimal example. My data frame (called info) looks like this:

obs sim no no no no no no no yes yes yes yes yes yes no no no no no no no no no yes yes NA yes no yes yes yes yes yes yes yes

What I would like to obtain is a truth table that drops any row with NA in either column, with the result as follows:

   obs  sim 
     yes no  
yes    6 2  
 no    1 7 

There are 2 best solutions below


We can use complete.cases to build a logical index that is FALSE for any row containing an NA, subset the rows with it, and then apply table:

table(info[complete.cases(info), ])
#      sim
# obs   no yes
#   no   7   2
#   yes  1   6

Or with na.omit, which drops every row that contains an NA:

table(na.omit(info))
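
To see why the two calls agree, it can help to look at the logical index itself. This is a minimal sketch on a small made-up data frame (`toy` is hypothetical, not the asker's data):

```r
# Hypothetical toy data, used only to illustrate the indexing step
toy <- data.frame(obs = c("no", "yes", NA, "yes"),
                  sim = c("no", "yes", "yes", "no"))

keep <- complete.cases(toy)   # FALSE for any row that contains an NA
keep
# [1]  TRUE  TRUE FALSE  TRUE

tab_cc <- table(toy[keep, ])    # subset with the index, then tabulate
tab_no <- table(na.omit(toy))   # let na.omit drop the same row

# Both routes drop row 3 and give the same counts
all(tab_cc == tab_no)
# [1] TRUE
```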

data

info <- structure(list(obs = c("no", "no", "no", "no", "yes", "yes", 
 "yes", "no", "no", "no", "no", "yes", NA, "no", "yes", "yes", 
 "yes"), sim = c("no", "no", "no", "yes", "yes", "yes", "no", 
 "no", "no", "no", "no", "yes", "yes", "yes", "yes", "yes", "yes"
 )), class = "data.frame", row.names = c(NA, -17L))
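
The question lists "yes" before "no", whereas table sorts character values alphabetically. One way to control the order, sketched here under the assumption that both columns should use the levels "yes" then "no", is to convert the columns to factors first (`lv` and `tab` are names introduced only for this example):

```r
# Same values as the info data frame above, rebuilt so the block runs on its own
info <- data.frame(
  obs = c("no", "no", "no", "no", "yes", "yes", "yes", "no", "no",
          "no", "no", "yes", NA, "no", "yes", "yes", "yes"),
  sim = c("no", "no", "no", "yes", "yes", "yes", "no", "no", "no",
          "no", "no", "yes", "yes", "yes", "yes", "yes", "yes")
)

lv <- c("yes", "no")                          # assumed display order
info[] <- lapply(info, factor, levels = lv)   # convert every column to a factor

tab <- table(na.omit(info))
tab
#      sim
# obs   yes no
#   yes   6  1
#   no    2  7
```

If the rows of the desired table are meant to be sim rather than obs, t(tab) swaps the orientation.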

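This shows how to reassemble that ambiguous data presentation into one possible version of your truth table. Note that matrix() fills column by column by default, so the first 17 values become obs and the last 17 become sim.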

dat <- scan(text = "no no no no no no no yes yes yes yes yes yes no no no no no no no no no yes yes NA yes no yes yes yes yes yes yes yes", what = "")
# Read 34 items
mdat <- matrix(dat, ncol = 2, dimnames = list(NULL, c("obs", "sim")))

 mdat
#------------
      obs   sim  
 [1,] "no"  "no" 
 [2,] "no"  "no" 
 [3,] "no"  "no" 
 [4,] "no"  "no" 
 [5,] "no"  "no" 
 [6,] "no"  "yes"
 [7,] "no"  "yes"
 [8,] "yes" NA   
 [9,] "yes" "yes"
[10,] "yes" "no" 
[11,] "yes" "yes"
[12,] "yes" "yes"
[13,] "yes" "yes"
[14,] "no"  "yes"
[15,] "no"  "yes"
[16,] "no"  "yes"
[17,] "no"  "yes"
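
The matrix above is filled column by column, which is the default for matrix(). If the posted values were instead meant to be read row by row (obs, sim, obs, sim, ...), a sketch with byrow = TRUE might look like the following. Which reading is correct is an assumption either way, and `mrow` and `tab_row` are names used only here:

```r
dat <- scan(text = "no no no no no no no yes yes yes yes yes yes no no no no no no no no no yes yes NA yes no yes yes yes yes yes yes yes",
            what = "", quiet = TRUE)

# Fill the matrix row by row: each consecutive pair is one (obs, sim) record
mrow <- matrix(dat, ncol = 2, byrow = TRUE,
               dimnames = list(NULL, c("obs", "sim")))

tab_row <- table(obs = mrow[, "obs"], sim = mrow[, "sim"])
tab_row
#      sim
# obs   no yes
#   no   7   2
#   yes  1   6
```

Read this way, the counts agree with the table in the other answer.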

 ?table
 table(mdat[, 1], mdat[, 2], dnn = c("obs", "sim"))
#--------------
     sim
obs   no yes
  no   5   6
  yes  1   4

By default (useNA = "no"), the table function leaves any pair containing an NA out of the counts, so the NA rows do not need to be removed first.
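
The NA handling can be checked on a small example. This sketch uses two made-up vectors (`obs` and `sim` here are hypothetical toy values, not the data above) and the useNA argument of table:

```r
# Hypothetical toy vectors with one NA in each
obs <- c("no", "yes", "yes", NA)
sim <- c("no", "yes", NA, "yes")

t_default <- table(obs, sim)                   # pairs with an NA are left out
t_with_na <- table(obs, sim, useNA = "ifany")  # adds an <NA> row and column

sum(t_default)   # 2 complete pairs counted
sum(t_with_na)   # all 4 pairs counted
```

Passing useNA = "ifany" is a quick way to confirm how many records were dropped before trusting the counts.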