Adapt method cor.test for each data frame in a list

I want to choose the method argument of cor.test in R separately for each data frame in a list of data frames.

data(iris)
iris.lst <- split(iris[, 1:2], iris$Species)
options(scipen=999)

normality1 <- lapply(iris.lst, function(x) shapiro.test(x[,1]))
p1 <- as.numeric(unlist(lapply(normality1, "[", c("p.value"))))
normality2 <- lapply(iris.lst, function(x)shapiro.test(x[,2]))
p2 <- as.numeric(unlist(lapply(normality2, "[", c("p.value"))))
try <- ifelse (p1 > 0.05 | p2 > 0.05, "spearman", "pearson")

# All three come out as "spearman", so set one to "pearson" to test the mix:
try[3] <- "pearson"
for (i in 1: length(try)){
   results.lst <- lapply(iris.lst, function(x) cor.test(x[, 1], x[, 2], method=try[i]))
   results.stats <- lapply(results.lst, "[", c("estimate", "conf.int", "p.value"))
   stats <- do.call(rbind, lapply(results.stats, unlist))
   stats
}

But this does not run cor.test on each data frame with its own method:

cor.test(iris.lst$versicolor[, 1], iris.lst$versicolor[, 2], method="pearson")
stats
# Should be spearman corr.coefficient but is pearson

Any advice?

Best answer

Let me check that I understand what you want to achieve. You have a list of data frames and a vector of corresponding methods (one method for each data frame). In your loop, each iteration passes try[i] to every data frame through lapply and overwrites stats, so after the last iteration all results use the last method. If my assumption is correct, you need something like this instead of your for loop:

for (i in seq_along(try)) {
  # pair the i-th data frame with the i-th method
  results.lst <- cor.test(iris.lst[[i]][, 1], iris.lst[[i]][, 2], method = try[i])
  print(results.lst)
}
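The same pairing can also be written without an index. The following is only a sketch: method.by.species is a hypothetical name standing in for try, and the methods are hard-coded to match the question's setup. Map walks the data frames and the methods in parallel, and exact = FALSE asks for the asymptotic Spearman p-value, which avoids the warning about ties in the iris data.

```r
data(iris)
iris.lst <- split(iris[, 1:2], iris$Species)

# Hypothetical stand-in for `try`: one method per data frame, named by species
method.by.species <- c(setosa = "spearman",
                       versicolor = "spearman",
                       virginica = "pearson")

# Map pairs each data frame with its own method; indexing by name
# guards against the two objects being in a different order
results.lst <- Map(
  function(df, m) cor.test(df[, 1], df[, 2], method = m, exact = FALSE),
  iris.lst,
  method.by.species[names(iris.lst)]
)

# Check which test was actually run for each species
sapply(results.lst, function(r) r$method)
```

Each element of results.lst is a regular htest object, so results.lst$versicolor can be inspected the same way as the output of a single cor.test call.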

Edit: There are many ways to get your stats; here's one. But first a couple of notes:

  • I would find a way to make sure that the right method is used with the right dataset; in what follows I use named lists.
  • As far as I can tell, only the "pearson" method returns a confidence interval, which we have to deal with when creating the stats; alternatively, you can just look at the p-value and the estimate.
  • We'll use sapply instead of a for loop to get the stats immediately as a table, and
  • the function t to transpose the table.

names(try) <- names(iris.lst)
t(
  sapply(names(try), 
       function(i) {
         result <- cor.test(iris.lst[[i]][, 1], iris.lst[[i]][, 2], method=try[[i]])
         to_return <- result[c("estimate", "p.value")]
         to_return["conf.int1"] <- ifelse(is.null(result[["conf.int"]]), NA, result[["conf.int"]][1])
         to_return["conf.int2"] <- ifelse(is.null(result[["conf.int"]]), NA, result[["conf.int"]][2])
         return(to_return)
         }
       )
  )

output:

           estimate  p.value           conf.int1 conf.int2
setosa     0.7553375 0.000000000231671 NA        NA       
versicolor 0.517606  0.0001183863      NA        NA       
virginica  0.4572278 0.0008434625      0.2049657 0.6525292
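If numeric columns are needed for further work, one possible variant builds an ordinary data frame, since sapply over list-valued results gives a matrix of lists. This is a sketch under assumptions: method.by.species, stats.df, conf.low and conf.high are hypothetical names, and the methods are hard-coded to match the table above.

```r
data(iris)
iris.lst <- split(iris[, 1:2], iris$Species)

# Hypothetical stand-in for the named `try` vector
method.by.species <- c(setosa = "spearman",
                       versicolor = "spearman",
                       virginica = "pearson")

# One row per species; the confidence interval is NA when cor.test returns none
stats.df <- do.call(rbind, lapply(names(iris.lst), function(nm) {
  res <- cor.test(iris.lst[[nm]][, 1], iris.lst[[nm]][, 2],
                  method = method.by.species[[nm]], exact = FALSE)
  ci <- if (is.null(res$conf.int)) c(NA_real_, NA_real_) else as.numeric(res$conf.int)
  data.frame(species   = nm,
             method    = method.by.species[[nm]],
             estimate  = unname(res$estimate),
             p.value   = res$p.value,
             conf.low  = ci[1],
             conf.high = ci[2])
}))

stats.df
```

Using if/else rather than ifelse for the confidence interval keeps both bounds in one step, because ifelse only returns as many elements as its test has.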