iterate dependent variable using purrr and map (tidyverse)

1k Views Asked by At

I have a simple dataset that I want to iterate the dependent variable using aov and tidyverse. From those outputs I then want to compute Tukey HSD tests. I have this working in a for loop structure, but am trying my hardest to migrate from that mentality. I saw this post on iterating aovfunctions with the independent variables. Tried to incorporate this logic into my workflow, but not working out so well. Any tidyverse aficionados that could steer me in the right direction here?

library(tidyverse)
library(data.table)

pfuel <- fread("data/CFL.csv") %>%
  mutate(AFCL = AFCL*10,
         LCW = LCW*10,
         DCW = DCW*10,
         LiDe = ifelse(Status == "Li", "Live", "Dead")) %>%
  filter(S.F == "S") %>%
  group_by(Site, Year, Age, Plot) %>%
  select(LiFol, DeFol, Li.1hr, De.1hr, Li.10hr, De.10hr, Li.100hr, De.100hr) %>%
  summarise_all(sum) %>%
  ungroup() %>%
  mutate(sb_age = paste0(Year, Age))

aov.models = pfuel %>%
  select (-c(Year, Age)) %>%
  select(LiFol, DeFol, Li.1hr, De.1hr, Li.10hr, De.10hr, Li.100hr, De.100hr, Site, Plot, sb_age) %>%
  map(~ aov(.x ~ sb_age + Site/Plot, data = pfuel))

When the aov.models runs I generate this error:

Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) : 
  NA/NaN/Inf in 'y'
In addition: Warning message:
In model.response(mf, "numeric") : NAs introduced by coercion

I haven't gotten to the Tukey test yet, as I cannot get past the aov function. Any suggestions would be GREATLY appreciated!

You can find the data here: https://www.dropbox.com/s/yb8rh860fc7fff2/CFL.csv?dl=0

Thanks!

2

There are 2 best solutions below

1
On BEST ANSWER

It may be easier to convert the data to long form, split by response, then fit models and feed the output to the HSD.test function, e.g.,

aov.models <- pfuel %>%
  select(-Year, -Age) %>%
  gather(variable, value, -sb_age, -Site, -Plot) %>%
  split(.$variable) %>%
  map(~ aov(value ~ sb_age + Site/Plot, data = .x)) %>%
  map(HSD.test, trt = 'sb_age')

I also removed one of the select() statements, as it was selecting all of the columns.

1
On

@Z.Lin With your guidance I figured out a solution to the first part of my question. Probably not the most elegant, but it is at least working now! Any refinement would be welcomed, but thank you.

pfuel_var <- pfuel %>%
  select(Site, Plot, sb_age) %>%
  mutate(Site = as.factor(Site),
         Plot = as.factor( Plot),
         sb_age = as.factor(sb_age))

aov.models <- pfuel %>%
  select(LiFol, DeFol, Li.1hr, De.1hr, Li.10hr, De.10hr, Li.100hr, De.100hr) %>%
  map(~ aov(.x ~ pfuel_var$sb_age + pfuel_var$Site/pfuel_var$Plot, data = pfuel))

The second part of my question was how to feed this output into HSD.test from the agricolae package. Anyone have thoughts on that?

What I was thinking would be:

t <- aov.models %>% 
   map(~ HSD.test(.x, "pfuel_var$sb_age", alpha=0.1))

But that is not working properly. Thoughts very much appreciated.