ggplot in a function: variable not found


I am having trouble writing a function that creates a plot with ggplot. Here is some code:

y1<- sample(1:30,45,replace = T)
x1 <- rep(rep(c("a1","a2","a3","a4","a5"),3),each=3)
x2 <- rep(rep(c("b1","b2","b3","b4","b5"),3),each=3)
df <- data.frame(y1,x1,x2)
library(Rmisc)
dfsum <- summarySE(data=df, measurevar="y1",groupvars=c("x1","x2"))
myplot <- function(d,v, w,g) {
  pd <- position_dodge(.1)
  localenv <- environment()
  ggplot(data=d, aes(x=v,y=w,group=g),environment = localenv) + 
  geom_errorbar(data=d,aes(ymin=d$w-d$se, ymax=d$w+d$se,col=d$g), width=.4, position=pd,environment = localenv) +
  geom_line(position=pd,linetype="dotted") +
  geom_point(data=d,position=pd,aes(col=g))
}
myplot(dfsum,x1,y1,x2)

While looking at similar questions, I found that specifying the local environment should solve the issue. However, it did not help in my case.

Thank you


There are 2 best solutions below

1
On BEST ANSWER

Preliminary Note

When looking at your data.frame, the group variable does not make any sense, as it is perfectly confounded with the x variable. Hence I adapted your data a bit, to show a full example:

Data

library(Rmisc)
library(ggplot2)
d <- expand.grid(x1 = paste0("a", 1:5),
                 x2 = paste0("b", 1:5))
d <- d[rep(1:NROW(d), each = 3), ]
d$y1 <- rnorm(NROW(d))
dfsum <- summarySE(d, measurevar = "y1", groupvars = paste0("x", 1:2))

Plot Function

myplot <- function(mydat, xvar, yvar, grpvar) {
   mydat$ymin <- mydat[[yvar]] - mydat$se
   mydat$ymax <- mydat[[yvar]] + mydat$se
   pd <- position_dodge(width = .5)
   ggplot(mydat, aes_string(x = xvar, y = yvar, group = grpvar,
                            ymin = "ymin", ymax = "ymax", color = grpvar)) +
      geom_errorbar(width = .4, position = pd) +
      geom_point(position = pd) + 
      geom_line(position = pd, linetype = "dashed")
}
myplot(dfsum, "x1", "y1", "x2")

Explanation

Your problem occurs because the scope of x1, x2 and y1 was ambiguous. Because you also defined these variables in the top-level (global) environment, R did not complain at first. If you had added rm(x1, x2, y1) to your original code right after creating your data.frame, you would have seen the problem earlier.
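The scoping point can be reproduced without ggplot at all. The following is a minimal base R sketch; the names toy and show_arg are made up for illustration and do not appear in the question:

```r
# Toy stand-ins for the question's objects (illustrative names only).
x1 <- c("global", "vector")
toy <- data.frame(x1 = c("col_a", "col_b", "col_c"))

# A plain function argument is evaluated in the caller's environment,
# so `v` never refers to the column toy$x1.
show_arg <- function(d, v) v

# Silently picks up the global x1, not the column.
with_global <- show_arg(toy, x1)

# With the global copy removed, the same call fails.
rm(x1)
without_global <- tryCatch(
  show_arg(toy, x1),
  error = function(e) conditionMessage(e)
)
```

Here with_global holds the global vector rather than the column, and without_global holds the "object not found" error message, which is the failure the global copies were hiding.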

ggplot looks in the data.frame you provide for all the variables you want to map to aesthetics. If you want to write a function that takes the names of the mapped variables as arguments, use aes_string instead of aes, as the former expects a string giving the name of the variable rather than the variable itself.

With this approach, however, you cannot do calculations on the spot, so you need to create the variables ymin and ymax beforehand in your data.frame. Furthermore, you do not need to provide the data argument for each geom if it is the same as the one provided to ggplot.
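In more recent ggplot2 releases, aes_string() is marked as deprecated and the .data pronoun inside aes() is the documented replacement. It also lets you compute ymin and ymax inside the mapping. Below is a minimal sketch of the same function in that style, assuming ggplot2 3.0.0 or later. The function name myplot_data and the hand-built toy_sum data frame (a stand-in for the summarySE output) are made up for illustration:

```r
library(ggplot2)

# Hypothetical stand-in for the summarySE() result: one row per x1/x2 cell,
# with a mean (y1) and a standard error (se).
toy_sum <- data.frame(
  x1 = rep(c("a1", "a2", "a3"), times = 2),
  x2 = rep(c("b1", "b2"), each = 3),
  y1 = c(1.0, 2.5, 2.0, 1.5, 3.0, 2.2),
  se = c(0.2, 0.3, 0.1, 0.25, 0.15, 0.3)
)

# Column names are still passed as strings, but looked up via .data[[...]],
# so arithmetic inside aes() is possible.
myplot_data <- function(mydat, xvar, yvar, grpvar) {
  pd <- position_dodge(width = 0.5)
  ggplot(mydat, aes(x = .data[[xvar]], y = .data[[yvar]],
                    group = .data[[grpvar]], colour = .data[[grpvar]])) +
    geom_errorbar(aes(ymin = .data[[yvar]] - .data$se,
                      ymax = .data[[yvar]] + .data$se),
                  width = 0.4, position = pd) +
    geom_point(position = pd) +
    geom_line(position = pd, linetype = "dashed")
}

p <- myplot_data(toy_sum, "x1", "y1", "x2")
```

Printing p draws the plot. The call has the same shape as myplot(dfsum, "x1", "y1", "x2") above.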

1
On

I've got it plotting something; let me know if this isn't the expected output.

[Image: resulting plot]

The changes I've made to the code to get it working are:

  • Load the ggplot2 library
  • Remove the d$ from the geom_errorbar call to w and g, as these are function arguments rather than columns in d.

I've also removed the data=d calls from all layers except the main ggplot one as these aren't necessary.

library(ggplot2)
myplot <- function(d,v, w,g) {
  pd <- position_dodge(.1)
  localenv <- environment()
  ggplot(data=d, aes(x=v,y=w,group=g),environment = localenv) +
    geom_errorbar(aes(ymin=w-se, ymax=w+se,col=g), width=.4,
              position=pd,environment = localenv) +
    geom_line(position=pd,linetype="dotted") +
    geom_point(position=pd,aes(col=g))
}
myplot(dfsum,x1,y1,x2)
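If you want to keep passing bare column names, as in myplot(dfsum, x1, y1, x2), one option that does not depend on copies of x1, y1 and x2 existing in the global environment is the embrace operator {{ }}. The sketch below assumes a ggplot2 and rlang recent enough to support it (rlang 0.4.0 or later). The function name myplot_embrace and the toy_sum data frame are hypothetical stand-ins, not part of the original code:

```r
library(ggplot2)

# Hypothetical stand-in for the summarySE() result.
toy_sum <- data.frame(
  x1 = rep(c("a1", "a2", "a3"), times = 2),
  x2 = rep(c("b1", "b2"), each = 3),
  y1 = c(1.0, 2.5, 2.0, 1.5, 3.0, 2.2),
  se = c(0.2, 0.3, 0.1, 0.25, 0.15, 0.3)
)

# {{ v }} forwards the unquoted argument into aes(), where it is resolved
# as a column of d instead of a variable in the calling environment.
myplot_embrace <- function(d, v, w, g) {
  pd <- position_dodge(0.1)
  ggplot(d, aes(x = {{ v }}, y = {{ w }}, group = {{ g }}, colour = {{ g }})) +
    geom_errorbar(aes(ymin = {{ w }} - se, ymax = {{ w }} + se),
                  width = 0.4, position = pd) +
    geom_line(position = pd, linetype = "dotted") +
    geom_point(position = pd)
}

# No global x1, y1 or x2 is needed for this call.
p <- myplot_embrace(toy_sum, x1, y1, x2)
```

Because the lookup happens inside the data, the environment = localenv arguments are not needed in this version.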