Plotting equation and r-squared on separate lines within plot using substitute


There are plenty of questions and answers on SO about annotating a plot with a linear regression's equation and r-squared. Many are versions of the code from this question, which annotates a ggplot2 plot. I'd like these regression terms to appear on separate lines of the plot. Instead of:

y = b + mx, r2 = 0.xxx 

as shown on the plot below, I'd prefer:

y = b + mx 
r2 = 0.xxx

Is there a way to use substitute to produce line breaks? I tried inserting \n or "\n" in place of the ",", but neither worked. If not, is there a similar method that produces this result? Admittedly, I've been largely unsuccessful in working out the syntax used in the substitute code below: ~ appears to insert a space, but I don't know what * does, etc.

# https://stackoverflow.com/q/7549694/1670053
library(ggplot2)

p <- ggplot(data = cars, aes(x = speed, y = dist)) +
  geom_smooth(method = "lm", se = FALSE) + geom_point()

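lm_eqn <- function(m) {
  # Format the coefficients and r-squared as strings for the label.
  # unname() keeps the coefficient names out of the deparsed expression.
  l <- list(a = format(unname(coef(m)[1]), digits = 2),
            b = format(abs(unname(coef(m)[2])), digits = 2),
            r2 = format(summary(m)$r.squared, digits = 3))

  # Choose the sign shown in front of the slope term
  if (coef(m)[2] >= 0) {
    eq <- substitute(italic(y) == a + b %.% italic(x)*","~~italic(r)^2~"="~r2, l)
  } else {
    eq <- substitute(italic(y) == a - b %.% italic(x)*","~~italic(r)^2~"="~r2, l)
  }

  # Return a string that annotate(parse = TRUE) turns back into an expression
  as.character(as.expression(eq))
}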

p1 <- p + annotate("text", x = 7.5, y = 100, label = lm_eqn(lm(dist ~ speed, cars)), 
                  colour="black", size = 5, parse=TRUE)

plot from example code produced using ggplot2 and the cars dataset

Best answer

If there is an issue with the plotmath engine, as BondedDust suggested, I guess the code below could be a workaround. It uses two label functions: one for the equation and one for the r2.

library(ggplot2)

p <- ggplot(data = cars, aes(x = speed, y = dist)) +
  geom_smooth(method = "lm", se = FALSE) + geom_point()

# lm equation
lm_eqn <- function(m) {
  # unname() keeps the coefficient names out of the deparsed expression
  l <- list(a = format(unname(coef(m)[1]), digits = 2),
            b = format(abs(unname(coef(m)[2])), digits = 2))
  if (coef(m)[2] >= 0) {
    eq <- substitute(italic(y) == a + b %.% italic(x), l)
  } else {
    eq <- substitute(italic(y) == a - b %.% italic(x), l)
  }
  as.character(as.expression(eq))
}

# r2
lm_eqn2 <- function(m) {
  l <- list(r2 = format(summary(m)$r.squared, digits = 3))
  eq <- substitute(italic(r)^2~"="~r2, l)
  as.character(as.expression(eq))
}

p1 <- p + annotate("text", x = 7.1, y = 100, label = lm_eqn(lm(dist ~ speed, cars)), 
                  colour="black", size = 5, parse=TRUE) 
p2 <- p1 + annotate("text", x = 6.5, y = 90, label = lm_eqn2(lm(dist ~ speed, cars)), 
                    colour="black", size = 5, parse=TRUE) 

However, getting the two lines left-aligned took some trial and error, so this solution might not offer much over simply annotating the text by hand, without a function.

plot produced by r code
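The two annotate() calls above each refit the model and need their own x value. One way to cut that down, as a rough sketch only: fit the model once, build both labels in one helper, and pass vectors to a single annotate() call with hjust = 0 so both lines share the same left edge. The helper name lm_labels is made up for this example, and the offset of 10 between the lines is still a hand-picked value for the cars data.

```r
library(ggplot2)

# Fit the model once and reuse it for both labels.
fit <- lm(dist ~ speed, data = cars)

# Hypothetical helper: returns the equation and the r-squared
# as two plotmath strings (element 1 and element 2).
lm_labels <- function(m) {
  vals <- list(a  = format(unname(coef(m)[1]), digits = 2),
               b  = format(abs(unname(coef(m)[2])), digits = 2),
               r2 = format(summary(m)$r.squared, digits = 3))
  eq <- if (coef(m)[2] >= 0) {
    substitute(italic(y) == a + b %.% italic(x), vals)
  } else {
    substitute(italic(y) == a - b %.% italic(x), vals)
  }
  r2_expr <- substitute(italic(r)^2 ~ "=" ~ r2, vals)
  c(as.character(as.expression(eq)),
    as.character(as.expression(r2_expr)))
}

lbls <- lm_labels(fit)

# One annotate() call: x is shared, y and label are vectors of length 2.
p <- ggplot(cars, aes(x = speed, y = dist)) +
  geom_smooth(method = "lm", se = FALSE) +
  geom_point() +
  annotate("text",
           x = min(cars$speed),
           y = c(max(cars$dist), max(cars$dist) - 10),
           label = lbls, parse = TRUE, hjust = 0, vjust = 1)
```

Because both labels are anchored at the same x with hjust = 0, the left alignment no longer depends on the relative width of the two strings.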

This code works slightly better, but you still need to adjust the y value in annotate to your dataset.

p1 <- p + annotate("text", x = min(cars$speed), y = max(cars$dist),
                   label = lm_eqn(lm(dist ~ speed, cars)),
                   parse = TRUE, hjust = 0, vjust = 1) +
  annotate("text", x = min(cars$speed), y = max(cars$dist) - 10,
           label = lm_eqn2(lm(dist ~ speed, cars)),
           parse = TRUE, hjust = 0)

ggplot2 with annotated text using max and min value r2
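The fixed offset of 10 in the code above only suits the scale of cars$dist. A possible way around that, sketched here rather than tested on other data, is to express the offset as a fraction of the y range. The helper y_from_top and the 0.08 fraction are assumptions made for this example, and the sprintf() labels are simplified stand-ins for lm_eqn() and lm_eqn2() so that the snippet runs on its own.

```r
library(ggplot2)

# Hypothetical helper: y position that lies `frac` of the data range
# below the maximum (frac = 0 is the top, frac = 1 is the bottom).
y_from_top <- function(y, frac) {
  rng <- range(y, na.rm = TRUE)
  rng[2] - frac * diff(rng)
}

fit <- lm(dist ~ speed, data = cars)
b <- unname(coef(fit))

# Simplified stand-in labels (plotmath strings).
eq_label <- sprintf("italic(y) == %.0f %s %.1f %%.%% italic(x)",
                    b[1], if (b[2] >= 0) "+" else "-", abs(b[2]))
r2_label <- sprintf("italic(r)^2 == %.3f", summary(fit)$r.squared)

p <- ggplot(cars, aes(x = speed, y = dist)) +
  geom_smooth(method = "lm", se = FALSE) +
  geom_point() +
  annotate("text",
           x = min(cars$speed),
           y = y_from_top(cars$dist, c(0, 0.08)),  # second line 8% lower
           label = c(eq_label, r2_label),
           parse = TRUE, hjust = 0, vjust = 1)
```

The fraction still depends on the text size and the height of the plot, so it may need a small adjustment, but it should not need changing just because the data are on a different scale.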

I also tried using Inf / -Inf to set the x and y values, but things don't quite line up the way you might like. It would take some trial and error to get both lines of text to line up away from the x margin.

p1 <- p + annotate("text", x = -Inf, y = Inf,
                   label = lm_eqn(lm(dist ~ speed, cars)),
                   parse = TRUE, hjust = 0, vjust = 1) +
  annotate("text", x = -Inf, y = Inf,
           label = lm_eqn2(lm(dist ~ speed, cars)),
           parse = TRUE, hjust = 0, vjust = 2)

ggplot2 plot annotated using Inf for x and y values
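On the syntax asked about in the question: according to ?plotmath, `*` juxtaposes two terms with no space, `~` inserts a space, `==` draws an equals sign and `%.%` draws a centred dot, while substitute() swaps the names a, b and r2 for the values in the list. The same help page notes that control characters such as \n are not interpreted in plotmath strings, which is presumably why inserting "\n" did not work. plotmath does have atop(x, y), which stacks one expression over another, so a single-label version might look like the sketch below. The values in vals are placeholders, not fitted results, and atop() centres the two lines rather than left-aligning them.

```r
# Placeholder values for illustration only (not taken from a fitted model).
vals <- list(a = "1.2", b = "3.4", r2 = "0.567")

# atop(x, y) draws x above y, giving two lines in one plotmath expression.
two_lines <- substitute(
  atop(italic(y) == a + b %.% italic(x),
       italic(r)^2 ~ "=" ~ r2),
  vals
)

# Convert to a string so it can be passed as a label with parse = TRUE.
label <- as.character(as.expression(two_lines))
```

The resulting string could then be passed to a single annotate("text", ..., label = label, parse = TRUE) call in place of the two separate annotations.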