Add direct labels to geom_smooth rather than geom_line


I recognize that this question is a close duplicate of this one, but the solution there no longer works (using method="last.qp"), so I'm asking it again.

The basic issue is that I'd like to use directlabels (or an equivalent) to label the smoothed means for each group (from stat_smooth()), rather than the actual data. The example below is as close as I've gotten, but the labels don't recognize the grouping, or even the smoothed line. Instead, I get a single label at the last data point. What I'd like is colour-coordinated text at the end of each stat_smooth() line, rather than the legend on the right of the plot. This post provides an approach for labelling the last data point (the behaviour I'm seeing), but I'm looking for an approach that labels automatically generated summaries, if possible.

Here's an example:

library(ggplot2)
library(directlabels)

## Data
set.seed(10)
d <- data.frame(x=seq(1,100,1), y=rnorm(100, 3, 0.5))
d$z <- ifelse(d$y>3,1,0)

## Plot
p <- ggplot(d, aes(x=x, y=y, colour=as.factor(z))) +
  stat_smooth(inherit.aes=T, se=F, span=0.8, show.legend = T) +
  geom_line(colour="grey50") +
  scale_x_continuous(limits=c(0,110)) +
  geom_dl(label="text", method="maxvar.points", inherit.aes=T)
p

which makes this plot: [image]

There are 2 best solutions below

Tung (BEST ANSWER)

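A solution using the ggrepel package, based on this answer. It fits a loess curve to each group, predicts the smoothed value at each group's largest x, and places the labels there: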

library(tidyverse)
library(ggrepel)

set.seed(123456789)

d <- data.frame(x = seq(1, 100, 1), y = rnorm(100, 3, 0.5))
d$z <- ifelse(d$y > 3, 1, 0)

labelInfo <-
  split(d, d$z) %>%
  lapply(function(t) {
    data.frame(
      predAtMax = loess(y ~ x, span = 0.8, data = t) %>%
        predict(newdata = data.frame(x = max(t$x)))
      , max = max(t$x)
    )}) %>%
  bind_rows

labelInfo$label = levels(factor(d$z))
labelInfo

#>   predAtMax max label
#> 1  2.538433  99     0
#> 2  3.293859 100     1

ggplot(d, aes(x = x, y = y, color = factor(z))) + 
  geom_point(shape = 1) +
  geom_line(colour = "grey50") +
  stat_smooth(inherit.aes = TRUE, se = FALSE, span = 0.8, show.legend = TRUE) +
  geom_label_repel(data = labelInfo, 
                   aes(x = max, y = predAtMax, 
                       label = label, 
                       color = label), 
                   nudge_x = 5) +
  theme_classic()
#> `geom_smooth()` using method = 'loess' and formula 'y ~ x'

Created on 2018-06-11 by the reprex package (v0.2.0).

M--

You need to tell geom_dl what you want to appear on your plot. The code below should address your needs:

p <- ggplot(d, aes(x=x, y=y, colour=as.factor(z))) +
  stat_smooth(inherit.aes=TRUE, se=FALSE, span=0.8, method = "loess", show.legend = FALSE) +
  geom_line(colour="grey50") +
  scale_x_continuous(limits=c(0,110)) +
  geom_dl(label=as.factor(d$z), method="maxvar.points", inherit.aes=TRUE)

If you want text other than 0 and 1, build a label vector from d$z and pass it in place of as.factor(d$z).
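As a minimal sketch of building such a label vector, the snippet below recreates the question's data and maps z onto descriptive text. The column name `lab` and the label strings are hypothetical, chosen only for illustration:

```r
## Same data as in the question
set.seed(10)
d <- data.frame(x = seq(1, 100, 1), y = rnorm(100, 3, 0.5))
d$z <- ifelse(d$y > 3, 1, 0)

## Hypothetical label text: map z = 0 / 1 onto descriptive strings
d$lab <- factor(d$z, levels = c(0, 1), labels = c("y <= 3", "y > 3"))

## One label per group, in the same order as the levels of z
table(d$z, d$lab)
```

d$lab would then take the place of as.factor(d$z) in the geom_dl() call above.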


To put the labels beside the last points of geom_smooth rather than the last data points, I could not find a suitable method in geom_dl, so I came up with a workaround: build the plot, pull the smoothed values out of it, and annotate the right-most smoothed point of each group.

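## The label aesthetic is set on stat_smooth so that it can be read back
## from the built layer data below
p <- ggplot(d, aes(x=x, y=y, colour=as.factor(z))) +
  stat_smooth(inherit.aes=TRUE, aes(label=as.factor(z)), se=FALSE, 
              span=0.8, method = "loess", show.legend = FALSE) +
  geom_line(colour="grey50") +
  scale_x_continuous(limits=c(0,110))

## Layer 1 of the built plot is the smoother; keep the row with the
## largest x in each group
library(data.table)
smooth_dat <- setDT(ggplot_build(p)$data[[1]])
smooth_lab <- smooth_dat[smooth_dat[, .I[x == max(x)], by=group]$V1]

## Place the text just to the right of the end of each smoothed line
p + annotate("text", x = smooth_lab$x, y=smooth_lab$y, 
             label=smooth_lab$label,colour=smooth_lab$colour,
             hjust=-1)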

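If you would rather not depend on data.table, the same extraction can be sketched in base R. This is a variant under stated assumptions, not a tested replacement: it assumes the smoother is the first layer of the plot, and that ggplot2 numbers the groups in the level order of as.factor(z), which is how the label text is recovered here. The column name `text` is introduced only for this example.

```r
library(ggplot2)

## Same data as in the question
set.seed(10)
d <- data.frame(x = seq(1, 100, 1), y = rnorm(100, 3, 0.5))
d$z <- ifelse(d$y > 3, 1, 0)

p <- ggplot(d, aes(x = x, y = y, colour = as.factor(z))) +
  stat_smooth(method = "loess", formula = y ~ x, se = FALSE,
              span = 0.8, show.legend = FALSE) +
  geom_line(colour = "grey50") +
  scale_x_continuous(limits = c(0, 110))

## Layer 1 of the built plot holds the points along each smoothed curve
smooth_dat <- ggplot_build(p)$data[[1]]

## Keep the right-most point of each group (base R instead of data.table)
smooth_lab <- do.call(rbind, lapply(split(smooth_dat, smooth_dat$group),
                                    function(g) g[which.max(g$x), ]))

## Assumption: group ids follow the level order of as.factor(z)
smooth_lab$text <- levels(as.factor(d$z))[smooth_lab$group]

## Label the end of each smoothed line in the line's own colour
p + annotate("text", x = smooth_lab$x, y = smooth_lab$y,
             label = smooth_lab$text, colour = smooth_lab$colour,
             hjust = -1)
```

Reading the label from the group index avoids passing a label aesthetic to stat_smooth(), at the cost of relying on the group ordering noted above.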