geom_ribbon with confidence intervals


I would expect the following snippet to draw the 95% confidence intervals of the sepal length:

ggplot(iris,aes(x=Species,y=Sepal.Length)) +
  stat_summary(geom='ribbon',
               fun=mean_cl_normal, 
               fun.args=list(conf.int=0.95))

Which additional diagnostics could I run to elucidate why the plot stays empty?

Edit: I was using the 'ribbon' geometry, because it would be important for me to indicate the confidence intervals as a shaded area.
For a categorical x variable, the 'ribbon' geometry doesn't make too much sense, as suggested in the helpful answers.
Indeed, my variable on the x axis is actually continuous and I had been a bit unfortunate in choosing the iris dataset as a minimal reproducible example.
It would therefore make more sense to choose a minimal example like the following:

  ggplot(data.frame(x=rep(1:3,each=3),y=c(1:3,4:6,7:9))) +
    stat_summary(aes(x=x,y=y),
                 geom='ribbon',
                 fun=mean_cl_normal, 
                 fun.args=list(conf.int=0.95))

Answer by tjebo (best answer)

What you're trying to visualise doesn't really make sense. You have a categorical variable x for which you have measurements y with a different variance for each value of x. What exactly is a ribbon between those x values supposed to signify?

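Users Z.Lin and IRTFM have made a very valid point about using fun.data (+1), and this is the correct way to show your data: mean_cl_normal returns three values (y, ymin, ymax), so it has to be passed as fun.data rather than fun.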
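
To see why fun.data is the right argument, it can help to look at the shape of what such a summary function returns. The following is only a sketch: ci_normal is a hypothetical, hand-rolled stand-in that I am assuming mirrors mean_cl_normal (mean plus a t-based interval), written in base R so it runs without Hmisc.

```r
# Hypothetical helper (not part of ggplot2): assumed to mirror what
# mean_cl_normal() computes, i.e. the mean with a t-based confidence interval.
ci_normal <- function(x, conf.int = 0.95) {
  x <- x[!is.na(x)]
  n <- length(x)
  half_width <- qt((1 + conf.int) / 2, df = n - 1) * sd(x) / sqrt(n)
  data.frame(
    y    = mean(x),
    ymin = mean(x) - half_width,
    ymax = mean(x) + half_width
  )
}

# One row with three columns per species: this is the shape fun.data expects,
# whereas fun expects a function that returns a single number.
per_species <- do.call(
  rbind,
  lapply(split(iris$Sepal.Length, iris$Species), ci_normal)
)
print(per_species)
```

A function like this returns ymin and ymax alongside y, which is what a ribbon needs; a plain fun only supplies y.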

However, it is technically feasible to draw a ribbon, for which you then need to additionally specify group = 1, so that geom_ribbon draws between the categorical values. (Plot 1)

But I guess what you really want is to draw the mean as a line and the confidence intervals as a ribbon. For this, geom_ribbon alone will not be enough. You might use geom_smooth instead, which draws both a line and a ribbon and can therefore deal with the three values that the mean_cl_normal function produces. (Plot 2)

library(tidyverse) ## mean_cl_normal() also needs the Hmisc package installed
library(patchwork) ## loading just for demonstration

## Plot 1 - using geom_ribbon
p1 <- ggplot(iris, aes(x = Species, y = Sepal.Length)) +
  stat_summary(
    geom = "ribbon",
    fun.data = mean_cl_normal,
    fun.args = list(conf.int = 0.95), group = 1
  ) +
  ggtitle("Plot 1")

## Plot 2 - using geom_smooth
p2 <-
  ggplot(iris, aes(x = Species, y = Sepal.Length)) +
  stat_summary(
    geom = "smooth",
    fun.data = mean_cl_normal,
    fun.args = list(conf.int = 0.95),
    group = 1,
    alpha = .5,
    color = "black",
    se = TRUE
  ) +
  ggtitle("Plot 2")

p1 + p2

Created on 2023-04-09 with reprex v2.0.2
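
For the continuous x from the question's edit, the same idea should carry over, and group = 1 should not be needed because a continuous x does not split the data into groups. Below is a minimal sketch under that assumption. ci_normal is again a hypothetical stand-in for mean_cl_normal so that the example runs without Hmisc, and layer_data() is used as the diagnostic the question asks about.

```r
library(ggplot2)

# Hypothetical stand-in for mean_cl_normal(): mean with a t-based interval
ci_normal <- function(x, conf.int = 0.95) {
  n <- length(x)
  half_width <- qt((1 + conf.int) / 2, df = n - 1) * sd(x) / sqrt(n)
  data.frame(
    y    = mean(x),
    ymin = mean(x) - half_width,
    ymax = mean(x) + half_width
  )
}

df <- data.frame(x = rep(1:3, each = 3), y = c(1:3, 4:6, 7:9))

p <- ggplot(df, aes(x = x, y = y)) +
  # shaded confidence band: fun.data supplies ymin and ymax
  stat_summary(geom = "ribbon", fun.data = ci_normal,
               fun.args = list(conf.int = 0.95), alpha = 0.3) +
  # mean line on top: fun supplies a single y per x
  stat_summary(geom = "line", fun = mean)

# Diagnostic: inspect the data the ribbon layer actually receives
ribbon_data <- layer_data(p, 1)
print(ribbon_data[, c("x", "ymin", "ymax")])
```

Calling print(p) draws the plot. If a layer stays empty, checking layer_data() like this shows whether ymin and ymax ever reached the geom.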

Answer by Quinten

You could also use two stat_summary layers: the first draws the mean for each species as a point, and the second draws the mean_cl_normal confidence limits as error bars, like this:

library(ggplot2)
ggplot(iris,aes(x=Species,y=Sepal.Length)) +
  stat_summary(fun = mean, geom = "point") +
  stat_summary(fun.data = mean_cl_normal,
               geom = "errorbar")

Created on 2023-04-09 with reprex v2.0.2

Answer by IRTFM

This is probably what you should have wanted:

ggplot(iris, aes(x = Species, y = Sepal.Length)) +
  stat_summary(fun.data = mean_cl_normal)

It follows the pattern of the first example on the help page.