How to draw 2D quantile-based densities of a scatterplot for the outcome variable in R?

Question

How to draw 2D quantile-based densities of a scatterplot for the outcome variable in R?

84 Views Asked by Shawn Hemelstrand At 27 July 2025 at 10:21

I realize this question has been asked in similar ways multiple times here. I am not asking for a scatterplot which includes a density heatmap of the data, as this captures the density of both variables as a smooth function. What I am looking for is something like this, which takes a "slice" of the distribution of the outcome variable and overlays it on the scatterplot:

The best I could come up with something like this:

#### Load Library ####
library(tidyverse)

#### Get IQR ####
q <- quantile(iris$Sepal.Length, 
              probs = c(.25,.5,.75))
q

#### Label Quantile Regions ####
qiris <- iris %>% 
  mutate(qs = ifelse(Sepal.Length >= q[3],
                     "Q75",
                     ifelse(Sepal.Length >= q[2],
                            "Q50","Q25")))

#### Plot Density and Scatter ####
ggplot()+
  geom_point(aes(x=Sepal.Width,
                 y=Sepal.Length),
             data=iris)+
  geom_density(aes(y=Sepal.Length,
                   fill=qs),
               data=qiris)

But predictably this fails because it doesn't place the "slice" of the distribution in relation to the predictor values.

I then came up with a slightly better solution that locates the distribution of values correctly:

library(ggridges)

ggplot(qiris, 
       aes(x = Sepal.Length,
           y = qs)) + 
  stat_density_ridges(quantiles = c(0.25,0.5,0.75),
                      geom="density_ridges_gradient",
                      jittered_points = TRUE,
                      position = "raincloud", 
                      alpha = 0.6, 
                      scale = 0.6)+
  coord_flip()

Which gives me this:

However there are still three issues here. First, I can't fit a regression line through it. Second, I would prefer the data points to be next to each other like a normal scatter plot rather than separated spatially by quantile so they're too far away. Third, this isn't including the other variable at all, which is important.

Edit

The answer by Allen looks good at first, but I think there is something I'm not seeing in his code. To try to figure this out, I tried using another dataset and saving the inputs as objects in R to make it easier to swap everything. When I do this, I get flat lines across the plot.

#### Load Library ####
library(tidyverse)

#### Save Objects ####
dfy <- mtcars$mpg # y var
dfx <- mtcars$hp # x var
data <- mtcars # dataset

#### QDATA ####
qdata <- data %>% 
  mutate(cut_group = cut(dfy, 
                         quantile(dfy, c(0.125, 0.375, 0.625, 0.875)),
                         labels = c('Q25', 'Q50', 'Q75')),
         baseline = quantile(dfy, 
                             c(0.25, 0.5, 0.75))[as.numeric(cut_group)]) %>%
  filter(complete.cases(.)) %>%
  group_by(cut_group) %>%
  reframe(dfxx = density(dfx)$x,
          dfy = first(baseline) - density(dfx, bw = 0.5)$y/3) %>%
  rename(dfx = dfxx) 

ggplot(data,
       aes(dfy,
           dfx)) +
  geom_smooth(method = 'lm', 
              color = 'gray',
              se = FALSE) +
  geom_point(color = 'navy',
             shape = 21,
             fill = NA) +
  geom_path(data = qdata,
            aes(group = cut_group), 
            color = 'darkgreen',
            linewidth = 1.5) +
  theme_classic() +
  theme(panel.border = element_rect(fill = NA, 
                                    linewidth = 1))

Like so:

Original Q&A

There are 1 best solutions below

**Allan Cameron** · Answer 1

I would probably do this by precalculating density of the quantiles and drawing them on as geom_path:

quartiles <- quantile(iris$Sepal.Width)
midpoints <- quartiles[-5] + 0.5 * diff(quartiles)

qiris <- iris %>% 
  mutate(Q = cut(Sepal.Width, quartiles, labels = paste0('Q', 1:4)),
         baseline = midpoints[as.numeric(Q)]) %>%
  filter(complete.cases(.)) %>%
  group_by(Q) %>%
  reframe(SepalLength = density(Sepal.Length)$x,
          Sepal.Width = first(baseline) - density(Sepal.Length, bw = 0.5)$y/3) %>%
  rename(Sepal.Length = SepalLength) 

ggplot(iris, aes(Sepal.Width, Sepal.Length)) +
  annotate('rect', xmin = quartiles[-5], xmax = quartiles[-1], ymin = -Inf,
           ymax = Inf, fill = c('gray', NA, 'gray', NA), alpha = 0.2) +
  annotate('text', x = midpoints, y = 9, label = paste0('Q', 1:4)) +
  geom_smooth(method = 'lm', color = 'gray', se = FALSE) +
  geom_point(color = 'navy', shape = 21, fill = NA) +
  geom_path(data = qiris, aes(group = Q), color = 'darkgreen',
            linewidth = 1.5, alpha = 0.5) +
  theme_classic() +
  theme(panel.border = element_rect(fill = NA, linewidth = 1))

For the mtcars example, you will need to select a different bandwidth and multiplier for the density to get it into approximately the same scale as the existing variables:

quartiles <- quantile(mtcars$mpg)
midpoints <- quartiles[-5] + 0.5 * diff(quartiles)

qmtcars <- mtcars %>% 
  mutate(Q = cut(mpg, quartiles, labels = paste0('Q', 1:4)),
         baseline = midpoints[as.numeric(Q)]) %>%
  filter(complete.cases(.)) %>%
  group_by(Q) %>%
  reframe(HP = density(hp)$x,
          mpg = first(baseline) - density(hp, bw = 100)$y * 500) %>%
  rename(hp = HP) 

ggplot(mtcars, aes(mpg, hp)) +
  annotate('rect', xmin = quartiles[-5], xmax = quartiles[-1], ymin = -Inf,
           ymax = Inf, fill = c('gray', NA, 'gray', NA), alpha = 0.2) +
  annotate('text', x = midpoints, y = 450, label = paste0('Q', 1:4)) +
  geom_smooth(method = 'lm', color = 'gray', se = FALSE) +
  geom_point(color = 'navy', shape = 21, fill = NA) +
  geom_path(data = qmtcars, aes(group = Q), color = 'darkgreen',
            linewidth = 1.5, alpha = 0.5) +
  theme_classic() +
  theme(panel.border = element_rect(fill = NA, linewidth = 1))

How to draw 2D quantile-based densities of a scatterplot for the outcome variable in R?

Edit

There are 1 best solutions below

Related Questions in R

Related Questions in GGPLOT2

Related Questions in SCATTER-PLOT

Related Questions in DENSITY-PLOT

Related Questions in GGRIDGES

Trending Questions

Popular # Hahtags

Popular Questions