geom_smooth line different color based on Y intercept line

Question

geom_smooth line different color based on Y intercept line

81 Views Asked by Jishan At 06 March 2024 at 23:32

My data is like below. Two column, serial number (SL) and the expression value (log)

> df

SL  log
1   1.5
2   -2.5
3   1.0
4   2.5
5   -1.

> ggplot(df, aes(x = SL, y = log)) +
  geom_point(size = 0.5, alpha = 0.6, shape = 19, color = "gray") +
  geom_smooth(method = "loess", se = FALSE, linewidth = 0.5, span = 0.09) +
  geom_hline(yintercept = 0, color = "black", lwd = 0.5)

However, I want to fill the geome_smooth loess line to be filled with red color if it is above the Y=0 intercept line and green color if below Y=0 intercept line. Example figure below.

How can I do that?

Original Q&A

There are 3 best solutions below

**Jon Spring** · Answer 1 · 2024-03-07T00:24:14.520000

I don't think there's a super simple answer, since you first need to extract the loess line to use it with geom_area (which can't vary in fill) or geom_ribbon (which won't automatically separate regions of the same sign).

I like using ggbraid once we've extracted the loess line, as it will make a nice clean interpolation with separate fill regions.

Given some fake data:

df <- data.frame(x = 1:100, 
                 y = sin(seq(0,8, length.out = 100)) + sin(1:100))

We could make a plot and extract the loess curve using the approach here:

p <- ggplot(df, aes(x, y)) +
  geom_point() +
  geom_smooth()
df2 <- ggplot_build(p)[[1]][[2]][,c("x","y")]

Then we have a slightly tricky problem involving interpolation, as detailed here and here. One simple solution is to use the ggbraid package, which identifies the exact crossing points and cleanly separates the fill areas.

# using ggbraid from remotes::install_github("nsgrantham/ggbraid")
ggplot(df, aes(x, y)) +
  ggbraid::stat_braid(aes(x, ymin = 0, ymax = y, fill = y < 0),
                      data = df2) +
  geom_point() +
  geom_line(data = df2) +
  geom_hline(yintercept = 0) +
  scale_fill_manual(values = c("red", "green"))

**Jishan** · Answer 2 · 2024-03-07T19:35:37.320000

Thanks for all your answers. Following code worked perfectly:

smoothed_values <- predict(loess(log ~ SL, data = df, span = 0.09)) 
fill_color <- ifelse(smoothed_values >= 0, "red", "green")

plot <- ggplot(df, aes(x = SL, y = log)) +
  geom_point(size = 0.5, alpha = 0.6, shape = 19, color = "gray") +
  ggbraid::geom_braid(aes(SL, ymin = 0, ymax = smoothed_values, fill = fill_color), 
                      alpha = 0.5, lwd = 0.5, color = "gray3") +
  geom_hline(yintercept = 0, lwd = 0.25, color = "black") +
  labs(x = "Gene Order", y = "logFC") +
  scale_fill_identity(name = "Gene Expression", 
                      labels = c("Down", "Up"), guide = "legend", 
                      breaks = c("green", "red"), limits = c("green", "red")) +
  theme_minimal() +
  theme(panel.grid.major = element_blank(),
        panel.grid.minor = element_blank(),
        axis.line = element_line(colour = "black")) +
  scale_y_continuous(breaks = seq(-5, 5, by = 0.5))

**M--** · Answer 3 · 2024-03-08T01:54:10.247000

Here's an approach by interpolating the values between the rows where sign(y) changes, then creating two groups for stat_smooth data, one for positive and one for negative values, and finally recreating the plot with the new data.

First, we need to make a plot with the smoothed data.

library(ggplot2)
library(data.table)
# library(tidyr) ## we only need tidyr::fill

## example dataset
set.seed(123)
df.ex <- data.frame(s = 1:600, l = round(rnorm(600, sd = 2), 1))

## creating the first plot with stat_smooth; same fill color for + & -
ggplot(df.ex, aes(x = s, y = l)) +
  geom_point(size = 0.5, alpha = 0.6, shape = 19, color = "gray") +
  geom_hline(yintercept = 0, color = "black", lwd = 0.5) +
  stat_smooth(method = "loess", geom = "area", fill = "pink",
              se = FALSE, linewidth = 0.5, span = 0.09, 
              color = "blue",  alpha = 0.8) -> p; p

Then we extract the "smoothed data" from ggplot2 object, will find the x coordinates of where geom_smooth goes from positive y to negative or vice versa (i.e. y == 0), and will modify the data for the final plot.

## ggplot build to extract the smooth data
q <- ggplot_build(p)
#> `geom_smooth()` using formula = 'y ~ x'

smdat <- copy(q[[1]][[3]])
setDT(smdat)

## set the fill colors and group for negative and positive values
smdat[ , group := ifelse(y > 0 , 1, 2)][, fill := ifelse(y < 0 , "green", "red")]

## interpolation; taken from https://stackoverflow.com/a/27137211/6461462
smdat_grp <- smdat[ , {
  ix = .I[c(FALSE, abs(diff(sign(smdat$y))) == 2)]
  if(length(ix)){
    pred_x = sapply(ix, function(i) approx(x = y[c(i-1, i)], 
                                           y = x[c(i-1, i)], xout = 0)$y)
    rbindlist(.(.SD, data.table(x = pred_x, y = 0, group = 1, ymax = 0),
                     data.table(x = pred_x, y = 0, group = 2, ymax = 0)), 
              fill = TRUE)} else .SD}][order(x)]

## filling the NA values in the remaining columns 
lapply(split(smdat_grp, smdat_grp$group), \(dat) 
       tidyr::fill(dat, everything(), .direction = "downup")) |> 
    rbindlist() |> 
    as.data.frame() -> q[[1]][[3]]

## plot the modified ggplot
pq <- ggplot_gtable(q)

plot(pq)

As you can see, using ggbraid::geom_braid or ggbraid::stat_braid, although very close, gives us slightly different results from what we get from stat_smooth/geom_smooth.

smoothed_values <- predict(loess(l ~ s, data = df.ex, span = 0.09)) 
fill_color <- ifelse(smoothed_values >= 0, "red", "green")

ggplot(df.ex, aes(x = s, y = l)) +
  geom_point(size = 0.5, alpha = 0.6, shape = 19, color = "gray") +
  
  ggbraid::geom_braid(aes(s, ymin = 0, ymax = smoothed_values, fill = fill_color), 
                      alpha = 0.5, lwd = 0.5, color = "gray3") +
  geom_hline(yintercept = 0, lwd = 0.25, color = "black") +
  geom_smooth(method = "loess", se = FALSE, linewidth = 1, span = 0.09) +
  theme(legend.position = "none") +
  coord_cartesian(ylim = c(-1, 1)) ## zoom to see the difference
#> `geom_braid()` using method = 'line'
#> `geom_smooth()` using formula = 'y ~ x'

^{Created on 2024-03-07 with reprex v2.0.2}

geom_smooth line different color based on Y intercept line

There are 3 best solutions below

Related Questions in R

Related Questions in GGPLOT2

Related Questions in PLOT

Related Questions in LOESS

Trending Questions

Popular # Hahtags

Popular Questions