Picking joint bandwidth of NaN with ggridges


I am trying to generate a ridge plot like the one described here, but the error "Picking joint bandwidth of NaN" keeps showing up. What is wrong with my code? Thanks for any pointers or tips.

toplot = structure(list(Year = c("2000", "2000", "2001", "2001", "2002", 
"2002", "2003", "2003", "2004", "2004", "2005", "2005", "2006", 
"2006", "2007", "2007", "2008", "2008", "2009", "2009", "2010", 
"2010", "2011", "2011", "2012", "2012", "2013", "2013", "2014", 
"2014", "2015", "2015", "2016", "2016", "2017", "2017", "2018", 
"2018", "2019", "2019", "2020", "2020", "2021", "2021"), genes = c("DAO", 
"IDH2", "DAO", "IDH2", "DAO", "IDH2", "DAO", "IDH2", "DAO", "IDH2", 
"DAO", "IDH2", "DAO", "IDH2", "DAO", "IDH2", "DAO", "IDH2", "DAO", 
"IDH2", "DAO", "IDH2", "DAO", "IDH2", "DAO", "IDH2", "DAO", "IDH2", 
"DAO", "IDH2", "DAO", "IDH2", "DAO", "IDH2", "DAO", "IDH2", "DAO", 
"IDH2", "DAO", "IDH2", "DAO", "IDH2", "DAO", "IDH2"), n = c(2L, 
0L, 2L, 0L, 2L, 0L, 3L, 0L, 5L, 0L, 5L, 0L, 4L, 0L, 6L, 0L, 2L, 
0L, 4L, 0L, 13L, 0L, 7L, 0L, 7L, 0L, 169L, 1L, 182L, 0L, 215L, 
56L, 147L, 11L, 165L, 115L, 10L, 62L, 13L, 74L, 14L, 59L, 67L, 
44L)), row.names = c(NA, -44L), class = c("tbl_df", "tbl", "data.frame"
))


library(tidyverse)
library(ggridges)

toplot %>%
  mutate(YearFct = fct_rev(as.factor(Year))) %>%
  ggplot(aes(y = YearFct)) +
  geom_density_ridges(
    aes(x = n, fill = paste(YearFct, genes)), 
    alpha = .8
  ) +
  labs(
    x = "No_Patent",
    y = "Year"
  ) +
  scale_y_discrete(expand = c(0, 0)) +
  scale_x_continuous(expand = c(0, 0)) +
  coord_cartesian(clip = "off") +
  theme_ridges(grid = FALSE)
Best answer

Your code is fine. You just don't have enough data for this kind of plot. You only have a single measurement for each gene in each year. You are therefore trying to create a density estimate based on a single point each year, which doesn't work. You need at least two points each year to get an automatic bandwidth selection.
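To see where the NaN can come from, here is a minimal base R sketch. It assumes the joint bandwidth is the mean of per-group `bw.nrd0()` bandwidths, taken only over groups with at least two observations; that is a simplified assumption for illustration, not a copy of the ggridges source, and `joint_bandwidth()` is a hypothetical helper name.

```r
# Hypothetical helper (an assumption, not ggridges code): average the
# default bandwidth rule over all groups that have at least two points.
joint_bandwidth <- function(x, group) {
  xs <- split(x, group)
  xs <- xs[lengths(xs) > 1]                      # drop single-point groups
  bws <- vapply(xs, stats::bw.nrd0, numeric(1))  # one bandwidth per group
  mean(bws)                                      # mean of nothing is NaN
}

# One observation per year/gene combination, as in the question
single <- joint_bandwidth(
  c(2, 0, 2, 0),
  c("2000 DAO", "2000 IDH2", "2001 DAO", "2001 IDH2")
)
is.nan(single)

# Several observations per group give a finite bandwidth
many <- joint_bandwidth(c(2, 3, 5, 4, 6, 2), rep(c("a", "b"), each = 3))
is.finite(many)
```

With one row per group, no group survives the filter, so there is nothing to average.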

If you simulate plentiful data, you will see your code works well enough.

set.seed(1)

toplot <- data.frame(Year = rep(2000:2021, each = 20),
                     genes = rep(c("DAO", "IDH2"), 220),
                     n = round(rexp(440, rep(c(0.9, 0.05), 220))))

library(tidyverse)
library(ggridges)

toplot %>%
  mutate(YearFct = fct_rev(as.factor(Year))) %>%
  ggplot(aes(y = YearFct)) +
  geom_density_ridges(
    aes(x = n, fill = genes), 
    alpha = .8
  ) +
  labs(
    x = "No_Patent",
    y = "Year"
  ) +
  scale_y_discrete(expand = c(0, 0)) +
  scale_x_continuous(expand = c(0, 0)) +
  coord_cartesian(clip = "off") +
  theme_ridges(grid = FALSE)
#> Picking joint bandwidth of 3.71

With your original data, one possible solution is to create your own density curves using dnorm, with each n as the mean and the per-gene standard deviation of n as the sd. This gives you some idea of the uncertainty involved. It does at least produce a somewhat informative plot, though it is probably less honest than a simple dodged bar plot, which would be the obvious way to plot this data set.

toplot %>%
  group_by(genes) %>%
  mutate(sd = sd(n)) %>%
  group_by(Year, genes) %>%
  summarise(x = seq(0, 300, length = 1000),
            dens = dnorm(x, n, sd),
            dens = dens/max(dens)) %>%
  mutate(YearFct = fct_rev(as.factor(Year))) %>%
  ggplot(aes(y = YearFct, x = x)) +
  geom_ridgeline(
    aes(height = dens, fill = genes), 
    alpha = .8
  ) +
  labs(
    x = "No_Patent",
    y = "Year"
  ) +
  scale_y_discrete(expand = c(0, 0)) +
  scale_x_continuous(expand = c(0, 0)) +
  coord_cartesian(clip = "off") +
  theme_ridges(grid = FALSE)
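To make the construction concrete, this small sketch builds the curve for a single row in base R. The value of `sd_gene` is a hypothetical stand-in for the per-gene standard deviation, chosen only for illustration.

```r
# One illustrative row; sd_gene is a made-up value, not computed from the data
n_obs   <- 56
sd_gene <- 40

x    <- seq(0, 300, length.out = 1000)
dens <- dnorm(x, mean = n_obs, sd = sd_gene)
dens <- dens / max(dens)   # rescale so the tallest point has height 1

max(dens)            # 1 by construction
x[which.max(dens)]   # grid point closest to n_obs
```

The width of each curve therefore reflects how much that gene's counts vary across years, not any spread within a single year.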


I think this is harder to interpret, less honest, and arguably less attractive than a simple line plot.

toplot %>%
  ggplot(aes(factor(Year), n, color = genes, group = genes)) +
  geom_line(size = 1.5) +
  geom_point(size = 4, shape = 21, fill = "white") +
  scale_color_manual(values = c("deepskyblue4", "orange")) +
  labs(x = "Year") +
  theme_light(base_size = 16)
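The dodged bar plot mentioned earlier could be sketched in much the same way. This is only an illustration: `counts` is a small data frame holding the 2013 to 2015 rows from the question, and the styling choices are assumptions.

```r
library(ggplot2)

# Three years taken from the question's data (Year, genes, n)
counts <- data.frame(
  Year  = rep(c("2013", "2014", "2015"), each = 2),
  genes = rep(c("DAO", "IDH2"), times = 3),
  n     = c(169L, 1L, 182L, 0L, 215L, 56L)
)

# One bar per gene within each year, placed side by side
p <- ggplot(counts, aes(x = Year, y = n, fill = genes)) +
  geom_col(position = "dodge") +
  labs(x = "Year", y = "No_Patent") +
  theme_light(base_size = 16)
p
```

Swapping `counts` for the full `toplot` data frame should give the complete picture, since the column names are the same.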


Created on 2022-04-24 by the reprex package (v2.0.1)