I'm looking for a way to plot density ridges from pre-aggregated data. Is there a way to feed pre-aggregated counts to geom_density_ridges or something similar? In particular, I'd like a solution that doesn't require expanding the count totals back into one row per observation.
Example:
library(dplyr)
library(ggridges)
library(zoo)
library(magrittr)
library(ggplot2)
dat <- data.frame(
  qtr = rep(c(as.yearqtr(Sys.Date()),
              as.yearqtr(Sys.Date() - 94),
              as.yearqtr(Sys.Date() - 185),
              as.yearqtr(Sys.Date() - 280)), 1000) %>%
    factor(ordered = TRUE),
  age = rbeta(4000, 8, 8) %>%
    multiply_by(100) %>%
    ceiling
)
Plot with subject-level records:
dat %>%
  ggplot(aes(x = age, y = qtr, fill = qtr)) +
  geom_density_ridges(quantile_lines = TRUE,
                      quantile_fun = function(x, ...) quantile(x, .5),
                      alpha = .5)
Pre-summarised count data that I would like to use instead:
dat_summy <- dat %>%
  group_by(qtr, age) %>%
  summarise(count = n())
The underlying stats::density() function used by ggridges for density estimation accepts a weights parameter, so this is certainly possible. I took a look at the package's GitHub repository, where this has been an open request for a while now, so it seems unlikely to change soon. But if you are comfortable with hacking the underlying ggproto objects, it can be done for the use case you outlined here.
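To illustrate the weights point on its own, here is a minimal base R sketch (not the ggproto hack itself). The toy ages and counts are made up, and the bandwidth is fixed by hand so that the two calls are directly comparable.

```r
# Hypothetical toy data: distinct ages and how often each one occurs
ages   <- c(40, 45, 50, 55, 60)
counts <- c(2, 5, 10, 5, 2)

# Density estimated from the expanded vector (one element per subject)
d_full <- density(rep(ages, counts), bw = 3, from = 20, to = 80, n = 256)

# Density estimated from the aggregated data; weights are normalised to sum to 1
d_wt <- density(ages, weights = counts / sum(counts),
                bw = 3, from = 20, to = 80, n = 256)

# The two curves should agree up to floating-point error
all.equal(d_full$y, d_wt$y)
```

With a numeric bw, the weighted call never needs the expanded vector; the only remaining question is how the bandwidth gets chosen.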
p.s. Do note that the data still needs to be temporarily "blown up" during density estimation in order to calculate the bandwidth, standard deviation, etc., because the functions used for these do not accept a weights parameter. Expanding the dataset internally seemed more robust than relying on weighted alternatives from other, probably lesser-known, R packages.
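As a rough sketch of that idea outside of ggridges: expand the counts only long enough to pick a bandwidth, then run the weighted estimate on the aggregated rows. The helper name weighted_density is hypothetical (it is not part of ggridges or of the StatDensityRidges2 code), and bw.nrd0 is used here because it is the default bandwidth rule of stats::density().

```r
# Hypothetical helper (not part of ggridges): weighted KDE for a single group
weighted_density <- function(x, count, n = 512) {
  # bw.nrd0() has no weights argument, so expand the counts just for this step
  bw <- stats::bw.nrd0(rep(x, count))
  # The density itself is computed on the aggregated rows with normalised weights
  d <- stats::density(x, weights = count / sum(count), bw = bw, n = n)
  data.frame(x = d$x, density = d$y)
}

# Made-up toy data: distinct ages and their counts
ages   <- c(40, 45, 50, 55, 60)
counts <- c(2, 5, 10, 5, 2)

res <- weighted_density(ages, counts)
head(res)
```

Applied per quarter to something like dat_summy, the resulting x/density columns could then be drawn with geom_density_ridges(stat = "identity") and a height aesthetic, though quantile lines would have to be computed separately in that case.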
Plotting code
Comparison of results
Data
Code used to define StatDensityRidges2