I’m fitting a logistic binomial model in which the response is the number of times a target picture was looked at during a given time period, out of the number of times all pictures were looked at during that period (sum | trials(N) ~ x). According to the brms documentation, this kind of response specification falls under “addition-terms”. The model estimates are plausible and the fit is good, but the posterior predictive check doesn’t look as good as when I fit a regular logistic model to the same data in unaggregated form (y ~ x). Below is a dummy example of both models.
My questions are:
- The posterior predictive check of the addition-terms model isn’t actually bad, but is it expected that it would not be as clean as the regular model? And if so, why?
- Out of curiosity, is there any way to do a predictive check with the addition-terms model on the binomial scale rather than predicting "sum"?
# packages
library(brms)
library(dplyr)
# fake data
## long format
(dat_long <- data.frame(
  subj = rep(1:10, each = 100),
  item = rep(1:10, each = 10),    # recycled to 1000 rows
  bin = rep(1:10, times = 10),    # recycled to 1000 rows
  cond = c(-.5, .5),              # recycled, alternating by row
  pTarget = rbinom(1000, 1, .6)
))
## aggregated over bins/items
(dat_aggregated <- dat_long %>%
  dplyr::group_by(subj, cond) %>%
  dplyr::summarise(sum = sum(pTarget), N = dplyr::n(), .groups = "drop"))
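# priors
# (placeholder only: the priors object is not shown in this post, so this value is an assumption)
priors <- prior(normal(0, 1), class = "b")
# model using addition-terms
m_aggregated <- brm(formula = sum | trials(N) ~ cond,
                    family = binomial(),
                    iter = 5000,
                    prior = priors,
                    data = dat_aggregated)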
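# regular model
# (bernoulli() is the brms family for 0/1 outcomes; binomial() expects trials() on the left-hand side)
m_long <- brm(formula = pTarget ~ cond,
              iter = 1000,
              family = bernoulli(),
              prior = priors,
              data = dat_long)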
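As a sanity check that the two formulations describe the same model, here is a minimal sketch using maximum-likelihood glm() fits instead of brms. The names d_long, d_agg, fit_long and fit_agg and the seed are made up for this sketch, and using glm() as a stand-in for the Bayesian fits is my assumption. Because cond is constant within each aggregated row, I would expect the coefficients to match.

```r
set.seed(1)  # arbitrary seed, only so this sketch is reproducible

# Trial-level data with the same layout as dat_long above
d_long <- data.frame(
  subj    = rep(1:10, each = 100),
  cond    = rep(c(-.5, .5), times = 500),
  pTarget = rbinom(1000, 1, .6)
)

# Aggregate to one row per subject and condition (base R, no dplyr needed)
d_agg <- aggregate(pTarget ~ subj + cond, data = d_long, FUN = sum)
names(d_agg)[names(d_agg) == "pTarget"] <- "sum"
d_agg$N <- aggregate(pTarget ~ subj + cond, data = d_long, FUN = length)$pTarget

# Maximum-likelihood fits of both formulations
fit_long <- glm(pTarget ~ cond, family = binomial(), data = d_long)
fit_agg  <- glm(cbind(sum, N - sum) ~ cond, family = binomial(), data = d_agg)

# Compare the coefficients up to numerical tolerance
all.equal(coef(fit_long), coef(fit_agg), tolerance = 1e-5)
```

If the estimates agree like this, any difference between the two pp_check plots would have to come from what is being plotted (20 counts versus 1000 zeros and ones) rather than from the model itself, though I may be missing something.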
# posterior predictive checks
pp_check(m_aggregated)
pp_check(m_long)
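Regarding the second question, one idea I had (if "binomial scale" is taken to mean proportions) is to divide the predicted counts by the number of trials before plotting. Below is a rough sketch. counts_to_props is a hypothetical helper I made up, the fake_* objects are made-up stand-ins for real posterior draws, and the commented brms/bayesplot calls at the end are how I assume it would be wired up to the fitted model.

```r
# Divide each column of a draws-by-observations matrix of predicted counts
# by that observation's number of trials, giving predicted proportions.
counts_to_props <- function(yrep, trials) {
  stopifnot(is.matrix(yrep), ncol(yrep) == length(trials))
  sweep(yrep, 2, trials, "/")
}

# Stand-in for posterior predictive draws: 50 fake draws for 20 rows,
# each with 50 trials (made-up values, only to show the shapes involved).
set.seed(1)
trials    <- rep(50, 20)
fake_yrep <- matrix(rbinom(50 * 20, size = 50, prob = .6), nrow = 50)
fake_prop <- counts_to_props(fake_yrep, trials)
range(fake_prop)  # proportions, so every value lies between 0 and 1

# With the fitted model (needs brms and bayesplot), the same idea would be:
# yrep <- brms::posterior_predict(m_aggregated, ndraws = 50)
# bayesplot::ppc_dens_overlay(
#   y    = dat_aggregated$sum / dat_aggregated$N,
#   yrep = counts_to_props(yrep, dat_aggregated$N)
# )
```

This only rescales the x-axis per observation, so I would not expect it to make the check look like the one from the unaggregated model.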
The pp_check plots: