How to simulate quantities of interest using arm or rstanarm packages in R?

1.2k Views Asked by At

I would like to know how to simulate quantities of interest out of a regression model estimated using either the arm or the rstanarm packages in R. I am a newbie in Bayesian methods and R and have been using the Zelig package for some time. I asked a similar question before, but I would like to know if it is possible to simulate those quantities using the posterior distribution estimated by those packages.

In Zelig you can set the values you want for the independent values and it calculates the results for the outcome variable (expected value, probability, etc). An example:

# Creating a dataset:
set.seed(10)
x <- rnorm(100,20,10)
z <- rnorm(100,10,5)
e <- rnorm(100,0,1)
y <- 2*x+3*z+e
df <- data.frame(x,z,e,y)

# Loading Zelig
require(Zelig)

# Model
m1.zelig <- zelig(y ~ x + z, model="ls", data=df)
summary(m1.zelig)

# Simulating z = 10
s1 <- setx(m1.zelig, z = 10)
simulation <- sim(m1.zelig, x = s1)
summary(simulation)

So Zelig keeps x at its mean (20.56), and simulates the quantity of interest with z = 10. In this case, y is approximately 71.

The same model using arm:

# Model
require(arm)
m1.arm <- bayesglm(y ~ x + z, data=df)
summary(m1.arm)

And using rstanarm:

# Model
require(rstanarm)
m1.stan <- stanlm(y ~ x + z, data=df)
print(m1.stan)

Is there any way to simulate z = 10 and x equals to its mean with the posterior distribution estimated by those two packages and get the expected value of y? Thank you very much!

1

There are 1 best solutions below

2
On BEST ANSWER

In the case of bayesglm, you could do

sims <- arm::sim(m1.arm, n = 1000)
y_sim <- rnorm(n = 1000, mean = sims@coef %*% t(as.matrix(s1)), sd = sims@sigma)
mean(y_sim)

For the (unreleased) rstanarm, it would be similar

sims <- as.matrix(m1.stan)
y_sim <- rnorm(n = nrow(sims), mean = sims[,1:(ncol(sims)-1)] %*% t(as.matrix(s1)), 
               sd = sims[,ncol(sims)])
mean(y_sim)

In general for Stan, you could pass s1 as a row_vector and utilize it in a generated quantities block of a .stan file like

generated quantities {
  real y_sim;
  y_sim <- normal_rng(s1 * beta, sigma);
}

in which case the posterior distribution of y_sim would appear when you print the posterior summary.