I'm trying to do a bayesian gamma regression with stan. I know the correct link function is the inverse canonical link, but if i dont use a log link parameters can be negative, and enter in a gamma distribution with a negative value, that obviously can't be possible. how can i deal with it?
parameters {
vector[K] betas; //the regression parameters
real beta0;
real<lower=0.001, upper=100 > shape; //the variance parameter
}
transformed parameters {
vector<lower=0>[N] eta; //the expected values (linear predictor)
vector[N] alpha; //shape parameter for the gamma distribution
vector[N] beta; //rate parameter for the gamma distribution
eta <- beta0 + X*betas; //using the log link
}
model {
beta0 ~ normal( 0 , 2^2 );
for(i in 2:K)
betas[i] ~ normal( 0 , 2^2 );
y ~ gamma(shape,shape * eta);
}
I was struggling with this a couple weeks ago, and while I don't consider this a definitive answer, hopefully it is still helpful. For what it's worth, McCullagh and Nelder directly acknowledge this inappropriate support of the canonical link function. They advise that one must constrain the
beta
s to properly match the support. Here's the relevant passageIt depends on your
X
values, but as far as I can tell (please correct me someone!) in an MCMC-based Bayesian case, you can achieve this by either using a truncated prior on thebeta
s or a strong enough prior on your intercept to make the inappropriate regions numerically impossible to reach.In my case, I ultimately used an identity link with a strong positive prior intercept and that was sufficient and yielded reasonable results.
Also, the choice of link really depends on your
X
. As the passage above implies, the use of the canonical link assumes that your linear model is in rate space. Using log or identity link functions appear to be also very common, and ultimately it's about providing a space that offers a sufficient span for the linear function to capture the response.