Bayesian Gamma regression, what is the correct link function?

1.1k Views Asked by At

I'm trying to do a bayesian gamma regression with stan. I know the correct link function is the inverse canonical link, but if i dont use a log link parameters can be negative, and enter in a gamma distribution with a negative value, that obviously can't be possible. how can i deal with it?

parameters {
vector[K] betas; //the regression parameters
real beta0;
real<lower=0.001, upper=100 > shape; //the variance parameter
}
transformed parameters {
vector<lower=0>[N] eta; //the expected values (linear predictor)
vector[N] alpha; //shape parameter for the gamma distribution
vector[N] beta; //rate parameter for the gamma distribution

eta <- beta0 + X*betas; //using the log link 
}
model {  
beta0 ~ normal( 0 , 2^2 ); 

for(i in 2:K)
betas[i] ~ normal( 0 , 2^2 ); 

y ~ gamma(shape,shape * eta);
}
1

There are 1 best solutions below

5
On

I was struggling with this a couple weeks ago, and while I don't consider this a definitive answer, hopefully it is still helpful. For what it's worth, McCullagh and Nelder directly acknowledge this inappropriate support of the canonical link function. They advise that one must constrain the betas to properly match the support. Here's the relevant passage

The canonical link function yields sufficient statistics which are linear functions of the data and it is given by η = 1/μ. Unlike the canonical links for the Poisson and binomial distributions, the reciprocal transformation, which is often interpretable as the rate of a process, does not map the range of μ onto the whole real line. Thus the requirement that η > 0 implies restrictions on the βs in any linear model. Suitable precautions must be taken in computing β_hat so that negative values of μ_hat are avoided.

-- McCullagh and Nelder (1989). Generalized Linear Models. p. 291

It depends on your X values, but as far as I can tell (please correct me someone!) in an MCMC-based Bayesian case, you can achieve this by either using a truncated prior on the betas or a strong enough prior on your intercept to make the inappropriate regions numerically impossible to reach.

In my case, I ultimately used an identity link with a strong positive prior intercept and that was sufficient and yielded reasonable results.

Also, the choice of link really depends on your X. As the passage above implies, the use of the canonical link assumes that your linear model is in rate space. Using log or identity link functions appear to be also very common, and ultimately it's about providing a space that offers a sufficient span for the linear function to capture the response.