Two Ways of Computing MSE for Gamma Regression Model in R Do Not Agree

51 Views Asked by At

I've seen a couple of questions asking about computing MSE in R here and here, but they do not answer my question. I'm trying a glm with the Gamma distribution (so, Gamma regression), and trying to compute the MSE for the model two different ways. Here's a MWE:

set.seed(39756934)                           
x = rnorm(100)
y = rnorm(100) + 0.5 * x + 3  # Need the +3 for strictly positive y values.
my_data = data.frame(x, y)
my_mod = glm(y~x, data=my_data, family=Gamma())  # Default link is "inverse"
mean(my_mod$residuals^2)
    0.01388825
mean((my_data$y - predict(my_mod))^2)
    6.954654

Using the log link gives different numbers, but the two methods still do not agree.

As far as I know, these two methods should give the same results, but they're obviously not. What's going on? Is there a bug somewhere? Or is one of these methods of computing the MSE simply wrong for Gamma regression?

I don't think this question belongs on CrossValidated.SE, because it's more of a code/programming thing - unless the answer is that one of these methods of computing the MSE is invalid for a Gamma regression. But I don't know that in advance. This question would certainly be closed on CrossValidated.SE.

1

There are 1 best solutions below

3
G. Grothendieck On BEST ANSWER

Specify type = "response" because one could either consider the residuals and predictions on the linear predictor or the response scale.

mean(resid(my_mod, type = "response")^2)
## [1] 0.7419272

mean((my_data$y - predict(my_mod, type = "response"))^2)
## [1] 0.7419272

There are several different possibilities for type= as shown here but only "response" is available in both

p.type <- c("link", "response", "terms")
r.type <- c("deviance", "pearson", "working", "response", "partial")
 
sapply(r.type, \(x) mean(resid(my_mod, type = x)^2))
##   deviance    pearson    working   response    partial 
## 0.10746632 0.09651097 0.01388825 0.74192724 0.01685553 


sapply(p.type, \(x) mean((my_data$y - predict(my_mod, type = x))^2))
##      link  response     terms 
## 6.9546540 0.7419272 8.8680964