I would like to manually calculate the Variance Inflation Factor (VIF) for the mtcars dataset. When I use the vif function from the car package, I get the following result:
install.packages('car')
library(car)
fit <- lm(mtcars[,1] ~ ., mtcars[,-1])
vif(fit)
cyl disp hp drat wt qsec
15.373833 21.620241 9.832037 3.374620 15.164887 7.527958
vs am gear carb
4.965873 4.648487 5.357452 7.908747
Then I do it one by one for each column:
For the first one, cyl, it's perfectly OK:
fit <- lm(mtcars[,2] ~ ., mtcars[,- c(1:2)])
summary(fit)$r.squared
1/(1-summary(fit)$r.squared)
[1] 15.37383
While for disp there's a difference already:
fit2 <- lm(mtcars[,3] ~ ., mtcars[,- c(1:3)])
summary(fit2)$r.squared
1/(1-summary(fit2)$r.squared)
[1] 20.08864 # but it must be 21.620241
What's wrong?
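The correct formula for the VIF of disp is:
fit2 <- lm(mtcars[,3] ~ ., mtcars[,- c(1,3)])
1/(1-summary(fit2)$r.squared)
Note that 1:3 means c(1,2,3), so mtcars[,- c(1:3)] also drops cyl (column 2) from the predictors. We don't want to exclude the second column from the design matrix; only the response mpg (column 1) and disp itself (column 3) should be dropped. Therefore, we should specify c(1,3) instead.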
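As a rough sketch of the same idea applied to every predictor at once, the code below regresses each column on the remaining ones by name rather than by position, which avoids the 1:3 slip. The helper name manual_vif and the variables X and vifs are made up for illustration. The cross-check against the diagonal of the inverse correlation matrix is an addition that is not in the original post. It uses only base R, so the car package is not required.

```r
# Predictors only: drop the response mpg (column 1 of mtcars).
X <- mtcars[, -1]

# Hypothetical helper: VIF of the predictor named `j`, obtained by
# regressing it on all the other predictors in `preds`.
manual_vif <- function(j, preds) {
  others <- preds[, setdiff(names(preds), j), drop = FALSE]
  r2 <- summary(lm(preds[[j]] ~ ., data = others))$r.squared
  1 / (1 - r2)
}

# One VIF per predictor, named by column.
vifs <- sapply(names(X), manual_vif, preds = X)
print(round(vifs, 6))

# Cross-check: for numeric predictors, the VIFs equal the diagonal
# of the inverse of the predictors' correlation matrix.
print(all.equal(unname(vifs), unname(diag(solve(cor(X)))), tolerance = 1e-6))
```

Selecting columns by name with setdiff keeps the "drop only this predictor" rule explicit, so the same function works for any column without having to count positions.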