How to drop specific levels of a factor variable from a regression in R

I'm running this regression:

model1 <- lm(DV ~ IV1 + IV2 + IV3 + SubjectID, data = df)

I'm checking for multicollinearity between the variables. SubjectID identifies each subject; there are about 300 subjects with 8 observations each. The model above fits without errors, but when I run car::vif on it I get an error indicating that there is multicollinearity in the model. Looking at the regression results, the model reports three of the SubjectID levels as multicollinear. This really surprised me: my understanding is that only one of the SubjectID levels should be linearly dependent on the rest.

Regardless, assume I know that SubjectID2, SubjectID3, and SubjectID4 are multicollinear. How do I drop them from the model? My understanding is that if I simply subset the data to exclude them, R will report other levels as linearly dependent instead, so I can't simply drop them.
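To make the situation concrete, here is a minimal sketch on toy data, not the actual dataset. It is all assumption: 6 hypothetical subjects instead of about 300, a second predictor named IV2, and two predictors that are deliberately constant within subject. Predictors like that are one possible reason for more than one SubjectID level being reported as linearly dependent, though nothing in the question confirms it. The sketch finds the coefficients lm() could not estimate, then removes exactly those columns from the design matrix built by model.matrix() instead of subsetting rows.

```r
# Toy data (hypothetical): 6 subjects with 8 observations each.
set.seed(1)
df <- data.frame(SubjectID = factor(rep(paste0("S", 1:6), each = 8)))

# Assumption: IV1 and IV2 are constant within subject, which makes them
# linear combinations of the intercept and the SubjectID dummies.
df$IV1 <- rep(rnorm(6), each = 8)
df$IV2 <- rep(rnorm(6), each = 8)
df$IV3 <- rnorm(nrow(df))                      # varies within subject
df$DV  <- 1 + 0.5 * df$IV3 + rnorm(nrow(df))

model1 <- lm(DV ~ IV1 + IV2 + IV3 + SubjectID, data = df)

# lm() reports coefficients it could not estimate as NA.
aliased <- names(coef(model1))[is.na(coef(model1))]
print(aliased)

# Build the full design matrix, then drop only the chosen dummy columns.
# Any other vector of column names could be used in place of `aliased`.
X <- model.matrix(~ IV1 + IV2 + IV3 + SubjectID, data = df)
X_reduced <- X[, !(colnames(X) %in% aliased), drop = FALSE]

# X_reduced already contains the intercept column, so suppress lm()'s own.
model2 <- lm(df$DV ~ X_reduced - 1)
print(anyNA(coef(model2)))
```

Because columns are removed rather than rows, every observation stays in the fit. Subjects whose dummy columns are dropped are no longer separated from the reference level by the SubjectID terms. In this toy case the dropped columns were redundant, so the fitted values of model2 match those of model1. Dropping columns that are not redundant would change the fit.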
