Linear regression: would these variables be considered multicollinear?

My linear model would be Score ~ Age + Collection1 + Collection3

I transform the Collection column into dummy variables and leave out the Collection5 dummy to avoid the dummy variable trap.

For Collections 1, 3, and 5, I am sampling the same people, and each collection takes place at the same time for everyone (Collection 1 = year 2000, Collection 3 = year 2002, Collection 5 = year 2004), hence the +2 in age between consecutive collections for the same contacts.

Would the variables Age, Collection1, and Collection3 be multicollinear? On one hand, I feel like an increase in age is correlated with a later collection; on the other hand, since Collection is transformed into multiple 0/1 dummy variables, it seems like it shouldn't matter.

Is there a logical explanation as to why these variables should or shouldn't be multicollinear, or whether this setup breaks other regression assumptions?
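One way to probe the question numerically is to compute the variance inflation factor (VIF) of Age given the collection dummies. The following is a minimal sketch using made-up baseline ages, not the data from this question; the helper names `vif_of_age_given_wave` and `panel` are hypothetical. It relies on the fact that regressing Age on an intercept plus Collection1 and Collection3 gives the per-collection mean age as the fitted value, so R^2 is the between-collection share of the variance in Age.

```python
import math

def vif_of_age_given_wave(ages_by_wave):
    """VIF of Age when the other regressors are an intercept plus wave dummies.

    The fitted values of Age ~ dummies are the wave means, so
    R^2 = between-wave sum of squares / total sum of squares.
    """
    all_ages = [a for ages in ages_by_wave.values() for a in ages]
    grand_mean = sum(all_ages) / len(all_ages)
    total_ss = sum((a - grand_mean) ** 2 for a in all_ages)
    between_ss = sum(
        len(ages) * (sum(ages) / len(ages) - grand_mean) ** 2
        for ages in ages_by_wave.values()
    )
    r2 = between_ss / total_ss
    # R^2 of 1 means Age is an exact linear combination of the dummies.
    return math.inf if 1 - r2 < 1e-12 else 1 / (1 - r2)

def panel(baseline_ages):
    # Assumed design: the same people observed in 2000, 2002 and 2004,
    # so each person's age rises by 2 between consecutive collections.
    offsets = (("Collection1", 0), ("Collection3", 2), ("Collection5", 4))
    return {wave: [age + off for age in baseline_ages] for wave, off in offsets}

# People of different ages: the dummies explain very little of Age.
mixed = vif_of_age_given_wave(panel([20, 30, 40, 50, 60]))
# Everyone the same age: Age is fully determined by the collection.
same = vif_of_age_given_wave(panel([40, 40, 40, 40, 40]))

print(round(mixed, 3))  # 1.013
print(same)             # inf
```

Under these assumptions, Age becomes perfectly collinear with the dummies only when everyone starts at the same age. When baseline ages differ by much more than the 2-year steps, the VIF stays close to 1. This sketch addresses collinearity only; repeated measurements on the same people may still raise a separate concern about independent errors.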

Example Data
