I have a small data set (a larger sample was not feasible because the population is very specific) with one dependent variable (Y) and two independent variables (x1 and x2). The two independent variables represent two tasks. Each participant performed both tasks, and the tasks are identical except for one parameter we manipulated (cognitive load: high versus low).
Scores on the two tasks differ significantly (t-test) and are correlated (r = .68).
I ran a linear regression analysis (despite the small sample size) and the overall model is significant (p = .013; R2 = .45). The individual predictors are not significant (x1: p = .41; x2: p = .35). I thought this was a multicollinearity problem and checked the VIFs, which were low for both predictors (x1 = 1.97 and x2 = 1.88).
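As a quick sanity check on those VIFs: in a model with exactly two predictors, the VIF of either predictor depends only on their correlation, VIF = 1 / (1 - r^2). The sketch below plugs in the reported r = .68. The function name is made up for illustration, and it assumes the model contains only x1 and x2.

```python
import math


def vif_two_predictors(r: float) -> float:
    """VIF of either predictor when the model has exactly two predictors.

    Regressing x1 on x2 alone gives R^2 = r^2, so VIF = 1 / (1 - r^2).
    """
    return 1.0 / (1.0 - r ** 2)


r = 0.68  # correlation between x1 and x2 reported in the question
vif = vif_two_predictors(r)
se_inflation = math.sqrt(vif)  # factor by which each standard error grows

print(f"VIF = {vif:.2f}")                    # VIF = 1.86
print(f"SE inflation = {se_inflation:.2f}")  # SE inflation = 1.36
```

This is close to the reported values. A VIF below 2 is usually called low, but it still widens each coefficient's standard error by roughly a third, which matters a lot in a small sample.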
I then ran a Ridge regression (glmnet package). The coefficients were x1 = .246 and x2 = .249 (see Figure 1).
I also ran a Lasso regression. The coefficients were x1 = .296 and x2 = .306 (see Figure 2).
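One way to see why Ridge coefficients for two correlated predictors can come out nearly equal is the closed-form solution for two standardized predictors, b = (R + lambda*I)^(-1) c, where R is the predictor correlation matrix and c holds each predictor's correlation with Y. The sketch below is only an illustration under stated assumptions: the correlations with Y (c1, c2) are hypothetical, the function name is made up, and lambda here is on the correlation scale, which is not the same scaling glmnet uses.

```python
def ridge_two_standardized(r, c1, c2, lam):
    """Ridge coefficients for two standardized predictors.

    Solves (R + lam * I) b = c with R = [[1, r], [r, 1]] and c = (c1, c2),
    using the explicit inverse of a 2x2 matrix.
    """
    a = 1.0 + lam
    det = a * a - r * r
    b1 = (a * c1 - r * c2) / det
    b2 = (a * c2 - r * c1) / det
    return b1, b2


r = 0.68             # correlation between x1 and x2 from the question
c1, c2 = 0.55, 0.65  # HYPOTHETICAL correlations of x1 and x2 with Y

ratios = []
for lam in (0.0, 0.5, 1.0, 2.0):  # lam = 0 is ordinary least squares
    b1, b2 = ridge_two_standardized(r, c1, c2, lam)
    ratios.append(b2 / b1)
    print(f"lambda={lam:.1f}  b1={b1:.3f}  b2={b2:.3f}  b2/b1={b2 / b1:.2f}")
```

In this toy setup the ratio b2/b1 moves toward 1 as lambda grows: the penalty shrinks the difference between the coefficients faster than their sum, so Ridge tends to split the shared effect between correlated predictors rather than pick one.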
I also ran a hierarchical linear regression analysis, entering x1 in the null model to see whether x2 could explain additional variance. The null model was significant, but neither predictor was significant in the second model.
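The hierarchical step is usually judged with an F test on the change in R2, F = ((R2_full - R2_reduced) / q) / ((1 - R2_full) / (n - p - 1)), where q is the number of added predictors and p the number of predictors in the full model. A minimal sketch follows. Only R2_full = .45 comes from the analysis above; the sample size n and the R2 of the x1-only model are hypothetical placeholders, and the function name is made up.

```python
def f_change(r2_reduced, r2_full, n, p_full, q):
    """F statistic for the R^2 increase when q predictors are added.

    p_full is the number of predictors in the full model; the statistic
    has (q, n - p_full - 1) degrees of freedom.
    """
    df_resid = n - p_full - 1
    return ((r2_full - r2_reduced) / q) / ((1.0 - r2_full) / df_resid)


r2_full = 0.45     # reported R2 of the model with x1 and x2
r2_reduced = 0.42  # HYPOTHETICAL R2 of the x1-only model
n = 20             # HYPOTHETICAL sample size

f_stat = f_change(r2_reduced, r2_full, n, p_full=2, q=1)
print(f"F(1, {n - 2 - 1}) = {f_stat:.2f}")
```

When a single predictor is added, this F equals the square of that predictor's t statistic in the full model, so the test on the R2 change and the coefficient test for x2 are the same test.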
My problem is that I am afraid of misinterpreting these data. Based on the results, I would assume that, even though the VIFs are low, the two tasks are very similar (x1 and x2 have a similar effect on Y, so the high-load task can be dropped from the model). What I do not understand is why Ridge regression, which can handle multicollinearity, does not distinguish between the two tasks (x1 and x2), and why Lasso regression keeps both tasks in the final model. In short, I do not understand whether (i) I can or should exclude one task from the model (because the tasks seem to measure "the same" thing; in that case I would exclude x2), or (ii) both tasks explain something different (because the coefficients are similar in the Ridge regression and both tasks are retained in the Lasso regression at the minimum lambda), so that both predictors need to stay in the model. How should I make sense of all these results?
Thank you very much!