I've been using the glmnet R package to build a LASSO regression model for one numeric target variable Y and 762 covariates. I fit the model with glmnet() and then call coef(fit, s = 0.056360) to get the coefficient values for that specific value of lambda.
What I now need is the variable selection order, i.e. which of the selected covariates is selected first (enters the model first), second, third and so on.
With plot(fit, label = TRUE) I can in principle read the order off the plotted paths, but there are too many covariates for the labels to be legible.
You can see from the image that the first covariate is 267 (green path), then comes 12, but the rest is illegible.
Your coefficients are stored under:
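For a glmnet fit, the coefficient paths (without the intercept) sit in fit$beta, a sparse matrix with one row per covariate and one column per lambda value. A minimal sketch, assuming simulated data in place of your 762 covariates (x, y, the seed and the dimensions are made up for illustration):

```r
library(glmnet)

# Simulated stand-in data (assumption; not the original data set)
set.seed(1)
x <- matrix(rnorm(100 * 20), nrow = 100, ncol = 20)
colnames(x) <- paste0("V", 1:20)
y <- 3 * x[, 1] - 2 * x[, 2] + rnorm(100)

fit <- glmnet(x, y)  # alpha = 1 by default, i.e. LASSO

# Rows are covariates, columns are lambda values named s0, s1, ...
dim(fit$beta)
head(colnames(fit$beta))
```

The column names start at s0, so column j of fit$beta corresponds to fit$lambda[j] and is labelled s(j - 1).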
You can get a similar plot like this:
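One way to redraw the paths by hand is base R's matplot() on the transposed coefficient matrix. This is only a sketch on simulated data (x, y and the seed are assumptions), not a reproduction of the figure in the question:

```r
library(glmnet)

# Simulated stand-in data (assumption; not the original data set)
set.seed(1)
x <- matrix(rnorm(100 * 20), nrow = 100, ncol = 20)
y <- 3 * x[, 1] - 2 * x[, 2] + rnorm(100)
fit <- glmnet(x, y)

# Dense copy: one row per covariate, one column per lambda
beta <- as.matrix(fit$beta)

# matplot() draws one line per column, so transpose to get one line per covariate
matplot(log(fit$lambda), t(beta), type = "l", lty = 1,
        xlab = "log(lambda)", ylab = "Coefficient")
```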
So let's say I use the lambda value s31:
We pull out the matrix up to that lambda value:
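A sketch of that subsetting step on simulated data (x, y, the seed and the helper name k are assumptions). Because glmnet labels its columns starting at s0, the column named s31 is the 32nd one; the min() guard is only there in case the path stops early and has fewer columns:

```r
library(glmnet)

# Simulated stand-in data (assumption; not the original data set)
set.seed(1)
x <- matrix(rnorm(100 * 20), nrow = 100, ncol = 20)
y <- 3 * x[, 1] - 2 * x[, 2] + rnorm(100)
fit <- glmnet(x, y)

# s31 is column 32 (names start at s0); fall back to the last column if the path is shorter
k <- min(32L, ncol(fit$beta))

# Keep the path from the largest lambda down to the chosen one, as a dense matrix
mat <- as.matrix(fit$beta[, seq_len(k), drop = FALSE])
dim(mat)
```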
And write a function to return the index of the first non-zero coefficient, or return the last if all are zeros:
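Such a helper might look like the following; the name first_nonzero is made up here, and it only needs base R:

```r
# Index of the first non-zero entry of x, or length(x) if every entry is zero
first_nonzero <- function(x) {
  nz <- which(x != 0)
  if (length(nz) > 0) nz[1] else length(x)
}

first_nonzero(c(0, 0, 0.5, 0.7))  # 3
first_nonzero(c(0, 0, 0, 0))      # 4
```

Note that a covariate that is still zero everywhere gets the same index as one that enters exactly at the last column, so it is worth checking the last column separately if you need to tell those apart.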
Apply this to every row and we get the index where they first enter:
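Putting the steps together, an end-to-end sketch on simulated data could look like this (x, y, the seed and the names k, first_nonzero, entry and selected are all assumptions for illustration). Sorting the entry indices of the covariates that are non-zero at the chosen lambda gives the selection order:

```r
library(glmnet)

# Simulated stand-in data (assumption; not the original data set)
set.seed(1)
x <- matrix(rnorm(100 * 20), nrow = 100, ncol = 20)
colnames(x) <- paste0("V", 1:20)
y <- 3 * x[, 1] - 2 * x[, 2] + rnorm(100)
fit <- glmnet(x, y)

# Path up to the chosen lambda (s31 is column 32; guard against a shorter path)
k <- min(32L, ncol(fit$beta))
mat <- as.matrix(fit$beta[, seq_len(k), drop = FALSE])

# Index of the first non-zero entry, or the last index if all are zero
first_nonzero <- function(x) {
  nz <- which(x != 0)
  if (length(nz) > 0) nz[1] else length(x)
}

# One entry index per covariate (rows of mat)
entry <- apply(mat, 1, first_nonzero)

# Keep only covariates that are actually in the model at the chosen lambda,
# then order them by the column at which they first became non-zero
selected <- mat[, k] != 0
sort(entry[selected])
```

Covariates with the same entry index became non-zero at the same lambda on the grid, so their relative order cannot be resolved without a finer lambda sequence.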