Collapsing categorical variables with many levels with respect to a target


I am trying to collapse/fuse some levels of categorical variables (not merely select a subset of them), based on their effect on a continuous target variable. I came across this great post on CrossValidated, where the accepted answer describes an interesting technique based on Lasso penalties that I would like to apply. The methodology is explained in detail in the paper Sparse modeling of categorial explanatory variables (Gertheiss and Tutz, 2010).

However, I have not found any code or libraries that would let me apply this methodology in Python (or, failing that, in R). Could someone share hints or sources that I could use to replicate this technique?

I am also open to applying other techniques to address this problem. Thanks.

I have tried to implement the methodology from the paper from scratch, but it has proved too hard for me.
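To make the goal concrete, here is a minimal sketch of the objective as I understand it for a single nominal predictor: a least-squares loss plus an L1 penalty on all pairwise differences between level coefficients, so that levels whose coefficients are driven together can be merged. This only evaluates the objective and reads off the groups; it is not a solver. The function names (`fused_objective`, `collapsed_groups`), the unit penalty weights, and the choice not to pin a reference level to zero are my own simplifying assumptions, not taken from the paper.

```python
from itertools import combinations


def fused_objective(levels, y, intercept, beta, lam):
    """Least-squares loss plus lam * sum of |beta_i - beta_j| over all level pairs.

    levels : integer level index (0..K-1) for each observation
    y      : continuous target values
    beta   : one coefficient per level (hypothetical parameterisation)
    lam    : penalty strength; unit weights on every pair are an assumption
    """
    loss = 0.5 * sum((yi - (intercept + beta[k])) ** 2 for k, yi in zip(levels, y))
    penalty = sum(abs(beta[i] - beta[j]) for i, j in combinations(range(len(beta)), 2))
    return loss + lam * penalty


def collapsed_groups(beta, tol=1e-8):
    """Group levels whose coefficients coincide (within tol), i.e. the fused levels."""
    order = sorted(range(len(beta)), key=lambda k: beta[k])
    groups = [[order[0]]]
    for k in order[1:]:
        # Compare with the last level added to the current group.
        if abs(beta[k] - beta[groups[-1][-1]]) <= tol:
            groups[-1].append(k)
        else:
            groups.append([k])
    return [sorted(g) for g in groups]


if __name__ == "__main__":
    levels = [0, 1, 2, 2]
    y = [1.0, 1.0, 3.0, 3.0]
    beta = [1.0, 1.0, 3.0]
    print(fused_objective(levels, y, intercept=0.0, beta=beta, lam=1.0))
    print(collapsed_groups(beta))
```

What I am missing is the part that actually minimises this objective over `intercept` and `beta` for a range of `lam` values; the non-smooth pairwise penalty is where I get stuck.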
