Python - Interaction term in Lasso / LassoCV?


I'm looking for a way to add an interaction term in Lasso / LassoCV from scikit-learn. For an interaction between two continuous variables, or between two categorical variables, I can add columns containing the element-wise products of the variables involved. But for an interaction between a categorical variable and a continuous variable, I cannot multiply them directly.

Best answer

You can absolutely take the interaction between a categorical variable and a continuous variable, but you must first convert the categorical variable to a numeric representation. There are several ways to do this; a common one is to create a binary (dummy) column for each unique category. Once you have built the new matrix, you can pass it to the fit method in sklearn. See my very minimal example below.

# create data with categorical and continuous variables
import pandas as pd
df = pd.DataFrame({'cat':['a','b','c'], 'cont':[4,1,10]})

Output

  cat  cont
0   a     4
1   b     1
2   c    10

Use the pandas function get_dummies to create binary variables

# dtype=int gives 0/1 columns; recent pandas versions return booleans by default
df_new = pd.get_dummies(df, dtype=int)

Output of transformed data

   cont  cat_a  cat_b  cat_c
0     4      1      0      0
1     1      0      1      0
2    10      0      0      1

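Now you can build the interaction term with a simple element-wise multiplication

# the dummy columns live in df_new, not in the original df
df_new['a_new'] = df_new['cont'] * df_new['cat_a']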
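
To extend the single multiplication above to every category and pass the result to LassoCV, something like the following sketch could work. The synthetic data, the slopes dictionary, the "_x_cont" suffix, and the cv=5 setting are assumptions made for illustration; they are not part of the original question.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LassoCV

# Synthetic data (assumed for illustration): the slope of `cont` depends on `cat`.
rng = np.random.default_rng(0)
n = 120
df = pd.DataFrame({
    "cat": rng.choice(["a", "b", "c"], size=n),
    "cont": rng.normal(size=n),
})
slopes = {"a": 1.0, "b": -2.0, "c": 0.0}  # hypothetical per-category slopes
y = df["cat"].map(slopes) * df["cont"] + rng.normal(scale=0.1, size=n)

# One binary column per category, stored as integers so they can be multiplied.
dummies = pd.get_dummies(df["cat"], prefix="cat", dtype=int)

# Interaction terms: the continuous column times each dummy column, row by row.
interactions = dummies.mul(df["cont"], axis=0).add_suffix("_x_cont")

# Design matrix: main effects plus the categorical-by-continuous interactions.
X = pd.concat([df[["cont"]], dummies, interactions], axis=1)

# Fit Lasso with a cross-validated penalty and label the coefficients by column.
model = LassoCV(cv=5, random_state=0).fit(X, y)
coefs = pd.Series(model.coef_, index=X.columns)
print(coefs)
```

Note that `cont` equals the sum of the three interaction columns, so the design matrix is redundant. Lasso can still be fitted on it, but dropping one category (for example with drop_first=True in get_dummies) may make the coefficients easier to interpret.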