How to perform feature selection on a dataset with categorical and numerical features?

I am working on a dataset with 30 columns (29 numerical, 1 non-ordinal categorical). I one-hot encoded the categorical feature and ended up with 35 columns. To improve training efficiency, I want to perform feature selection on my dataset. However, I am not sure how to handle a dataset that combines categorical and numerical features.

  1. I read that it is not reasonable to apply PCA to dummy variables because they are discrete. Is it reasonable to apply PCA to the numerical features only and then concatenate the resulting components with the dummies?
  2. I tried applying recursive feature elimination with cross-validation (RFECV) to the entire feature space. But I don't think it is reasonable to remove some, but not all, of the dummy features, given that they are all generated from a single categorical variable.
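To make point 1 concrete, here is a minimal sketch of what I have in mind, assuming scikit-learn. The data is synthetic, and the column names (`num_0` ... `num_28`, `cat`), the six category levels, and `n_components=10` are placeholders I made up for illustration, not values from my real dataset.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.decomposition import PCA
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic stand-in: 29 numerical columns plus one categorical column
# with 6 levels (hypothetical names and values).
rng = np.random.default_rng(0)
n = 200
X = pd.DataFrame(rng.normal(size=(n, 29)),
                 columns=[f"num_{i}" for i in range(29)])
X["cat"] = np.array(list("abcdef"))[np.arange(n) % 6]

num_cols = [c for c in X.columns if c != "cat"]

preprocess = ColumnTransformer(
    [
        # Scale, then run PCA on the numerical columns only.
        ("num", Pipeline([("scale", StandardScaler()),
                          ("pca", PCA(n_components=10))]), num_cols),
        # One-hot encode the categorical column; PCA never sees it.
        ("cat", OneHotEncoder(handle_unknown="ignore"), ["cat"]),
    ],
    sparse_threshold=0,  # always return a dense array
)

# The two blocks are concatenated column-wise: 10 components + 6 dummies.
Xt = preprocess.fit_transform(X)
print(Xt.shape)
```

Putting `preprocess` as the first step of a `Pipeline` in front of the model would refit the PCA inside each cross-validation fold, which avoids leaking information from the validation data.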

Any suggestions? Any help is appreciated.
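For point 2, one alternative I am considering is a greedy backward elimination that drops whole groups of columns, so the dummies are kept or removed together. This is only a sketch under my own assumptions and not RFECV itself: `grouped_backward_elimination` is a hypothetical helper I wrote, the data is synthetic, and logistic regression is just a placeholder model.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score


def grouped_backward_elimination(estimator, X, y, groups, cv=5, tol=0.0):
    """Greedily drop whole column groups while the mean CV score
    stays within `tol` of the best score seen so far (hypothetical helper)."""
    kept = dict(groups)

    def score(gs):
        cols = [c for idx in gs.values() for c in idx]
        return cross_val_score(estimator, X[:, cols], y, cv=cv).mean()

    best = score(kept)
    while len(kept) > 1:
        # Score every candidate obtained by removing exactly one group.
        trials = {name: score({k: v for k, v in kept.items() if k != name})
                  for name in kept}
        name, s = max(trials.items(), key=lambda kv: kv[1])
        if s < best - tol:
            break  # removing any remaining group hurts too much
        best = max(best, s)
        del kept[name]
    return list(kept), best


# Synthetic stand-in: 4 numerical columns plus 3 dummies from one category.
X_num, y = make_classification(n_samples=200, n_features=4, n_informative=2,
                               n_redundant=0, random_state=0)
rng = np.random.default_rng(0)
dummies = np.eye(3)[rng.integers(0, 3, size=200)]
X = np.hstack([X_num, dummies])

# Each numerical column is its own group; the dummies form a single group.
groups = {f"num_{i}": [i] for i in range(4)}
groups["cat"] = [4, 5, 6]

kept, best = grouped_backward_elimination(
    LogisticRegression(max_iter=1000), X, y, groups)
print(kept, round(best, 3))
```

With this grouping the categorical variable is either fully kept or fully dropped, which is the behaviour I could not get from running RFECV on the individual dummy columns.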
