Is it possible to provide a scoring metric to sklearn RFE?


I need to get subsets of the top-1, top-2, top-3, etc. features, plus my model's performance on each of these subsets, something like this (a rough sketch of the loop I have in mind follows the table):

Number of features | Features | Performance
1                  | A        | 0.7
2                  | A, D     | 0.72
3                  | A, D, B  | 0.75
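
A minimal sketch of what I mean, assuming X is a pandas DataFrame, y is the target, and `ranking` is a hypothetical best-first list of feature names (which is exactly the ordering I don't know how to obtain); I use average precision here because my classes are unbalanced:

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def score_top_k_subsets(X, y, ranking, max_k=3):
    """Cross-validate the model on the top-1, top-2, ..., top-max_k feature subsets."""
    rows = []
    for k in range(1, max_k + 1):
        features = list(ranking[:k])  # hypothetical best-first ordering of column names
        scores = cross_val_score(
            LogisticRegression(max_iter=1000),
            X[features], y,
            scoring="average_precision",  # metric suited to unbalanced classes
            cv=5,
        )
        rows.append({"Number of features": k,
                     "Features": ", ".join(features),
                     "Performance": scores.mean()})
    return pd.DataFrame(rows)
```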

I wanted to use RFE as a possible improvement over simply using feature importances from models.

In sklearn, the RFECV object has a ranking_ attribute, which would let me create the feature subsets. The problem is that every feature within the optimal subset found by RFECV gets a ranking of 1, so the top k features are not ordered by importance among themselves.
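
To illustrate (on made-up data, with a random forest just as a placeholder estimator), this is roughly the call I'm making; note how ranking_ reports 1 for every feature in the selected subset:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFECV

# Made-up unbalanced data just to show the behaviour of ranking_.
X, y = make_classification(n_samples=500, n_features=20, n_informative=5,
                           weights=[0.9, 0.1], random_state=0)

selector = RFECV(
    RandomForestClassifier(n_estimators=200, random_state=0),
    step=1,
    cv=5,
    scoring="average_precision",  # RFECV does take a scoring metric
)
selector.fit(X, y)

# Every feature kept in the optimal subset gets rank 1, so the top k features
# cannot be ordered among themselves from this attribute alone.
print(selector.n_features_)
print(selector.ranking_)
```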

I thought of using a plain RFE instead, but it doesn't accept a scoring parameter, and the default accuracy would not be appropriate in my case, where the classes are very unbalanced.
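
For comparison (reusing X and y from the snippet above), plain RFE only takes the estimator and the target number of features; the elimination order is driven by the estimator's own coef_ / feature_importances_, and there is nowhere to plug in a scorer:

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE

# RFE has no scoring argument: features are eliminated according to the
# estimator's importances, not according to a metric of my choosing.
rfe = RFE(RandomForestClassifier(n_estimators=200, random_state=0),
          n_features_to_select=1, step=1)
rfe.fit(X, y)
print(rfe.ranking_)  # a strict ordering, but not driven by my metric
```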

Is there a way to either somehow provide a scoring metric to sklearn's RFE, or to force RFECV (which does accept a scoring parameter) to keep ranking the features below the 'optimal' number?

I also considered using SFS (sequential feature selection), but I have about 500 features and it takes days to run.
