scikit-learn: Issues on RFECV example

1.5k Views Asked by jatha At 29 July 2025 at 04:58

I'm having difficulty understanding the RFECV example in the current documentation. The plot is labeled "nb of misclassifications", so I would expect lower to be better. But in the example plot, the best result is chosen as the highest cross-validation score, so I would naturally expect it to be something related to accuracy (the code sets scoring to accuracy anyway). But then how can it be higher than 1?

I am a bit confused on how to interpret these results. I would appreciate any help on this.

Thanks!


There is 1 best solution below

alko On 16 January 2014 at 14:28

RFECV has a useful verbose option. Running with verbose=2, you can see that for 2-fold cross-validation, as in the example, grid_scores_ holds the sum of both folds' scores.

In general, for n-fold cross-validation, grid_scores_ is the sum of the fold scores divided by n-1 (see the code). This seems to be a bug; see the somewhat relevant issue on the tracker.
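To make the arithmetic concrete, here is a minimal sketch in plain Python. It does not call scikit-learn; it only reproduces the formula described above. The function names and the fold accuracies are hypothetical, chosen for illustration.

```python
def summed_grid_score(fold_scores):
    """Sum of per-fold scores divided by (n - 1), as described in the answer."""
    n = len(fold_scores)
    if n < 2:
        raise ValueError("cross-validation needs at least 2 folds")
    return sum(fold_scores) / (n - 1)


def mean_score(fold_scores):
    """Plain average over folds, which stays in [0, 1] for accuracy."""
    return sum(fold_scores) / len(fold_scores)


def rescale_to_mean(grid_score, n_folds):
    """Convert a sum / (n - 1) value back to the per-fold mean."""
    return grid_score * (n_folds - 1) / n_folds


# Hypothetical accuracies from a 2-fold check (not taken from the example).
fold_scores = [0.75, 0.5]

reported = summed_grid_score(fold_scores)
print(reported)                                     # 1.25, above 1 although each fold is <= 1
print(mean_score(fold_scores))                      # 0.625
print(rescale_to_mean(reported, len(fold_scores)))  # 0.625
```

With 2 folds the divisor is 1, so the reported value is simply the sum of two accuracies, and that is how it can exceed 1. The ranking of feature counts is unaffected, since every value is scaled by the same factor.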
