In scikit-learn, `RandomizedSearchCV` and `GridSearchCV` require a cross-validation object for the `cv` argument, e.g. `GroupKFold` or any other CV splitter from `sklearn.model_selection`.
However, how can I use a single, static validation set? I have a very large training set and a large validation set, and I only need the interface of the CV objects, not full cross-validation.
Specifically, I'm using scikit-optimize and `BayesSearchCV` (docs), which requires a CV object (same interface as the regular scikit-learn `SearchCV` objects). I want to use my chosen validation set with it, not full CV.
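The docs of the model selection objects of scikit-learn, e.g. `GridSearchCV`, are maybe a bit clearer on how to achieve this: besides splitter objects, the `cv` parameter also accepts an iterable yielding (train, test) splits as arrays of indices.

So you need the arrays of indices for the training and test samples as a tuple, and then wrap that tuple in an iterable, e.g. a list.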
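A minimal sketch of that construction, assuming the training and validation rows are stacked into one array with the training rows first. The sizes and the names `n_train`, `n_val`, `train_idx` and `val_idx` are made up for illustration:

```python
import numpy as np

# Hypothetical sizes: 80 training rows followed by 20 validation rows
# in the single array that will later be passed to fit().
n_train, n_val = 80, 20

train_idx = np.arange(n_train)                  # row positions 0..79
val_idx = np.arange(n_train, n_train + n_val)   # row positions 80..99

# One (train, test) tuple wrapped in a list: a single, fixed split.
cv = [(train_idx, val_idx)]
```

The indices are positions in the data passed to `fit`, so the training and validation sets have to be concatenated into one `X` (and one `y`) in the same order.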
Pass this `cv`, defined with a single tuple, to the model selection object and it will always use the same samples for training and testing.
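For illustration, here is a hedged end-to-end sketch with `GridSearchCV`. The synthetic data, the `LogisticRegression` estimator and the parameter grid are assumptions chosen only to keep the example small. `BayesSearchCV` should accept the same `cv` list if it follows the same interface, as the question states, but that is not verified here:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

# Synthetic stand-ins for the real training and validation sets (assumption).
rng = np.random.default_rng(0)
X_train = rng.normal(size=(80, 4))
y_train = (X_train[:, 0] > 0).astype(int)
X_val = rng.normal(size=(20, 4))
y_val = (X_val[:, 0] > 0).astype(int)

# Stack both sets; the split indices refer to rows of these stacked arrays.
X = np.vstack([X_train, X_val])
y = np.concatenate([y_train, y_val])

# Single fixed split: train on rows 0..79, validate on rows 80..99.
cv = [(np.arange(80), np.arange(80, 100))]

search = GridSearchCV(
    LogisticRegression(max_iter=1000),
    param_grid={"C": [0.1, 1.0, 10.0]},
    cv=cv,
    refit=False,  # skip the final refit on all rows (training + validation)
)
search.fit(X, y)

print(search.n_splits_)     # number of splits actually used
print(search.best_params_)  # best setting according to the validation rows
```

Note that with the default `refit=True`, the best estimator is refit on everything passed to `fit`, which here includes the validation rows. Setting `refit=False` and fitting the final model on the training rows yourself avoids that.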