I have a dataset containing 100 features, which I want to analyze using mlr3.
I want to use XGBoost as a learner and Hyperband or MBO as tuners.
However, I get an error when using the helper function auto_tuner(), but not when using tune().
learner <-
  po("encode", method = "treatment", affect_columns = selector_type("factor")) %>>%
  po("scale") %>>%
  po("learner",
    lrn("classif.xgboost", predict_type = "prob",
      nrounds = to_tune(p_int(1, 5000, tags = "budget")),
      eta = to_tune(1e-4, 1, logscale = TRUE),
      max_depth = to_tune(1, 20),
      colsample_bytree = to_tune(1e-1, 1),
      colsample_bylevel = to_tune(1e-1, 1),
      lambda = to_tune(1e-3, 1e3, logscale = TRUE),
      alpha = to_tune(1e-3, 1e3, logscale = TRUE),
      subsample = to_tune(1e-1, 1)))
at <- auto_tuner(
  tuner = tnr("hyperband", eta = 2),
  learner = learner,
  resampling = rsmp("cv", folds = 2),
  measure = msr("classif.auc"),
  terminator = trm("none"),
  store_models = TRUE,
  evaluate_default = TRUE)
at$train(task_US)
Error in .__OptimInstance__eval_batch(self = self, private = private, :
Assertion on 'colnames(xdt)' failed: Names must include the elements {'classif.xgboost.nrounds','classif.xgboost.eta','classif.xgboost.max_depth','classif.xgboost.colsample_bytree','classif.xgboost.colsample_bylevel','classif.xgboost.lambda','classif.xgboost.alpha','classif.xgboost.subsample'}, but is missing elements {'classif.xgboost.nrounds'}.
When I use tune(), the error doesn't come up.
I've tried with
tuner = tnr("mbo")
and the same error comes up.
Thanks a lot for bringing this to our attention. This is simply a bug! The culprit is the line
evaluate_default = TRUE. If you disable it, everything works. I have created an issue for you here: https://github.com/mlr-org/mlr3tuning/issues/406
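For illustration, here is a minimal sketch of the workaround, i.e. the same auto_tuner() call without evaluate_default = TRUE. It makes some assumptions that are not from the question: the built-in sonar task stands in for task_US (which is not available here), the encoding step is dropped because sonar has no factor features, and the search space and budget range are shrunk so the run stays short. It also assumes that mlr3verse, mlr3hyperband and xgboost are installed.

```r
library(mlr3verse)
library(mlr3hyperband)

# Stand-in task (assumption): the original task_US is not available here.
task <- tsk("sonar")

# Same pipeline idea as in the question, with a reduced search space.
learner <- as_learner(
  po("scale") %>>%
    po("learner",
      lrn("classif.xgboost", predict_type = "prob",
        # Smaller budget range than in the question (assumption), to keep the run short.
        nrounds = to_tune(p_int(1, 16, tags = "budget")),
        eta = to_tune(1e-4, 1, logscale = TRUE),
        max_depth = to_tune(1, 20)))
)

at <- auto_tuner(
  tuner = tnr("hyperband", eta = 2),
  learner = learner,
  resampling = rsmp("cv", folds = 2),
  measure = msr("classif.auc"),
  terminator = trm("none"),
  store_models = TRUE
  # evaluate_default = TRUE is deliberately left out to avoid the bug.
)

at$train(task)

# Best configuration found by Hyperband.
at$tuning_result
```

If the default configuration is still of interest, it could be evaluated separately, for example by resampling the untuned learner with the same resampling and measure.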
Created on 2024-01-25 with reprex v2.0.2