I'm trying to optimise an XGBoost model using BayesSearchCV from scikit-optimize (skopt). Here is the code I am attempting to use:

from skopt import BayesSearchCV
import xgboost as xgb
from main import format_data_for_xgboost

x_train, x_test, y_train, y_test = format_data_for_xgboost()  # helper defined in a separate script

opt = BayesSearchCV(
    xgb.XGBRegressor(objective='reg:squarederror', n_jobs=4),
    {
        'n_estimators': (1, 50),
        'max_depth': (1, 20),
        'learning_rate': (10**-5, 10**0, "log-uniform"),
        'min_child_weight': (1, 5),
        'max_delta_step': (1, 10)
    },
    n_iter=8,
    verbose=99
)

opt.fit(x_train, y_train)

It runs for the first few iterations, with the score decreasing incrementally from -0.001 to -0.009.

After this run:

[CV]  learning_rate=0, max_delta_step=7, max_depth=4, min_child_weight=5, n_estimators=46, score=-0.009, total=   0.1s

it errors:

ValueError: Not all points are within the bounds of the space.

I'm pretty sure this has something to do with the "score", but when I tried to set the score manually, it said it couldn't accept a float as an argument for score.

I would appreciate any help understanding how to overcome this error. I don't think the dataframes are at fault, as I have successfully used them with xgb.cv and XGBRegressor. It's only when I try to use the Bayesian optimisation that I start having issues.

EDIT: when I add scoring='neg_mean_squared_error' as a parameter after verbose=99 it runs for longer, but I get the same error after:

[CV]  learning_rate=0, max_delta_step=8, max_depth=4, min_child_weight=5, n_estimators=34, score=-2654.978, total=   0.1s

There is 1 best solution below


I ran into this problem myself using BayesSearchCV on XGBoost.

I 'solved' the problem by brute-force narrowing it down:

  1. Cut the number of CV folds and iterations down to the minimum to speed up training.
  2. Comment out half of the hyperparameter ranges.
  3. Train.
  4. If skopt still throws the error, the culprit is among the lines left active; if it runs cleanly, the culprit is among the lines commented out in step 2. Repeat on the offending half until the problematic line is isolated.

There must be a better way to debug this problem, but I found this is the easiest for me.
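
The narrowing steps above can be automated as a bisection over the search-space dictionary. The following is only a minimal sketch, not a skopt feature: `find_culprits` and `fits_ok` are hypothetical names, and `fits_ok` is a callback you would write yourself that runs a short search over the given subspace and reports whether it finished without the error. The demo at the bottom uses a made-up stand-in that pretends one parameter is at fault, purely to show the mechanics. It says nothing about which parameter actually causes the error in the question.

```python
from typing import Any, Callable, Dict, List


def find_culprits(
    space: Dict[str, Any],
    fits_ok: Callable[[Dict[str, Any]], bool],
) -> List[str]:
    """Bisect a search space to find the entries that make the search fail.

    `fits_ok(subspace)` is a user-supplied callback: it should run a short
    search over `subspace` only and return True if it finishes, or False
    if it raises the error being hunted.
    """
    if not space or fits_ok(space):
        return []                      # this part of the space is fine
    names = list(space)
    if len(names) == 1:
        return names                   # a single failing entry: culprit found
    mid = len(names) // 2
    left = {k: space[k] for k in names[:mid]}
    right = {k: space[k] for k in names[mid:]}
    culprits = find_culprits(left, fits_ok) + find_culprits(right, fits_ok)
    # If neither half fails on its own, the failure needs entries from both
    # halves, so report the whole group instead of guessing.
    return culprits or names


# Search space copied from the question.
search_space = {
    'n_estimators': (1, 50),
    'max_depth': (1, 20),
    'learning_rate': (10**-5, 10**0, "log-uniform"),
    'min_child_weight': (1, 5),
    'max_delta_step': (1, 10),
}


def fake_fits_ok(subspace: Dict[str, Any]) -> bool:
    # ASSUMPTION for illustration only: pretend any subspace that contains
    # 'max_delta_step' fails. Replace this with a real, short search run.
    return 'max_delta_step' not in subspace


print(find_culprits(search_space, fake_fits_ok))  # prints ['max_delta_step']
```

In a real run, `fits_ok` could build `BayesSearchCV(estimator, subspace, n_iter=..., cv=...)`, call `fit` inside a `try` block, and return `False` on `ValueError`. Since the error in the question only appeared after several iterations, `n_iter` would need to stay large enough to trigger it, so the speed-up from step 1 has a limit.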