I am trying to fine-tune the XGBoost model and have two questions:
1. I want to keep some of the hyperparameters fixed, such as n_estimators=5000, max_depth=60, and learning_rate=0.0079883, while tuning others such as min_child_weight, gamma, subsample, and colsample_bytree. I have implemented this in the code below. Is this the correct approach?
2. I also want my XGBoost model to use early stopping. However, I am not sure whether, in the current implementation, early stopping is applied by XGBoost itself (within each model fit) or by the Bayesian search. Can someone clarify this?
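import xgboost as xgb
from skopt import BayesSearchCV
from skopt.space import Integer, Real

# Fixed hyperparameters: these are set on the estimator and are not part of the search space
xgb_model = xgb.XGBRegressor(n_estimators=5000, max_depth=60, learning_rate=0.0079883)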
# Tune min_child_weight, gamma, subsample, and colsample_bytree using Bayesian optimization.
# Define the hyperparameter search space
search_spaces = {
    'min_child_weight': Integer(1, 50),
    'gamma': Real(0.01, 3.0, 'log-uniform'),
    'subsample': Real(0.01, 1.0, 'uniform'),
    'colsample_bytree': Real(0.01, 1.0, 'uniform'),
}
# Define the search
opt = BayesSearchCV(
    xgb_model,                         # estimator
    search_spaces,                     # hyperparameter space
    scoring='neg_mean_squared_error',  # negative mean squared error
    cv=5,                              # 5-fold cross-validation
    n_jobs=-1,                         # -1 means use all processors
    n_points=50,                       # number of parameter settings sampled in parallel per step
    n_iter=50,                         # total number of parameter settings evaluated
    verbose=False,
    random_state=42
)
eval_set = [(X_test, y_test)]
# Perform the search
opt.fit(X_train, y_train, early_stopping_rounds=20, eval_metric='rmse', eval_set=eval_set, verbose=True)
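
For reference, the rule that early_stopping_rounds=20 is meant to express can be sketched in plain Python as follows. This is only an illustration of the general early-stopping idea, assuming a metric where lower is better (such as RMSE); it is not XGBoost's or BayesSearchCV's actual code, and the function name and the sample scores are made up.

```python
from typing import Sequence, Tuple


def early_stopping_round(val_scores: Sequence[float], patience: int) -> Tuple[int, int]:
    """Return (best_round, stop_round) for a metric where lower is better.

    Training stops at the first round where the best score seen so far is
    at least `patience` rounds old. If that never happens, training runs
    through every round.
    """
    best_round = 0
    best_score = float("inf")
    for rnd, score in enumerate(val_scores):
        if score < best_score:
            # New best validation score: remember where it happened
            best_round, best_score = rnd, score
        elif rnd - best_round >= patience:
            # No improvement for `patience` rounds: stop here
            return best_round, rnd
    return best_round, len(val_scores) - 1


# Made-up validation RMSE values, one per boosting round
rmse = [0.90, 0.80, 0.75, 0.76, 0.77, 0.78, 0.74]
best, stopped = early_stopping_round(rmse, patience=3)
print(best, stopped)
```

With these numbers the run stops at round 5 and keeps round 2 as the best one, so the lower score at round 6 is never reached. The counter here belongs to a single boosting run on a single validation set, which is the behaviour I am trying to confirm for each model fit that the search launches.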