I ran a hyperparameter search using hyperopt. First, I defined the search space and the objective function:
```python
from hyperopt import hp, fmin, tpe, Trials, STATUS_OK
import xgboost as xgb

# Define the search space for hyperparameters
space = {
    'n_estimators': hp.quniform('n_estimators', 100, 500, 10),
    'max_depth': hp.choice('max_depth', range(1, 20, 2)),
    'min_child_weight': hp.quniform('min_child_weight', 1, 10, 1),
    'gamma': hp.uniform('gamma', 0, 1),
    'subsample': hp.uniform('subsample', 0.5, 1),
    'colsample_bytree': hp.uniform('colsample_bytree', 0.5, 1),
    'learning_rate': hp.loguniform('learning_rate', -4, 0),
    'reg_alpha': hp.uniform('reg_alpha', 0, 1),
    'reg_lambda': hp.uniform('reg_lambda', 0, 1)
}


def objective(params):
    # Cast the integer-valued hyperparameters (quniform returns floats)
    params = {
        'n_estimators': int(params['n_estimators']),
        'max_depth': int(params['max_depth']),
        'min_child_weight': params['min_child_weight'],
        'gamma': params['gamma'],
        'subsample': params['subsample'],
        'colsample_bytree': params['colsample_bytree'],
        'learning_rate': params['learning_rate'],
        'reg_alpha': params['reg_alpha'],
        'reg_lambda': params['reg_lambda']
    }
    model = xgb.XGBRegressor(objective='reg:absoluteerror', enable_categorical=True,
                             booster='gbtree', random_state=42, verbose=2, **params)
    model.fit(X_train, Y_train)
    Y_pred = model.predict(X_test)
    loss = abs(Y_test - Y_pred).mean()  # mean absolute error
    return {'loss': loss, 'status': STATUS_OK}
```
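Just to illustrate what hyperopt feeds into this objective, a single random draw from the space can be printed (a minimal sketch; the integer-valued entries come out as floats, hence the casts above):

```python
from hyperopt.pyll.stochastic import sample

# One random draw from the search space, e.g. to eyeball the value ranges
print(sample(space))
```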
Then I set up the trials object and ran the optimization:
```python
trials = Trials()

best = fmin(fn=objective,
            space=space,
            algo=tpe.suggest,
            max_evals=400,
            trials=trials)

print("Best: {}".format(best))
```
However, when I pass the same parameters from the `best` variable into the model explicitly, as follows:
```python
### Experimental params here
p = {'colsample_bytree': 0.7844038007128916,
     'gamma': 0.17639244252618205,
     'learning_rate': 0.9867329566851424,
     'max_depth': 9,
     'min_child_weight': 1.0,
     'n_estimators': 500,
     'reg_alpha': 0.7161477369968196,
     'reg_lambda': 0.04261895096213106,
     'subsample': 0.6494075038596621}

# objective='reg:absoluteerror', enable_categorical=True, verbose=2
model = xgb.XGBRegressor(objective='reg:absoluteerror', enable_categorical=True,
                         booster='gbtree', random_state=42, verbose=2, **p)
model.fit(X_train, Y_train)

# Make prediction
Y_pred = model.predict(X_test)
```
I get a different mean_absolute_error.
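Concretely, the second error is computed along these lines (a minimal sketch, assuming scikit-learn's `mean_absolute_error` and the `Y_test`/`Y_pred` above):

```python
from sklearn.metrics import mean_absolute_error

# MAE of the explicitly parameterised model ("Loss B" below)
print(mean_absolute_error(Y_test, Y_pred))
```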
Let's refer to the two losses as:

- Loss A = the loss from the hyperparameter search.
- Loss B = the loss when I explicitly pass the parameters from the search into the model and compute the error manually.

Two things to note:

- The train/test splits are not being redefined between the two runs.
- The model should be deterministic because of the fixed random state, and the difference in loss is far too large to be explained by randomness (Loss B = 43 vs Loss A = 26); a quick check of this is sketched below.
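A minimal sketch of that determinism check, reusing the splits and the `p` dict from above:

```python
import numpy as np

# Fit the same configuration twice; if the predictions match exactly,
# run-to-run randomness cannot explain the Loss A vs Loss B gap.
m1 = xgb.XGBRegressor(objective='reg:absoluteerror', enable_categorical=True,
                      booster='gbtree', random_state=42, **p).fit(X_train, Y_train)
m2 = xgb.XGBRegressor(objective='reg:absoluteerror', enable_categorical=True,
                      booster='gbtree', random_state=42, **p).fit(X_train, Y_train)
print(np.allclose(m1.predict(X_test), m2.predict(X_test)))  # expected: True
```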
Here is the part that gets interesting: when I explicitly pass the `p` dict to the objective function, it returns Loss B.

Are some floating-point shenanigans occurring?
`fmin` returns the index for `hp.choice`, not the value. This is quite unintuitive.
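A sketch of the fix: `space_eval` maps the dict returned by `fmin` back onto the actual search space. In the space above, `max_depth` comes from `hp.choice` over `range(1, 20, 2)`, so the `9` in `best` is an index and the depth actually used during the search was `range(1, 20, 2)[9] == 19`.

```python
from hyperopt import space_eval

# Map fmin's output (hp.choice entries reported as indices) back to real values
best_params = space_eval(space, best)
print(best_params)  # 'max_depth' comes out as 19 here, not the index 9

# quniform values come back as floats, so cast the integer-valued ones before use
best_params['n_estimators'] = int(best_params['n_estimators'])

model = xgb.XGBRegressor(objective='reg:absoluteerror', enable_categorical=True,
                         booster='gbtree', random_state=42, **best_params)
```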