Imblearn Pipeline and HyperOpt Issue

Question

455 Views Asked by eidoiruson At 06 June 2025 at 05:00

Currently I am trying to oversample with SMOTE and then run my XGBClassifier in the Pipeline. For some reason I cannot get HyperOpt to play nice with the Pipeline.

The two examples below both run properly:

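from imblearn.over_sampling import SMOTE
from imblearn.pipeline import Pipeline
from sklearn.model_selection import StratifiedKFold, cross_val_score
from xgboost import XGBClassifier

# SMOTE sits inside the pipeline, so it resamples only the training folds
smote = SMOTE(random_state=42)
model = XGBClassifier(random_state=42)
pipe = Pipeline([('smote', smote),
                 ('model', model)])

cv = StratifiedKFold(n_splits=5)

score = cross_val_score(pipe, X_train, y_train, cv=cv, scoring='roc_auc', n_jobs=-1).mean()

print(score)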

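import numpy as np
from hyperopt import STATUS_OK, Trials, fmin, tpe

model = XGBClassifier(random_state=42)

def objective_pipe(params):
    # Apply the sampled hyperparameters directly to the classifier
    model.set_params(**params)

    cv = StratifiedKFold(n_splits=5)

    score = cross_val_score(model, X_train, y_train, cv=cv, scoring='roc_auc', n_jobs=-1).mean()

    # hyperopt minimizes the loss, so the AUC is negated
    return {'loss': -score, 'params': params, 'status': STATUS_OK}

trials = Trials()
# `params` passed as `space` is the hyperopt search space
best = fmin(fn=objective_pipe, space=params, algo=tpe.suggest, max_evals=10, trials=trials, rstate=np.random.RandomState(42))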

However, the moment I put the Pipeline inside the objective function, I end up getting NaN values for the score.

smote = SMOTE(random_state=42)
model = XGBClassifier(random_state=42)
pipe = Pipeline([('smote', smote),
                 ('model', model)])

def objective_pipe(params):
    pipe.set_params(**params)

    cv = StratifiedKFold(n_splits=5)

    score = cross_val_score(pipe, X_train, y_train, cv=cv, scoring='roc_auc', n_jobs=-1).mean()

    return {'loss': -score, 'params': params, 'status': STATUS_OK}

trials = Trials()
best = fmin(fn=objective_pipe, space=params, algo=tpe.suggest, max_evals=10, trials=trials, rstate=np.random.RandomState(42))

Maybe I am just missing something really simple, but I am not sure how to get past this issue. Any suggestions, help, or resources are welcome.
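A note on the search space, which the question does not show: a Pipeline addresses the parameters of its steps as `<step name>__<parameter>`, so `max_depth` on the `model` step becomes `model__max_depth`. If `params` is the same dictionary that was used for the bare XGBClassifier, its keys would need that prefix before being passed to `pipe.set_params`. A minimal sketch, assuming a hypothetical helper named `prefix_params` and made-up sampled values:

```python
def prefix_params(params, step):
    """Return a copy of params with keys rewritten as '<step>__<name>'.

    Hypothetical helper, not part of imblearn, scikit-learn, or hyperopt.
    Keys that already carry the prefix are left unchanged.
    """
    prefix = f"{step}__"
    return {
        (key if key.startswith(prefix) else prefix + key): value
        for key, value in params.items()
    }


# Made-up values, standing in for one sample drawn from the search space
sampled = {"max_depth": 4, "learning_rate": 0.1}
pipe_params = prefix_params(sampled, step="model")
print(pipe_params)
# {'model__max_depth': 4, 'model__learning_rate': 0.1}
```

Inside the objective function this would be used as `pipe.set_params(**prefix_params(params, "model"))`. Whether this applies here depends on how `params` was defined.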

**Sofia Cardoso Pereira** · Answer 1

Sofia Cardoso Pereira On 29 December 2020 at 10:31

I'm not exactly sure why, but I had a similar issue and it went away after setting n_jobs=1. I think it has to do with SMOTE's inability to run in parallel.
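To see why a fold fails instead of guessing, it may help to know that `cross_val_score` has an `error_score` parameter that defaults to `np.nan`: a fit that raises is recorded as NaN with a warning, and the mean of the scores is then NaN too. Passing `error_score="raise"` surfaces the underlying exception. The sketch below needs only scikit-learn and uses a deliberately failing, hypothetical `FlakyClassifier` as a stand-in for the SMOTE + XGBClassifier pipeline; it is not a reproduction of the original problem.

```python
import warnings

import numpy as np
from sklearn.dummy import DummyClassifier
from sklearn.model_selection import KFold, cross_val_score


class FlakyClassifier(DummyClassifier):
    """Hypothetical stand-in: fit fails when the training fold has an odd row count."""

    def fit(self, X, y, sample_weight=None):
        if len(X) % 2 == 1:
            raise ValueError("simulated failure inside fit")
        return super().fit(X, y, sample_weight=sample_weight)


rng = np.random.default_rng(0)
X = rng.normal(size=(52, 3))
y = np.array([0, 1] * 26)
cv = KFold(n_splits=5)  # training folds have 41, 41, 42, 42, 42 rows

# Default error_score=np.nan: failed folds are reported as nan, not raised
with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # hide the fit-failure warning for this demo
    scores = cross_val_score(FlakyClassifier(), X, y, cv=cv)

print(scores)         # the first two entries are nan
print(scores.mean())  # nan, because nan propagates through the mean

# error_score="raise": the exception from the failing fold is re-raised
surfaced = None
try:
    cross_val_score(FlakyClassifier(), X, y, cv=cv, error_score="raise")
except ValueError as exc:
    surfaced = str(exc)
    print("underlying error:", surfaced)
```

Applied to the question, that would mean adding `error_score="raise"` to the `cross_val_score` call inside `objective_pipe`, together with `n_jobs=1`, and reading the exception that comes back.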
