Different values for roc_auc on the test set when using the 'roc_auc' scorer and roc_auc_score?


I have the following data pipeline but am having some confusion interpreting the output. Any help is much appreciated.

# tune the hyperparameters via a cross-validated grid search

from time import time
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV
print("[INFO] tuning hyperparameters via grid search")
params = {"max_depth": [3, None],
          "max_features": [1, 2, 3, 4],
          "min_samples_split": [2, 3, 10],
          "min_samples_leaf": [1, 3, 10],
          "bootstrap": [True, False],
          "criterion": ["gini", "entropy"]}

model = RandomForestClassifier(n_estimators=50)
grid = RandomizedSearchCV(model, params, cv=10, scoring = 'roc_auc')
start = time()
grid.fit(X_train, y_train)

# report the search timing and score the refit model on the training data

print("[INFO] grid search took {:.2f} seconds".format(
    time() - start))
auc = grid.score(X_train, y_train)  # with scoring='roc_auc', .score returns ROC AUC, not accuracy
print("[INFO] grid search ROC AUC: {:.2f}%".format(auc * 100))
print("[INFO] grid search best parameters: {}".format(
grid.best_params_))

Look at the refit model's score on the training set (note this is not the cross-validated score, which is stored in grid.best_score_):

rf_score_train = grid.score(X_train, y_train)
rf_score_train

0.87845540607872441

Now score this trained model on the test set:

rf_score_test = grid.score(X_test, y_test)
rf_score_test

0.72482993197278911

However, when I take this model's hard predictions as an array and pass them to the standalone roc_auc_score metric against the actual outcomes, I get a completely different score from the search's 'roc_auc' score on the test set above.

model_prediction = grid.predict(X_test)
model_prediction

array([0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 1, 1, 0, 1, 0,
       0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0,
       0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1,
       0, 0, 1, 0, 0, 0, 0, 0, 0])

Actual outcome:

actual_outcome = np.array(y_test)
actual_outcome

array([0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 1, 0, 0, 1,
       0, 0, 0, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1,
       1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 0, 1, 1, 0,
       0, 0, 1, 0, 0, 1, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0,
       1, 0, 0, 0, 1, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0,
       0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1,
       0, 0, 0, 1, 0, 0, 0, 1, 0])

Using roc_auc_score outside of the GridSearch:

from sklearn.metrics import roc_auc_score
roc_accuracy = roc_auc_score(actual_outcome, model_prediction)*100
roc_accuracy

59.243197278911566

So the search's 'roc_auc' scorer gives about 72 on the test set, yet roc_auc_score on the same model's hard predictions gives about 59. Which one is correct? Am I doing something wrong here? Any help is much appreciated!
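To check whether the gap comes from the scorer using probabilities rather than hard 0/1 labels, here is a minimal, self-contained sketch on synthetic data (the dataset and variable names are made up purely for illustration, not my actual data):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import get_scorer, roc_auc_score
from sklearn.model_selection import train_test_split

# synthetic binary-classification data, purely for illustration
X, y = make_classification(n_samples=500, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X_tr, y_tr)

# AUC computed from probability scores: this is what the 'roc_auc'
# scorer (and hence grid.score with scoring='roc_auc') uses
auc_proba = roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1])
scorer_auc = get_scorer('roc_auc')(clf, X_te, y_te)

# AUC computed from hard 0/1 predictions: the ROC "curve" collapses to
# a single point at the 0.5 threshold, so the number generally differs
auc_labels = roc_auc_score(y_te, clf.predict(X_te))

print(auc_proba, scorer_auc, auc_labels)
```

If auc_proba matches scorer_auc but auc_labels does not, that would explain the 72-vs-59 difference above: feeding predict() output into roc_auc_score throws away the ranking information that ROC AUC is defined on.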
