Different AUC-PR scores using Logistic Regression with and without Pipeline

112 Views Asked by UniqueName At 19 August 2025 at 05:08

I'm trying to understand why I get different AUC-PR scores using Logistic Regression with and without Pipeline.

Here is my code using Pipeline:

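from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegressionCV
from sklearn.metrics import average_precision_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OrdinalEncoder

column_encoder = ColumnTransformer([
    ('ordinal_enc', OrdinalEncoder(), categorical_cols)
])

pipeline = Pipeline([
    ('column_encoder', column_encoder),
    ('logreg', LogisticRegressionCV(random_state=777))
])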

model = pipeline.fit(X_train, y_train)
y_pred = model.predict(X_test)

print(f'AUC-PR with Pipeline: {average_precision_score(y_test, y_pred):.4f}')

And here is my code without Pipeline:

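import copy

from sklearn.linear_model import LogisticRegression
from sklearn.metrics import average_precision_score
from sklearn.preprocessing import OrdinalEncoder

ord_enc = OrdinalEncoder()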
ord_encoded_X_train = ord_enc.fit_transform(X_train[categorical_cols])
ord_encoded_X_test = ord_enc.transform(X_test[categorical_cols])

X_train_encoded = X_train.copy(deep=True)
X_test_encoded = X_test.copy(deep=True)

X_train_encoded.loc[:, categorical_cols] = copy.deepcopy(ord_encoded_X_train)
X_test_encoded.loc[:, categorical_cols] = copy.deepcopy(ord_encoded_X_test)

model = LogisticRegression(random_state=777, max_iter=2000)
model.fit(X_train_encoded, y_train)
y_pred = model.predict(X_test_encoded)

print(f'AUC-PR without Pipeline: {average_precision_score(y_test, y_pred):.4f}')

And finally:

AUC-PR with Pipeline:    0.1133
AUC-PR without Pipeline: 0.2406

So, why is that?

There is 1 best solution below

Ben Reiniger On 20 September 2021 at 19:04 BEST ANSWER

Your ColumnTransformer is dropping all the columns that aren't in categorical_cols, because the default for remainder is "drop". Add remainder="passthrough" to keep the non-categorical columns for the model.
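
To see the effect of remainder in isolation, here is a minimal sketch on a made-up DataFrame (the column names and values are hypothetical, not taken from the question). It only compares the shape of the transformed output under the default and under remainder="passthrough":

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OrdinalEncoder

# Hypothetical toy data: one categorical column, two numeric columns.
X = pd.DataFrame({
    "color": ["red", "blue", "red", "green"],
    "age": [25, 32, 47, 51],
    "income": [40.0, 55.5, 61.2, 38.9],
})
categorical_cols = ["color"]

# Default remainder="drop": only the encoded categorical column survives.
dropping = ColumnTransformer([
    ("ordinal_enc", OrdinalEncoder(), categorical_cols),
])
shape_drop = dropping.fit_transform(X).shape

# remainder="passthrough": the untouched columns are appended after the encoded ones.
keeping = ColumnTransformer(
    [("ordinal_enc", OrdinalEncoder(), categorical_cols)],
    remainder="passthrough",
)
shape_keep = keeping.fit_transform(X).shape

print(shape_drop)  # (4, 1)
print(shape_keep)  # (4, 3)
```

With the default, the logistic regression in the pipeline is trained on the categorical columns alone, whereas the manual approach keeps every column of X_train, so the two models see different features.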

Other quibbles:

You set max_iter in the second approach but not the first.
You are computing average_precision_score with hard class predictions (predict), when you should use probability predictions (predict_proba).
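
As a rough illustration of the scoring point, the sketch below computes average precision both ways on synthetic data (make_classification stands in for the asker's dataset, which is not shown, so the numbers will not match the question). It also sets max_iter explicitly, as you would want to do in both approaches for a like-for-like comparison:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import average_precision_score
from sklearn.model_selection import train_test_split

# Synthetic imbalanced data, used only for illustration.
X, y = make_classification(
    n_samples=2000, n_features=10, weights=[0.9, 0.1], random_state=777
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=777
)

model = LogisticRegression(random_state=777, max_iter=2000)
model.fit(X_train, y_train)

# Hard 0/1 labels give the metric only a single operating point.
ap_labels = average_precision_score(y_test, model.predict(X_test))

# Probability of the positive class (column 1) lets the metric sweep all thresholds.
y_score = model.predict_proba(X_test)[:, 1]
ap_scores = average_precision_score(y_test, y_score)

print(f"AP from hard labels:   {ap_labels:.4f}")
print(f"AP from probabilities: {ap_scores:.4f}")
```

The same predict_proba call works on a fitted Pipeline, since it delegates to the final estimator.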
