The dependent variable is binary, the class imbalance is 1:10, the dataset has 70k rows, and the scoring metric is the area under the ROC curve. I'm trying to use LGBM + GridSearchCV to get a model. However, I'm struggling with the parameters, as sometimes they aren't recognized even when I use them as the documentation shows:
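
    from lightgbm import LGBMClassifier
    from sklearn.model_selection import GridSearchCV

    # Grid of LGBMClassifier parameters to search over
    params = {'num_leaves': [10, 12, 14, 16],
              'max_depth': [4, 5, 6, 8, 10],
              'n_estimators': [50, 60, 70, 80],
              'is_unbalance': [True]}
    best_classifier = GridSearchCV(LGBMClassifier(), params, cv=3, scoring="roc_auc")
    best_classifier.fit(X_train, y_train)
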
So:
- What is the difference between putting the parameters in `GridSearchCV()` and in `params`?
- As the data is unbalanced, I'm trying to use the ROC AUC as the scoring metric, since it is a metric that takes the imbalance into account. Should I use the argument `scoring="roc_auc"`, or put it in the `params` argument?

The difference between putting the parameters in `GridSearchCV()` or in `params` is mentioned in the docs of GridSearchCV:

When you put it in `params`:

And yes, you can also put the scoring in `params`.