Custom metric in GridSearchCV


I have a churn dataset with a column named "CLTV", which is the client's value for the company. I created a custom penalty function:

```
def penalty(y_test, y_pred):
    # NB: relies on the global df and on y_test keeping its original index
    penalties = []
    for i in range(len(y_pred)):
        if y_pred[i] - y_test.iloc[i] == -1:
            # false negative: penalise by the client's CLTV relative to the median CLTV
            penalties.append(df.loc[y_test.index, 'CLTV'].iloc[i] / df.loc[:, 'CLTV'].median())
        else:
            # correct prediction contributes 0, a false positive contributes +1
            penalties.append(y_pred[i] - y_test.iloc[i])
    return sum(penalties)
```
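
For reference, on a tiny made-up frame the function behaves like this (a minimal sketch; only the 'CLTV' column name matches my real data, and `df` is redefined here because `penalty` reads it from the global scope):

```
import pandas as pd

df = pd.DataFrame({'CLTV': [100, 250, 4000, 300, 150]})   # median CLTV = 250
y_true = pd.Series([1, 0, 1, 0, 1], index=df.index)
y_hat = [1, 0, 0, 0, 1]                                   # one false negative at position 2

# correct rows contribute 0; the false negative contributes 4000 / 250 = 16.0
print(penalty(y_true, y_hat))   # 16.0
```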

The lower the penalty, the better the result, so I made a custom scorer:

```
from sklearn.metrics import make_scorer

custom_score = make_scorer(penalty, greater_is_better=False)
```
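
If I understand `make_scorer` correctly, `greater_is_better=False` simply multiplies the metric by -1, so for any classifier `clf` (name assumed here) fitted on the training data, the scorer should report minus the raw penalty:

```
# sketch: clf stands for any classifier already fitted on X_train / y_train
raw = penalty(y_train, clf.predict(X_train))
scored = custom_score(clf, X_train, y_train)
print(raw, scored)   # expect scored == -raw
```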

I first used a simple model with `class_weight="balanced"` because the data is imbalanced:

```
from sklearn.linear_model import LogisticRegression

lr = LogisticRegression(class_weight="balanced", max_iter=500)
```

And created a grid:

```
from sklearn.model_selection import GridSearchCV

grid = {'C': [0.001, 0.01, 0.1, 1, 10, 100, 1000]}
grid_lr = GridSearchCV(estimator=lr, param_grid=grid, cv=10,
                       scoring=custom_score, error_score='raise')
grid_lr.fit(X_train, y_train)
```
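
For what it's worth, the per-fold scores behind `best_score_` can be pulled out of `cv_results_` like this (a sketch; each of the 10 folds is scored only on its own validation split):

```
import numpy as np

best = grid_lr.best_index_
fold_scores = [grid_lr.cv_results_[f'split{k}_test_score'][best] for k in range(10)]
print(fold_scores)
print(np.mean(fold_scores), grid_lr.best_score_)   # these two should match
```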

For `grid_lr.best_score_` I got -128.49 (negative is expected because it's a loss function).

But when I did that:

```
y_train_lr_grid = grid_lr.predict(X_train)
penalty(y_train, y_train_lr_grid)
```

The result was 1260, which is very different from 128.

Can someone explain what I did wrong? Thank you all.
