Having problems tuning CatBoost hyperparameters

I am doing the Bulldozer-blue-book project from Kaggle. I am currently using CatBoost to see if I can improve my model. I instantiate and fit CatBoost like so:

cat_regressor = CatBoostRegressor()

cat_regressor.fit(Xtrain[:100000], ytrain[:100000])

Then I try to tune the hyperparameters using RandomizedSearchCV:

%%time
import numpy as np
from sklearn.model_selection import RandomizedSearchCV

# Search space: 99 iteration counts x 7 depths x 3 learning rates
cat_grid = {
    'iterations': np.arange(10, 1000, 10),
    'depth': np.arange(2, 16, 2),
    'learning_rate': [0.01, 0.05, 0.1]
}

# 250 sampled candidates, each evaluated with 5-fold cross-validation
cat_model_rs = RandomizedSearchCV(estimator=cat_regressor,
                                  param_distributions=cat_grid,
                                  n_iter=250,
                                  cv=5,
                                  verbose=True)

cat_model_rs.fit(Xtrain[:100000], ytrain[:100000])

So far the search is taking a very long time to fit (much longer than when I was tuning RandomForestRegressor). Yesterday I had a "kernel stoppage" (I don't remember exactly how Jupyter presented the error) while using the GPU. Today I am running on the CPU. The search is still going at full power, and at this point it feels like the model is stuck in an infinite loop, and I am just waiting for the kernel to stop. I have tried Google Colab as well, but the cell for finding hyperparameters also times out there. I am at a loss here.
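For a sense of scale, here is a rough back-of-the-envelope sketch of how much work the search above asks for. It uses plain Python ranges in place of `np.arange`, and it assumes the candidates are drawn roughly uniformly from the grid and that `refit` is left at its scikit-learn default of `True`. The tree count is only a proxy for run time, since deeper trees cost more per tree.

```python
# Search space from the question, written with plain ranges.
cat_grid = {
    "iterations": list(range(10, 1000, 10)),  # 10, 20, ..., 990
    "depth": list(range(2, 16, 2)),           # 2, 4, ..., 14
    "learning_rate": [0.01, 0.05, 0.1],
}
n_iter = 250  # candidates sampled by RandomizedSearchCV
cv = 5        # folds per candidate

# Every sampled candidate is fitted once per fold.
cv_fits = n_iter * cv

# With refit=True (assumed default), the best candidate is fitted once more.
total_fits = cv_fits + 1

# Average number of boosting iterations per fit under uniform sampling.
mean_iterations = sum(cat_grid["iterations"]) / len(cat_grid["iterations"])

# Rough number of trees built during cross-validation alone.
expected_trees = cv_fits * mean_iterations

print(f"model fits: {total_fits}")
print(f"mean iterations per fit: {mean_iterations:.0f}")
print(f"approx. trees built during CV: {expected_trees:,.0f}")
```

Under these assumptions the search performs 1,251 model fits and builds on the order of 625,000 trees, some of them at depth 14, so a run that looks stuck may simply still be working.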

I am new to CatBoost. Does anyone know if I missed a parameter, or whether RandomizedSearchCV isn't fully supported for CatBoost?

There is 1 solution below

Found out why this wasn't working. Apparently the search could not cope with iterations above 500 in my setup, so capping it at 500 solved my issue.
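A minimal sketch of what such a capped search could look like is below. The helper name `make_capped_search`, the smaller depth list, and the reduced `n_candidates` and `cv` values are illustrative assumptions, not the exact settings used for the fix. Only the 500 cap on `iterations` comes from the answer above.

```python
from sklearn.model_selection import RandomizedSearchCV


def make_capped_search(estimator, max_iterations=500, n_candidates=20, cv=3, seed=0):
    """Build a RandomizedSearchCV whose 'iterations' values never exceed max_iterations.

    Hypothetical helper: all values other than the cap are illustrative.
    """
    param_distributions = {
        # 50, 100, ..., up to and including max_iterations
        "iterations": list(range(50, max_iterations + 1, 50)),
        # Shallower trees than the original 2..14 range (assumption)
        "depth": [4, 6, 8],
        "learning_rate": [0.01, 0.05, 0.1],
    }
    return RandomizedSearchCV(
        estimator=estimator,
        param_distributions=param_distributions,
        n_iter=n_candidates,  # far fewer candidates than the original 250
        cv=cv,
        random_state=seed,    # reproducible sampling of candidates
        verbose=1,
    )
```

It could then be used as `cat_model_rs = make_capped_search(CatBoostRegressor(verbose=0))` followed by `cat_model_rs.fit(Xtrain[:100000], ytrain[:100000])`. Passing `verbose=0` to `CatBoostRegressor` silences its per-iteration log, which otherwise floods the notebook output during a long search.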