No score method for MeanShift estimator - scikit-learn

I was trying to use GridSearchCV to search over different bandwidth values for the MeanShift algorithm, and it raises the error shown further down. Does anyone know how I can fix this? Thanks a lot!

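# Using GridSearchCV for algorithm tuning
from sklearn.cluster import MeanShift
from sklearn.model_selection import GridSearchCV

meanshift = MeanShift()
# Candidate bandwidth values for MeanShift
param_grid = {"bandwidth": range(48, 69)}

# scoring=None makes GridSearchCV fall back to the estimator's own score method
mean_grid = GridSearchCV(estimator=meanshift, param_grid=param_grid, scoring=None)

mean_grid.fit(X)  # X is the feature matrix, defined elsewhere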

And this is the error I get:

TypeError: If no scoring is specified, the estimator passed should have a 'score' method. The estimator MeanShift(bandwidth=None, bin_seeding=False, cluster_all=True, min_bin_freq=1,
     n_jobs=1, seeds=None) does not.

There are 2 best solutions below

1
On

This happens because the MeanShift algorithm does not implement a score method. In this case you have to pass scoring to GridSearchCV explicitly. The scikit-learn documentation has a complete list of the predefined scorers.

From the documentation of GridSearchCV:

Parameters:

estimator : estimator object.

This is assumed to implement the scikit-learn estimator interface. Either estimator needs to provide a score function, or scoring must be passed.
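
As a rough sketch of what passing scoring could look like: GridSearchCV also accepts a callable with the signature scorer(estimator, X, y). The example below scores each bandwidth by the silhouette coefficient of the labels predicted on the held-out fold. The synthetic data, the bandwidth values, and the helper name silhouette_scorer are assumptions made for illustration, not part of the original question, and silhouette is only one of several possible internal metrics.

```python
import numpy as np
from sklearn.cluster import MeanShift
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score
from sklearn.model_selection import GridSearchCV

# Synthetic, well-separated blobs standing in for the asker's X (assumption).
X, _ = make_blobs(
    n_samples=150,
    centers=[[0, 0], [10, 10], [-10, 10]],
    cluster_std=0.5,
    random_state=0,
)


def silhouette_scorer(estimator, X, y=None):
    """Hypothetical scorer: silhouette of the labels predicted on this fold."""
    # MeanShift.predict assigns each sample to the closest learned cluster center.
    labels = estimator.predict(X)
    n_labels = len(np.unique(labels))
    # The silhouette coefficient needs between 2 and n_samples - 1 distinct labels.
    if n_labels < 2 or n_labels >= len(X):
        return -1.0
    return silhouette_score(X, labels)


# Illustrative bandwidth values chosen for the synthetic data above.
param_grid = {"bandwidth": [1.0, 2.0, 4.0]}

grid = GridSearchCV(
    estimator=MeanShift(),
    param_grid=param_grid,
    scoring=silhouette_scorer,  # custom callable instead of scoring=None
    cv=3,
)
grid.fit(X)  # no y is needed for clustering

print(grid.best_params_, grid.best_score_)
```

With a scorer supplied, the TypeError goes away because GridSearchCV no longer looks for a score method on the estimator. Whether the silhouette coefficient is a sensible selection criterion depends on the data.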

0
On

Grid search does not work well with unsupervised methods.

The idea behind grid search is to choose the parameters that give the best score when predicting on held-out data. But since most clustering algorithms cannot predict on unseen data, this does not work.

Choosing "optimal" parameters in unsupervised learning is not that straightforward. That is why there is no easy automation like grid search available for it.
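
One manual alternative, sketched here under stated assumptions, is to skip cross-validation, fit each candidate on the full data set, and compare an internal clustering metric by hand. The synthetic data, the candidate values, and the use of the silhouette coefficient are illustrative choices, not something the original posts prescribe. estimate_bandwidth is scikit-learn's own heuristic for a starting bandwidth.

```python
import numpy as np
from sklearn.cluster import MeanShift, estimate_bandwidth
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Synthetic, well-separated blobs standing in for real data (assumption).
X, _ = make_blobs(
    n_samples=150,
    centers=[[0, 0], [10, 10], [-10, 10]],
    cluster_std=0.5,
    random_state=0,
)

# Heuristic starting point from scikit-learn, plus two hand-picked values.
start = estimate_bandwidth(X, quantile=0.2, random_state=0)
candidates = [start, 2.0, 4.0]

results = {}
for bw in candidates:
    # Fit on all of X and take the cluster labels for the same points.
    labels = MeanShift(bandwidth=bw).fit_predict(X)
    n_clusters = len(np.unique(labels))
    # Skip degenerate clusterings for which silhouette is undefined.
    if 2 <= n_clusters < len(X):
        results[bw] = silhouette_score(X, labels)

# Pick the bandwidth with the highest silhouette among the valid candidates.
best_bw = max(results, key=results.get)
print(best_bw, results[best_bw])
```

The loop only ranks the candidates by one metric. Inspecting the resulting clusters is still needed to judge whether the "best" bandwidth is meaningful.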