Error when trying to tune MLPClassifier hidden_layer_sizes using BayesSearchCV

When trying to tune the hidden_layer_sizes hyperparameter of sklearn's MLPClassifier using BayesSearchCV, I get an error: ValueError: can only convert an array of size 1 to a Python scalar.

However, when I use GridSearchCV, it works great! What am I missing?

Here is a reproducible example:

from skopt import BayesSearchCV
from skopt.space import Real, Categorical, Integer
from sklearn.datasets import load_iris
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split
from sklearn.model_selection import GridSearchCV

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y,
                                                    train_size=0.75,
                                                    random_state=0)
# this does not work!
opt_bs = BayesSearchCV(MLPClassifier(),
                       {'learning_rate_init': Real(0.001, 0.05),
                        'solver': Categorical(['adam', 'sgd']),
                        'hidden_layer_sizes': Categorical([(10, 5), (15, 10, 5)])},
                       n_iter=32,
                       random_state=0)

# this one does :)
opt_gs = GridSearchCV(MLPClassifier(),
                      {'learning_rate_init': [0.001, 0.05],
                       'solver': ['adam', 'sgd'],
                       'hidden_layer_sizes': [(10, 5), (15, 10, 5)]})

# executes optimization using opt_gs or opt_bs
opt = opt_bs
res = opt.fit(X_train, y_train)
opt

Produces:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-64-78e6d29cae99> in <module>()
     27 # executes optimization using opt_gs or opt_bs
     28 opt = opt_bs
---> 29 res = opt.fit(X_train, y_train)
     30 opt

/usr/local/lib/python3.6/dist-packages/skopt/searchcv.py in fit(self, X, y, groups, callback)
    678                 optim_result = self._step(
    679                     X, y, search_space, optimizer,
--> 680                     groups=groups, n_points=n_points_adjusted
    681                 )
    682                 n_iter -= n_points

/usr/local/lib/python3.6/dist-packages/skopt/searchcv.py in _step(self, X, y, search_space, optimizer, groups, n_points)
    553 
    554         # convert parameters to python native types
--> 555         params = [[np.array(v).item() for v in p] for p in params]
    556 
    557         # make lists into dictionaries

/usr/local/lib/python3.6/dist-packages/skopt/searchcv.py in <listcomp>(.0)
    553 
    554         # convert parameters to python native types
--> 555         params = [[np.array(v).item() for v in p] for p in params]
    556 
    557         # make lists into dictionaries

/usr/local/lib/python3.6/dist-packages/skopt/searchcv.py in <listcomp>(.0)
    553 
    554         # convert parameters to python native types
--> 555         params = [[np.array(v).item() for v in p] for p in params]
    556 
    557         # make lists into dictionaries

ValueError: can only convert an array of size 1 to a Python scalar

BEST ANSWER

Unfortunately, BayesSearchCV accepts search dimensions only as Categorical, Integer, or Real values. There is no issue with learning_rate_init and solver, which are cleanly defined as Real and Categorical respectively. The problem is hidden_layer_sizes, where each Categorical value is a tuple: BayesSearchCV is not yet equipped to handle tuple-valued search spaces (refer here for more details on this).
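You can see the failure mechanism in the traceback line params = [[np.array(v).item() for v in p] for p in params]: skopt tries to convert every sampled value to a single Python scalar, which cannot work when the value is a tuple of several layer sizes. A minimal sketch reproducing just that conversion (assuming only numpy is installed):

import numpy as np

# .item() only works on size-1 arrays; a tuple of layer sizes has size > 1
np.array((10, 5)).item()
# ValueError: can only convert an array of size 1 to a Python scalar

As a temporary hack, you can create your own wrapper around MLPClassifier so that each layer size is exposed as its own scalar parameter and the estimator's parameters are recognized properly. Please refer to the following code snippet for a sample: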

from skopt import BayesSearchCV
from skopt.space import Integer
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.base import BaseEstimator, ClassifierMixin

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y,
                                                    train_size=0.75,
                                                    random_state=0)

class MLPWrapper(BaseEstimator, ClassifierMixin):
    # Expose each layer size as its own scalar hyperparameter,
    # which BayesSearchCV can handle.
    def __init__(self, layer1=10, layer2=10, layer3=10):
        self.layer1 = layer1
        self.layer2 = layer2
        self.layer3 = layer3

    def fit(self, X, y):
        # Rebuild the tuple-valued hidden_layer_sizes from the scalars
        model = MLPClassifier(
            hidden_layer_sizes=[self.layer1, self.layer2, self.layer3]
        )
        model.fit(X, y)
        self.model = model
        return self

    def predict(self, X):
        return self.model.predict(X)

    def score(self, X, y):
        return self.model.score(X, y)


opt = BayesSearchCV(
    estimator=MLPWrapper(),
    search_spaces={
        'layer1': Integer(10, 100),
        'layer2': Integer(10, 100),
        'layer3': Integer(10, 100)
    },
    n_iter=11
)

opt.fit(X_train, y_train)
opt.score(X_test, y_test)
0.9736842105263158
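
Once fitted, the selected layer sizes can be read back through the standard scikit-learn search API (the printed values below are illustrative, not from an actual run):

print(opt.best_params_)
# e.g. {'layer1': 52, 'layer2': 27, 'layer3': 63}  -- values vary per run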

Note: This assumes an MLP network with exactly three hidden layers; you can modify it as per your need. It also becomes slightly tricky to create a class that constructs an MLP with an arbitrary number of layers, since every hyperparameter must still be a scalar.
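
For illustration, here is one possible sketch of such a variable-depth wrapper; the n_layers parameter, the class name, and the cap of three layers are assumptions of this sketch, not part of the accepted answer:

from skopt import BayesSearchCV
from skopt.space import Integer
from sklearn.base import BaseEstimator, ClassifierMixin
from sklearn.neural_network import MLPClassifier

class VariableMLPWrapper(BaseEstimator, ClassifierMixin):
    # Hypothetical wrapper: tunes depth (n_layers) and per-layer widths
    def __init__(self, n_layers=2, layer1=10, layer2=10, layer3=10):
        self.n_layers = n_layers
        self.layer1 = layer1
        self.layer2 = layer2
        self.layer3 = layer3

    def fit(self, X, y):
        # Keep only the first n_layers widths; the rest are ignored
        widths = [self.layer1, self.layer2, self.layer3][:self.n_layers]
        self.model = MLPClassifier(hidden_layer_sizes=widths)
        self.model.fit(X, y)
        return self

    def predict(self, X):
        return self.model.predict(X)

    def score(self, X, y):
        return self.model.score(X, y)

opt_var = BayesSearchCV(
    estimator=VariableMLPWrapper(),
    search_spaces={
        'n_layers': Integer(1, 3),
        'layer1': Integer(10, 100),
        'layer2': Integer(10, 100),
        'layer3': Integer(10, 100)
    },
    n_iter=11
)

One caveat of this design: the unused layer widths are still sampled by the optimizer, so part of the search budget is spent on parameters that have no effect when n_layers is small.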