Functional API/Multi Input model hyperparameter optimization


The only hyperparameter optimization library I've found that works with the Keras functional API is Talos.

https://github.com/autonomio/talos/blob/9890d71d31451af3d7e8d91a75841bc7904db958/docs/Examples_Multiple_Inputs.md

Does anyone know any others that would work?


There are 3 answers below.

Answer 1 (score: 1)
Talos doesn't seem to work with the current TensorFlow version. The simplest thing we can do instead is a plain nested-for-loop grid search, although we sacrifice the cross-validation part.
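A minimal sketch of that idea for a multi-input functional model. Everything here is illustrative: the two-input architecture, the hyperparameter values, and the data splits (X1_train, X2_train, y_train, X1_val, X2_val, y_val) are placeholders for your own setup.

import itertools

from tensorflow.keras.layers import Concatenate, Dense, Input
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam

def build_model(units, lr):
    # Hypothetical two-input functional model
    in_a = Input(shape=(10,))
    in_b = Input(shape=(5,))
    x = Concatenate()([in_a, in_b])
    x = Dense(units, activation='relu')(x)
    out = Dense(1, activation='sigmoid')(x)
    model = Model(inputs=[in_a, in_b], outputs=out)
    model.compile(optimizer=Adam(learning_rate=lr),
                  loss='binary_crossentropy', metrics=['accuracy'])
    return model

# Plain grid search with nested loops: no cross-validation,
# just a single held-out validation set.
best_acc, best_params = -1.0, None
for units, lr in itertools.product([32, 64, 128], [1e-2, 1e-3]):
    model = build_model(units, lr)
    model.fit([X1_train, X2_train], y_train,
              epochs=10, batch_size=32, verbose=0)
    _, acc = model.evaluate([X1_val, X2_val], y_val, verbose=0)
    if acc > best_acc:
        best_acc, best_params = acc, {'units': units, 'lr': lr}
print(best_acc, best_params)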

Answer 2 (score: 0)

With the Keras functional API you can actually get GridSearchCV to work. Since the grid search only works for Sequential models, you only need to wrap your model inside a Sequential model as a single layer:

from keras.applications import ResNet50
from keras.models import Sequential

model = Sequential()
model.add(ResNet50(include_top=False, pooling='avg'))

But do not forget to adjust your Keras model before adding it, and at the end add a Dense layer to map the results.

I have personally tried this and got GridSearch running this way.
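For completeness, a minimal sketch of that idea, assuming an image classification setup. The input shape, weights=None, the tiny param_grid, and the placeholder data names (X_images, y_labels) are illustrative assumptions, not part of the original answer.

from keras.applications import ResNet50
from keras.models import Sequential
from keras.layers import Dense
from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.model_selection import GridSearchCV

def build_model(optimizer='adam'):
    # The functional ResNet50 model becomes a single "layer" of a
    # Sequential model, which the sklearn wrapper accepts.
    model = Sequential()
    model.add(ResNet50(include_top=False, weights=None,
                       input_shape=(64, 64, 3), pooling='avg'))
    model.add(Dense(1, activation='sigmoid'))  # maps the features to the output
    model.compile(optimizer=optimizer, loss='binary_crossentropy',
                  metrics=['accuracy'])
    return model

estimator = KerasClassifier(build_fn=build_model, epochs=3, verbose=0)
grid = GridSearchCV(estimator,
                    param_grid={'optimizer': ['adam', 'rmsprop']}, cv=3)
# grid.fit(X_images, y_labels)  # placeholders: shapes (n, 64, 64, 3) and (n,)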

Answer 3 (score: 2)

You can perform Keras Hyperparameter Tuning using Sklearn Grid Search with Cross Validation.

To perform Grid Search with Sequential Keras models (single-input only), you must turn these models into sklearn-compatible estimators by using Keras Wrappers for the Scikit-Learn API.

An sklearn estimator is a class object with fit(X, y), predict(X) and score methods (and optionally a predict_proba method).

No need to do that from scratch: you can use Sequential Keras models as part of your Scikit-Learn workflow via one of the two wrappers from the keras.wrappers.scikit_learn package:

KerasClassifier(build_fn=None, **sk_params): which implements the Scikit-Learn classifier interface.
KerasRegressor(build_fn=None, **sk_params): which implements the Scikit-Learn regressor interface.

Arguments

build_fn: callable function or class instance that should construct, compile and return a Keras model, which will then be used to fit/predict.
sk_params: model parameters & fitting parameters.

Note that, like all other estimators in scikit-learn, build_fn should provide default values for its arguments, so that you can create the estimator without passing any values to sk_params.

Example: Simple binary classification with Keras

Implement Keras Model creator function

We want to fine-tune these hyperparameters: optimizer, dropout, init (the kernel initialization method) and dense_nparams.

These parameters must be defined in the signature of the create_model() function, with default values. You can add other hyperparameters if you want, such as learning_rate, etc.

binary_crossentropy is the appropriate loss for a two-class classification problem.

from keras.models import Sequential
from keras.layers import Dense, Dropout

# dense_layer_sizes also gets a default, so the estimator can be
# created without passing anything through sk_params
def create_model(dense_layer_sizes=(32,), optimizer='adam', dropout=0.1,
                 init='uniform', nbr_features=2500, dense_nparams=256):
    model = Sequential()
    model.add(Dense(dense_nparams, activation='relu',
                    input_shape=(nbr_features,), kernel_initializer=init))
    model.add(Dropout(dropout))
    for layer_size in dense_layer_sizes:
        model.add(Dense(layer_size, activation='relu'))
        model.add(Dropout(dropout))
    model.add(Dense(1, activation='sigmoid'))
    model.compile(loss='binary_crossentropy', optimizer=optimizer,
                  metrics=['accuracy'])
    return model

Create sklearn-like estimator

It's a classification problem, so we use the KerasClassifier wrapper.

from keras.wrappers.scikit_learn import KerasClassifier

keras_estimator = KerasClassifier(build_fn=create_model, verbose=1)

Defining the Hyperparameter Space

Here we define our hyperparameter space, including the Keras fit hyperparameters epochs and batch_size:

# define the grid search parameters
param_grid = {
    'epochs': [10, 100],
    'dense_nparams': [32, 256, 512],
    'init': ['uniform', 'zeros', 'normal'],
    'batch_size': [2, 16, 32],
    'optimizer': ['RMSprop', 'Adam', 'Adamax', 'sgd'],
    'dropout': [0.5, 0.4, 0.3, 0.2, 0.1, 0.0],
}
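A note on cost (not in the original answer): this grid has 2 × 3 × 3 × 3 × 4 × 6 = 1296 combinations, so with 5-fold cross-validation the search below trains 6480 models. Trim the grid if that is too expensive.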

And Finally, Performing Grid Search with KFold Cross-Validation

from sklearn.model_selection import GridSearchCV

kfold_splits = 5
grid = GridSearchCV(estimator=keras_estimator,
                    n_jobs=-1,
                    verbose=1,
                    return_train_score=True,
                    cv=kfold_splits,  # or StratifiedKFold(n_splits=kfold_splits, shuffle=True)
                    param_grid=param_grid)

grid_result = grid.fit(X, y)  # X, y: your training features and labels

# summarize results
print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_))
means = grid_result.cv_results_['mean_test_score']
stds = grid_result.cv_results_['std_test_score']
params = grid_result.cv_results_['params']
for mean, stdev, param in zip(means, stds, params):
    print("%f (%f) with: %r" % (mean, stdev, param))