I have this code for hyperparameter optimization with Optuna:
n_trials = 25

def objective(trial):
    params = {
        "n_estimators": trial.suggest_int("n_estimators", 100, 900),
        "max_depth": trial.suggest_int("max_depth", 5, 15),
        "min_samples_split": trial.suggest_int("min_samples_split", 2, 6)
    }
    rfc = RandomForestClassifier(**params)
    rfc.fit(X_train_res, y_train_res)
    y_pred = rfc.predict(X_test)
    score = metrics.r2_score(y_test, y_pred)
    return score

study = optuna.create_study(study_name="RandomForestRegressor", direction="maximize")
study.optimize(objective, n_trials)

print("Number of completed trials: {}".format(len(study.trials)))
print("Best trial:")
trial = study.best_trial
print("\tBest Score: {}".format(trial.value))
However, it fails with this error:
---> 19 study.optimize(objective, n_trials)
20
21 print("Number of completed trials: {}".format(len(study.trials)))
9 frames
/usr/local/lib/python3.7/dist-packages/pandas/core/series.py in __array__(self, dtype)
855 dtype='datetime64[ns]')
856 """
--> 857 return np.asarray(self._values, dtype)
858
859 # ----------------------------------------------------------------------
ValueError: could not convert string to float: 'Technology'
'Technology' is one of the values in my y_train_res.
Are you using sklearn.metrics.r2_score? I think r2_score doesn't accept an array of strings; the values must be numeric.
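A quick way to check this is to call the metric directly on string labels. The snippet below is a minimal sketch with made-up labels (`y_true` and `y_pred` are hypothetical, not your data). It assumes the target is categorical, in which case a classification metric such as `accuracy_score` is probably what you want in place of the regression metric `r2_score`.

```python
from sklearn.metrics import accuracy_score, r2_score

# Hypothetical string labels standing in for y_test and the model's predictions.
y_true = ["Technology", "Finance", "Technology", "Health"]
y_pred = ["Technology", "Finance", "Health", "Health"]

# r2_score is a regression metric and needs numeric targets,
# so string labels are expected to be rejected.
try:
    r2_score(y_true, y_pred)
    r2_failed = False
except (ValueError, TypeError) as exc:
    r2_failed = True
    print("r2_score failed:", exc)

# accuracy_score compares labels for equality, so strings are fine.
acc = accuracy_score(y_true, y_pred)
print("accuracy:", acc)  # 3 of 4 labels match -> 0.75
```

In the objective from the question, that would mean replacing `metrics.r2_score(y_test, y_pred)` with `metrics.accuracy_score(y_test, y_pred)`.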
https://github.com/scikit-learn/scikit-learn/blob/baf828ca126bcb2c0ad813226963621cafe38adb/sklearn/metrics/_regression.py#L805-L808