I am trying to use Ray Tune's implementation of BOHB to tune the hyperparameters of a PPO model. If I set the number of training iterations to e.g. 100 it runs fine - however, new hyperparameter values are already sampled after only one iteration of a trial. This makes the Bayesian search for new parameters somewhat pointless. Is there a way to define this initial search size, i.e. how many observations are collected before the model-based search kicks in?
My current setup is as follows:
import os

import numpy as np
from ray import air, tune
from ray.tune.schedulers import HyperBandForBOHB
from ray.tune.search.bohb import TuneBOHB

bohb_search = TuneBOHB(
    space=hyperparams,
    metric="episode_reward_mean",
    mode="max",
    bohb_config={
        # I assume the setting goes here, but I am unable to find
        # documentation on the allowed dict keys.
    },
)
bohb_search = tune.search.ConcurrencyLimiter(bohb_search, max_concurrent=2)
bohb_hyperband = HyperBandForBOHB(
    time_attr="training_iteration",
    max_t=TRAINING_ITERATIONS,
    reduction_factor=2,
    metric="episode_reward_mean",
    mode="max",
    stop_last_trials=False,
)
tuner = tune.Tuner(
    "PPO",
    run_config=air.RunConfig(
        name="BOHB_exp_1",
        storage_path=os.path.join("~", "ray_results", "tuning"),
        stop={"training_iteration": TRAINING_ITERATIONS},
    ),
    tune_config=tune.TuneConfig(
        search_alg=bohb_search,
        scheduler=bohb_hyperband,
        num_samples=NUM_SAMPLES,
    ),
    param_space={
        "env": "biopharma_env",
        "model": {
            "custom_model": "action_mask_model",
            "vf_share_layers": True,
        },
        "framework": "tf2",
        "eager_tracing": True,
        "use_kl_loss": False,
        "num_gpus": 0,
        "num_rollout_workers": 3,
        "vf_clip_param": np.inf,
        "train_batch_size": 2048,
    },
)
I am guessing that the setting I am looking for should be passed via bohb_config in the TuneBOHB constructor, but I haven't been able to find documentation for the allowed keys (neither in the Ray Tune nor in the HpBandSter documentation). A sketch of what I have in mind is shown below.
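For illustration, this is the kind of thing I have been trying. It is purely a guess on my part that bohb_config is forwarded as keyword arguments to HpBandSter's BOHB config generator, and that a key such as min_points_in_model (the number of finished observations required before model-based sampling starts) is accepted there:

bohb_search = TuneBOHB(
    space=hyperparams,
    metric="episode_reward_mean",
    mode="max",
    bohb_config={
        # Assumed keys, guessed from HpBandSter's BOHB config generator:
        "min_points_in_model": 10,  # observations to collect before model-based sampling (assumption)
        "random_fraction": 1 / 3,   # fraction of configs still drawn at random (assumption)
    },
)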
Does anyone know how I can specify this setting?