Saving the model and checkpointing for algorithm Trainers in ray-rllib


Does anyone know how I can do checkpointing and save the model for algorithm-Trainer models in ray-rllib?

I know this is available for ray.tune, but it seems that it is not directly possible for the rllib algorithms.

1 Answer

The Trainer class has a save_checkpoint method as well as a load_checkpoint one, inherited from Tune's Trainable. This is how they look in the RLlib source:

@override(Trainable)
def save_checkpoint(self, checkpoint_dir: str) -> str:
    # Serialize the trainer's full state (weights, config, counters)
    # into a pickle file named after the current training iteration.
    checkpoint_path = os.path.join(
        checkpoint_dir, "checkpoint-{}".format(self.iteration)
    )
    pickle.dump(self.__getstate__(), open(checkpoint_path, "wb"))

    return checkpoint_path

@override(Trainable)
def load_checkpoint(self, checkpoint_path: str) -> None:
    # Restore the trainer's state from a previously saved pickle file.
    extra_data = pickle.load(open(checkpoint_path, "rb"))
    self.__setstate__(extra_data)
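
In practice you normally don't call these directly: the public save() and restore() methods inherited from Trainable wrap them, so you can checkpoint a trainer outside of Tune. Here is a minimal sketch; PPO, CartPole-v1, and the /tmp/rllib_checkpoints directory are just illustrative choices, not required by the API:

import ray
from ray.rllib.agents.ppo import PPOTrainer

ray.init()

# Build a trainer for an example environment.
trainer = PPOTrainer(config={"env": "CartPole-v1", "num_workers": 1})

# Train for a few iterations, saving a checkpoint after each one.
# save() calls save_checkpoint under the hood and returns the path.
for i in range(3):
    trainer.train()
    checkpoint_path = trainer.save("/tmp/rllib_checkpoints")
    print("iteration {}: saved checkpoint to {}".format(i, checkpoint_path))

# Later, restore a fresh trainer from the last checkpoint.
# restore() calls load_checkpoint under the hood.
new_trainer = PPOTrainer(config={"env": "CartPole-v1", "num_workers": 1})
new_trainer.restore(checkpoint_path)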