Instantiating a `Trainer` object from the `transformers` library requires passing a model (or a `model_init` function that returns one). Suppose I instantiate a `Trainer` with a model and then call `trainer.train(resume_from_checkpoint="path/to/checkpoints")`. The docs say that "training will resume from the model/optimizer/scheduler states loaded here." But what happens to the original model I passed to the `Trainer` constructor? Does it even matter what I pass? Or am I misinterpreting the docs: is the model passed to the constructor what actually gets trained, with the checkpoint only determining how many epochs remain, and so on?
Similarly, if the checkpoint was saved with a different set of `TrainingArguments` than those passed to the `Trainer` constructor, which takes precedence? Do I even need to pass `TrainingArguments` to the `Trainer` when resuming from a checkpoint?
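To make the question concrete, here is a minimal sketch of the situation I'm asking about. The model name, `output_dir`, and checkpoint path are all placeholders, and the dataset arguments are omitted for brevity:

```python
def resume_training(checkpoint_dir: str) -> None:
    """Sketch of the resume scenario: which of these inputs actually matter?"""
    # Imports are kept inside the function so the sketch stays self-contained
    # and the heavy setup only runs when the function is called.
    from transformers import (
        AutoModelForSequenceClassification,
        Trainer,
        TrainingArguments,
    )

    # A freshly initialized model is passed to the constructor...
    model = AutoModelForSequenceClassification.from_pretrained(
        "bert-base-uncased"  # placeholder model
    )

    # ...along with TrainingArguments that may differ from the ones
    # that were in effect when the checkpoint was saved.
    args = TrainingArguments(output_dir="out", num_train_epochs=3)

    trainer = Trainer(model=model, args=args)

    # Question: are the weights of `model` overwritten by the checkpoint's
    # weights here, or is `model` what actually gets trained, with the
    # checkpoint only restoring optimizer/scheduler state and progress?
    trainer.train(resume_from_checkpoint=checkpoint_dir)
```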