Hugging Face - What is the difference between epochs in optimizer and TrainingArguments?


Following the image classification tutorial, the number of epochs is set in two places. How are the two related, and should the epoch count be the same in both? In other TF models, I've only seen epochs passed to model.fit_generator().

from transformers import create_optimizer

batch_size = 16
num_epochs = 3
# Total optimizer steps = batches per epoch * number of epochs.
# len(dataset["train"]) counts examples, so divide by the batch size
# to get the number of steps per epoch.
num_train_steps = (len(dataset["train"]) // batch_size) * num_epochs
learning_rate = 3e-5
weight_decay_rate = 0.01

optimizer, lr_schedule = create_optimizer(
    init_lr=learning_rate,
    num_train_steps=num_train_steps,  # drives the step-based learning rate decay
    weight_decay_rate=weight_decay_rate,
    num_warmup_steps=0,
)
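
For context, my understanding of the Keras path is that this optimizer goes into model.compile() while the epoch count is passed separately to model.fit(); here is a minimal sketch, assuming the tutorial's model and batched tf_train_dataset / tf_eval_dataset objects (those names are my placeholders):

# Sketch of the Keras training path (my assumption, based on the tutorial's flow):
# the optimizer/schedule from create_optimizer() is compiled into the model,
# and the epoch count is passed separately to fit().
model.compile(optimizer=optimizer)  # HF TF models can compute loss internally
model.fit(
    tf_train_dataset,               # batched tf.data.Dataset (placeholder name)
    validation_data=tf_eval_dataset,
    epochs=num_epochs,              # same count used to size num_train_steps above
)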

from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="my_awesome_food_model",
    remove_unused_columns=False,
    evaluation_strategy="epoch",
    save_strategy="epoch",
    learning_rate=5e-5,
    per_device_train_batch_size=16,
    gradient_accumulation_steps=4,
    per_device_eval_batch_size=16,
    num_train_epochs=3,  # the Trainer derives its total step count from this
    warmup_ratio=0.1,
    logging_steps=10,
    load_best_model_at_end=True,
    metric_for_best_model="accuracy",
    push_to_hub=False,
)
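
And this is how those arguments feed the Trainer; a minimal sketch, assuming the tutorial's model, dataset splits, image_processor, and compute_metrics function:

from transformers import Trainer

# Sketch of the PyTorch Trainer path: the Trainer builds its own optimizer and
# learning rate schedule internally from TrainingArguments, so the
# create_optimizer() output above is not used on this path.
trainer = Trainer(
    model=model,                      # tutorial's image classification model
    args=training_args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["test"],     # assuming a "test" split as in the tutorial
    tokenizer=image_processor,        # image processor from the tutorial
    compute_metrics=compute_metrics,  # tutorial's accuracy metric function
)
trainer.train()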

https://huggingface.co/docs/transformers/tasks/image_classification#preprocess

I don't understand why I need to pass num_train_steps to create_optimizer at all. Couldn't it be calculated from num_train_epochs, along the lines of the sketch below?
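
This is the derivation I would have expected to happen automatically (a sketch of the arithmetic I have in mind, not actual library code):

# My assumption, not Trainer internals: on a single device, one optimizer
# update consumes per_device_train_batch_size * gradient_accumulation_steps
# examples, so the total step count follows from the epoch count.
effective_batch_size = 16 * 4                # batch size * accumulation steps
updates_per_epoch = len(dataset["train"]) // effective_batch_size
total_train_steps = updates_per_epoch * 3    # num_train_epochs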
