Following the image classification tutorial, I see two places where the number of epochs is set. How are these related, and should the epoch count be the same in both? In other TF models, I've only ever seen epochs passed as a model.fit_generator() parameter.
from transformers import create_optimizer

batch_size = 16
num_epochs = 3
# Total optimizer steps = batches per epoch * number of epochs.
# (The tutorial's snippet uses len(dataset["train"]) * num_epochs, which
# counts examples rather than batches, so I divide by batch_size here.)
num_train_steps = (len(dataset["train"]) // batch_size) * num_epochs
learning_rate = 3e-5
weight_decay_rate = 0.01

optimizer, lr_schedule = create_optimizer(
    init_lr=learning_rate,
    num_train_steps=num_train_steps,
    weight_decay_rate=weight_decay_rate,
    num_warmup_steps=0,
)
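For context, here's roughly how these values get used in the Keras path (a sketch, assuming model is the TF model from the tutorial and tf_train_set / tf_eval_set are the tf.data datasets it builds; those two names are mine):

model.compile(optimizer=optimizer)  # transformers TF models can compute loss internally

# The epoch count appears a second time here: fit() runs num_epochs epochs,
# which should line up with the schedule length baked into the optimizer above.
model.fit(
    tf_train_set,
    validation_data=tf_eval_set,
    epochs=num_epochs,
)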
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="my_awesome_food_model",
    remove_unused_columns=False,
    evaluation_strategy="epoch",
    save_strategy="epoch",
    learning_rate=5e-5,
    per_device_train_batch_size=16,
    gradient_accumulation_steps=4,
    per_device_eval_batch_size=16,
    num_train_epochs=3,
    warmup_ratio=0.1,
    logging_steps=10,
    load_best_model_at_end=True,
    metric_for_best_model="accuracy",
    push_to_hub=False,
)
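And here's how those arguments get consumed in the Trainer path from the same tutorial (a sketch; model, food, image_processor, data_collator, and compute_metrics are the objects defined earlier in the tutorial):

from transformers import Trainer

trainer = Trainer(
    model=model,
    args=training_args,
    data_collator=data_collator,
    train_dataset=food["train"],
    eval_dataset=food["test"],
    tokenizer=image_processor,
    compute_metrics=compute_metrics,
)

# No num_train_steps here: Trainer derives the total step count from
# num_train_epochs, the dataset length, batch size, and gradient accumulation.
trainer.train()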
https://huggingface.co/docs/transformers/tasks/image_classification#preprocess
I don't understand why I need to pass num_train_steps explicitly. Wouldn't that be calculated from num_train_epochs?
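The closest I've come to an explanation is that the schedule object needs its total length at construction time, before fit() ever sees an epoch count. This sanity check (hypothetical, reusing the variables above) shows the decay is already fixed by num_train_steps:

# lr_schedule is a callable tf.keras LearningRateSchedule: calling it with a
# step index returns the learning rate at that step.
for step in (0, num_train_steps // 2, num_train_steps - 1):
    print(step, float(lr_schedule(step)))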