I am using fairseq (version: 1.0.0a0+14c5bd0) to fine-tune a model as per this link. However, many of the parameters it uses appear neither in the docs nor in the output of fairseq-train --help. Examples include:
--warmup-updates
--encoder-normalize-before
--label-smoothing
Are they replaced by some other params?
When you train a model, you can pass general training parameters (documented in the CLI help) or component-specific parameters. You often need to find the latter via the search bar at the top left of the documentation site.
Concerning the specific ones you highlighted, some are documented with their components in the documentation:
--warmup-updates
is an attribute of the learning rate scheduler (doc)
--encoder-normalize-before
is a Transformer model parameter (doc)
And some are documented only in the code (if at all):
--label-smoothing
is a parameter of the label-smoothed cross-entropy loss (code)
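To see why these flags don't show up in a plain --help listing: fairseq components (criterions, LR schedulers, model architectures) contribute their own flags to the argument parser, so a flag only exists once its component is selected. Here is a minimal, self-contained sketch of that registration pattern using stdlib argparse; the class names and defaults below are illustrative, not fairseq's actual code.

```python
import argparse

# Illustrative sketch of component-registered CLI flags (not fairseq's
# actual classes): each component exposes an add_args hook, and only the
# components you select contribute their flags to the parser.

class LabelSmoothedCrossEntropy:
    @staticmethod
    def add_args(parser):
        # flag documented with the criterion, not in the general help
        parser.add_argument('--label-smoothing', type=float, default=0.0,
                            help='epsilon for label smoothing')

class InverseSqrtScheduler:
    @staticmethod
    def add_args(parser):
        # flag documented with the LR scheduler
        parser.add_argument('--warmup-updates', type=int, default=4000,
                            help='warm up the learning rate for this many updates')

parser = argparse.ArgumentParser()
for component in (LabelSmoothedCrossEntropy, InverseSqrtScheduler):
    component.add_args(parser)

args = parser.parse_args(['--label-smoothing', '0.1', '--warmup-updates', '500'])
print(args.label_smoothing, args.warmup_updates)
```

This is why searching the documentation (or the source) for the component name is often the fastest way to find where a flag is defined.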