Early stopping based on BLEU in FairSeq


My goal is to use BLEU as the early-stopping metric while training a translation model in FairSeq.

Following the documentation, I am adding the following arguments to my training script:

--eval-bleu --eval-bleu-args --eval-bleu-detok --eval-bleu-remove-bpe

I am getting the following error:

fairseq-train: error: unrecognized arguments: --eval-bleu --eval-bleu-args --eval-bleu-detok --eval-bleu-remove-bpe

System information:

  • fairseq version: 0.10.2
  • torch: 1.10.1+cu113

More Details:

When I try to fine-tune the M2M100 model, I get the following error:

KeyError: 'bleu'

when using the following command:

CUDA_VISIBLE_DEVICES=0,1,2,3 fairseq-train \
    $path_2_data --ddp-backend=no_c10d \
    --best-checkpoint-metric bleu \
    --maximize-best-checkpoint-metric \
    --max-tokens 2048 --no-epoch-checkpoints \
    --finetune-from-model $pretrained_model \
    --save-dir $checkpoint --task translation_multi_simple_epoch \
    --encoder-normalize-before \
    --langs 'af,am,ar,ast,az,ba,be,bg,bn,br,bs,ca,ceb,cs,cy,da,de,el,en,es,et,fa,ff,fi,fr,fy,ga,gd,gl,gu,ha,he,hi,hr,ht,hu,hy,id,ig,ilo,is,it,ja,jv,ka,kk,km,kn,ko,lb,lg,ln,lo,lt,lv,mg,mk,ml,mn,mr,ms,my,ne,nl,no,ns,oc,or,pa,pl,ps,pt,ro,ru,sd,si,sk,sl,so,sq,sr,ss,su,sv,sw,ta,th,tl,tn,tr,uk,ur,uz,vi,wo,xh,yi,yo,zh,zu' \
    --lang-pairs $lang_pairs \
    --decoder-normalize-before --sampling-method temperature \
    --sampling-temperature 1.5 --encoder-langtok src \
    --decoder-langtok --criterion label_smoothed_cross_entropy \
    --label-smoothing 0.2 --optimizer adam --adam-eps 1e-06 \
    --adam-betas '(0.9, 0.98)' --lr-scheduler inverse_sqrt \
    --lr 3e-05 --warmup-updates 2500 --max-update 400000 \
    --dropout 0.3 --attention-dropout 0.1 \
    --weight-decay 0.0 --update-freq 2 --save-interval 1 \
    --save-interval-updates 5000 --keep-interval-updates 10 \
    --seed 222 --log-format simple --log-interval 2 --patience 5  \
    --arch transformer_wmt_en_de_big --encoder-layers 24 \
    --decoder-layers 24 --encoder-ffn-embed-dim 8192 \
    --decoder-ffn-embed-dim 8192 --encoder-layerdrop 0.05 \
    --decoder-layerdrop 0.05 --share-decoder-input-output-embed \
    --share-all-embeddings --fixed-dictionary $fix_dict --fp16 \
    --skip-invalid-size-inputs-valid-test

Accepted answer:

The task you are using, translation_multi_simple_epoch, does not have these arguments; they are specific to the translation task.

Note that some of the arguments that you are using require values.

  • --eval-bleu-args expects a JSON string with generation arguments used when computing validation BLEU (for example the beam size). If you just want the default 4-gram BLEU with default generation settings, you can skip it entirely (see the snippet after this list).

  • --eval-bleu-detok expects a specification of how the model output should be detokenized before scoring. The default value is space, which performs no detokenization.
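
For concreteness, here is a hedged sketch of how these two flags look with explicit values; the beam settings and the moses detokenizer are illustrative choices taken from the FairSeq translation example, not the only valid ones:

    --eval-bleu-args '{"beam": 5, "max_len_a": 1.2, "max_len_b": 10}' \
    --eval-bleu-detok moses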

For more details, see the documentation of the translation task in FairSeq.
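
Putting it together, a minimal sketch of a fairseq-train call that selects checkpoints (and stops early) based on validation BLEU. It assumes the plain translation task; the data path, architecture, and generation arguments are placeholders you would replace with your own:

    fairseq-train \
        data-bin/my-corpus --task translation \
        --arch transformer --share-all-embeddings \
        --optimizer adam --lr 3e-05 --lr-scheduler inverse_sqrt \
        --warmup-updates 2500 --max-tokens 2048 \
        --criterion label_smoothed_cross_entropy --label-smoothing 0.2 \
        --eval-bleu \
        --eval-bleu-args '{"beam": 5, "max_len_a": 1.2, "max_len_b": 10}' \
        --eval-bleu-detok moses \
        --eval-bleu-remove-bpe \
        --best-checkpoint-metric bleu \
        --maximize-best-checkpoint-metric \
        --patience 5

With --patience 5, training stops once validation BLEU has not improved for five consecutive validation runs, which gives the early-stopping behavior asked for above.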