GIZA++: Forbidden zero sentence length 0


I have been using GIZA++ for sentence translation. When I ran it on my test dataset, it displayed the error "ERROR: Forbidden zero sentence length 0". Is there any way to avoid this error?

Best answer

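I had the same problem with the English-Vietnamese (en-vi) corpus. As the message says, at least one sentence in the corpus has length zero, which usually means the data contains empty lines or is otherwise not clean. Overly long sentences can also cause problems.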

You should clean up your corpus data.

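This is the command with the Moses tools. It keeps only sentence pairs where both sides have between 1 and 80 tokens, so empty lines are dropped as well as overly long ones. It reads ~/corpus/train.en and ~/corpus/train.vi and writes ~/corpus/train.clean.en and ~/corpus/train.clean.vi.

~/mosesdecoder/scripts/training/clean-corpus-n.perl \
    ~/corpus/train en vi \
    ~/corpus/train.clean 1 80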

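Or you can fix the corpus manually.

Remove empty lines, together with the matching line in the other language so the two files stay aligned, and try to cut each line down to fewer than 80 words or 100 characters.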
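
If you would rather do the manual cleanup with a script, here is a minimal Python sketch of the idea. It is not a reimplementation of clean-corpus-n.perl; the function names `clean_pairs` and `clean_files` and the defaults are my own assumptions for illustration. It drops any sentence pair where either side is empty or longer than 80 whitespace-separated tokens, and keeps the two sides aligned.

```python
from typing import Iterable, Iterator, Tuple


def clean_pairs(
    src_lines: Iterable[str],
    tgt_lines: Iterable[str],
    min_len: int = 1,
    max_len: int = 80,
) -> Iterator[Tuple[str, str]]:
    """Yield aligned (source, target) pairs whose token counts are in range.

    Hypothetical helper: a pair is dropped if either side has fewer than
    min_len or more than max_len whitespace-separated tokens.
    """
    for src, tgt in zip(src_lines, tgt_lines):
        src_tokens = src.split()
        tgt_tokens = tgt.split()
        src_ok = min_len <= len(src_tokens) <= max_len
        tgt_ok = min_len <= len(tgt_tokens) <= max_len
        if src_ok and tgt_ok:
            # Re-join tokens so stray or repeated whitespace is normalized.
            yield " ".join(src_tokens), " ".join(tgt_tokens)


def clean_files(src_path: str, tgt_path: str, out_src: str, out_tgt: str) -> int:
    """Filter two line-aligned files and return the number of pairs kept."""
    kept = 0
    with open(src_path, encoding="utf-8") as fs, \
         open(tgt_path, encoding="utf-8") as ft, \
         open(out_src, "w", encoding="utf-8") as os_, \
         open(out_tgt, "w", encoding="utf-8") as ot:
        for src, tgt in clean_pairs(fs, ft):
            os_.write(src + "\n")
            ot.write(tgt + "\n")
            kept += 1
    return kept
```

For example, `clean_files("train.en", "train.vi", "train.clean.en", "train.clean.vi")` would write the filtered files next to the originals (the file names here are just placeholders). Filtering both sides in one pass matters: deleting empty lines from each file separately would shift the alignment between the two languages.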