Building a language model with SRILM


I want to build a 3-gram model. A word can be pronounced in several ways, so I want to add both the mispronounced and the correct form of each word to the LM, like this:

-1.5051 CHE&3NE FOURCHU
-1.5051 CHE&3NE FOURCHU*,
-0.1072 CHE&3NE FOURCHU,
-1.5051 CHE&3NE LE
-0.7782 CHE&3NE* FOURCHU
-0.7782 CHE&3NE* FOURCHU,
-0.7782 CHE&3NE* SUR

CHE&3NE* and FOURCHU* are words with wrong pronunciations. Can anyone help me do this in SRILM?

There is 1 answer below

Create a text file containing the phrases you need, one per line, either in a text editor or with a script. If you want to introduce mistakes randomly, you can write the script in Python or another scripting language:

CHE&3NE FOURCHU
CHE&3NE FOURCHU*,
CHE&3NE FOURCHU,
CHE&3NE LE
CHE&3NE* FOURCHU
CHE&3NE* FOURCHU,
CHE&3NE* SUR
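
If you go the script route, something like the following could generate the training text. It is only a sketch under a few assumptions: the mispronounced form is written as the word plus `*` (as in the question), the marker goes before any trailing punctuation, and the names `mark_word`, `corrupt`, `build_corpus`, the `error_rate` and `copies` parameters, and the sample sentences are all made up for illustration.

```python
import random

PUNCT = ",.;:!?"  # assumption: trailing punctuation stays after the marker


def mark_word(word, marker="*"):
    """Return the mispronounced variant, e.g. 'FOURCHU,' -> 'FOURCHU*,'."""
    core = word.rstrip(PUNCT)
    tail = word[len(core):]
    return core + marker + tail if core else word


def corrupt(sentence, rng, error_rate=0.3, marker="*"):
    """Mark each word as mispronounced with probability error_rate."""
    return " ".join(
        mark_word(w, marker) if rng.random() < error_rate else w
        for w in sentence.split()
    )


def build_corpus(sentences, copies=3, error_rate=0.3, seed=0):
    """Keep every clean sentence and add `copies` randomly corrupted versions."""
    rng = random.Random(seed)
    lines = []
    for sentence in sentences:
        lines.append(sentence)
        lines.extend(corrupt(sentence, rng, error_rate) for _ in range(copies))
    return lines


if __name__ == "__main__":
    # Hypothetical clean phrases; replace them with your own corpus.
    clean = ["CHE&3NE FOURCHU, CHE&3NE LE", "CHE&3NE SUR"]
    for line in build_corpus(clean):
        print(line)
```

Saved under a name of your choice (say `make_corpus.py`), running `python make_corpus.py > your.txt` would produce the text file used in the next step. Raising `error_rate` or `copies` gives the starred variants more probability mass in the resulting model.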

Run the training command

ngram-count -text your.txt -lm your.lm

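This creates the language model you need and writes it to your.lm in ARPA format. ngram-count builds 3-gram models by default; pass -order to change that.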
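
To check that the starred variants ended up in the model, you can look at the bigram section of the ARPA file. The helper below is a rough sketch, not part of SRILM: `read_ngrams` is a made-up name, and it assumes the standard ARPA layout (a `\2-grams:` header followed by lines of the form log-probability, words, optional backoff weight).

```python
def read_ngrams(lines, order):
    """Collect (words, log10_prob) pairs from one section of an ARPA file."""
    header = "\\%d-grams:" % order
    entries = []
    in_section = False
    for raw in lines:
        line = raw.strip()
        if line == header:
            in_section = True
            continue
        if not in_section or not line:
            continue
        if line.startswith("\\"):
            break  # next section header or \end\ reached
        fields = line.split()
        # fields[0] is the log probability; a trailing backoff weight is ignored
        entries.append((tuple(fields[1:1 + order]), float(fields[0])))
    return entries
```

For example, `read_ngrams(open("your.lm", encoding="utf-8"), 2)` would return the bigrams with their log probabilities, which you can then filter for entries containing `*`.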