Adding new language mappings in CLD2


We're looking to add languages and pseudo-languages to CLD2. This is mostly, though not only, to support Romanized forms such as Hinglish (Hindi written in Latin script) or translit (Cyrillic-script text written in Latin script). (Yes, we know CLD3 supports these; it's not applicable in our case.)

It seems that we need to add a set of strings mapped to probabilities. cld_generated_cjk_delta_bi_32.cc and cld_generated_cjk_uni_prop_80.cc seem to contain some sort of mappings, but it's unclear what exactly they map.
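
To make the question concrete, here is a toy sketch of what we mean by "strings mapped to probabilities". It is not CLD2's actual table format (that is exactly what we're unsure about). The table contents, the `hi-Latn` label, the `UNSEEN` penalty, and the scoring rule are all made up for illustration.

```python
# Hypothetical toy table: trigram -> {language label: log-probability}.
# All values are invented; this is NOT how CLD2 stores its data.
TOY_TABLE = {
    "hai": {"hi-Latn": -1.0, "en": -6.0},
    "kya": {"hi-Latn": -1.5, "en": -7.0},
    "the": {"hi-Latn": -5.0, "en": -1.0},
    "ing": {"hi-Latn": -6.0, "en": -1.2},
}

# Assumed penalty for a language that has no entry for a known trigram.
UNSEEN = -8.0


def trigrams(text):
    """Return all overlapping 3-character substrings of the lowercased text."""
    s = text.lower()
    return [s[i:i + 3] for i in range(len(s) - 2)]


def score(text, table=TOY_TABLE):
    """Sum per-language log-probabilities over the trigrams found in the table."""
    langs = {lang for probs in table.values() for lang in probs}
    totals = {lang: 0.0 for lang in langs}
    for gram in trigrams(text):
        probs = table.get(gram)
        if probs is None:
            continue  # trigram not in the table: contributes nothing
        for lang in langs:
            totals[lang] += probs.get(lang, UNSEEN)
    return totals


def detect(text, table=TOY_TABLE):
    """Return the language label with the highest total score."""
    totals = score(text, table)
    return max(totals, key=totals.get)
```

With this toy table, `detect("kya hai")` picks the Romanized-Hindi label and `detect("the thing")` picks English. What we can't tell is how an equivalent mapping is encoded in the generated .cc files.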

Any ideas?
