How do I combine the ICU4X word segmenter with an additional dictionary?


The ICU4X icu_segmenter::WordSegmenter seems like the best word segmenter available.

I don't understand how data providers work with word segmentation. The setup seems very complicated to me, and I couldn't find any examples.

I need it for Thai. I assume it uses the LSTM segmenter by default. Out of the box, it is better than anything I've seen before, but it still has trouble with many unusual names, which is why I'd like to add my own dictionary to it for personal use.

How can I do that?
