BreakIterator API Java

The documentation for BreakIterator.getWordInstance() shows an overload that takes a Locale parameter, presumably because the results can vary between locales for the factory methods (getWordInstance, getLineInstance, getSentenceInstance, getCharacterInstance).

But when I omit this parameter, I get the same results as when I pass any Locale from getAvailableLocales().

Is there some pattern, String, or Locale that actually causes these methods to give different results?

There is 1 best solution below

I believe all "western" languages share the same rules.

A cursory scan shows that locale th (Thai) has its own rules, given in the file /sun/text/resources/th/WordBreakIteratorData_th inside .../jre/lib/ext/localedata.jar.

It's a binary file, so I can't read what it says, and even if I could decode it, I don't know Thai, so I still wouldn't understand the rules.
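One way to check this empirically is to run the same text through the word iterator for every available locale and report which ones disagree with the root locale. The sketch below does that. The class name BreakDemo, the helper boundaries(), and the Thai sample string are my own choices for illustration, not anything from the JDK. Whether Thai actually shows up as different depends on the JDK version and on whether the locale data is installed (on newer JDKs, I believe it ships in the jdk.localedata module), so I'm not claiming a specific output for it.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class BreakDemo {

    // Collects every boundary offset the word iterator reports for the given locale.
    public static List<Integer> boundaries(String text, Locale locale) {
        BreakIterator it = BreakIterator.getWordInstance(locale);
        it.setText(text);
        List<Integer> result = new ArrayList<>();
        for (int pos = it.first(); pos != BreakIterator.DONE; pos = it.next()) {
            result.add(pos);
        }
        return result;
    }

    public static void main(String[] args) {
        // Arbitrary Thai sample ("Thai language"); Thai is written without spaces between words.
        String sample = "ภาษาไทย";

        // Baseline: the locale-independent root rules.
        List<Integer> baseline = boundaries(sample, Locale.ROOT);
        System.out.println("root: " + baseline);

        // Print only the locales whose boundaries differ from the baseline.
        for (Locale locale : BreakIterator.getAvailableLocales()) {
            List<Integer> found = boundaries(sample, locale);
            if (!found.equals(baseline)) {
                System.out.println(locale.toLanguageTag() + ": " + found);
            }
        }
    }
}
```

If nothing besides the root line is printed, either the Thai rules are not available in that runtime or the sample text doesn't trigger them. Plain space-separated English text, by contrast, should give the same boundaries in every locale, which would explain the behaviour described in the question.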