Java BreakIterator With Parentheses


Using a Java BreakIterator, I am able to extract words from a string. However, in the following string, which uses parentheses to indicate that a word could be plural, each parenthesis is recognized as its own word.

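import java.text.BreakIterator;
import java.util.Locale;

String test = "Please enter the number of dependent(s).";

// Word-boundary iterator for US English
BreakIterator iterator = BreakIterator.getWordInstance(Locale.US);
iterator.setText(test);

// Walk consecutive boundaries and print the text between each pair
int start = iterator.first();
for (int end = iterator.next(); end != BreakIterator.DONE; start = end, end = iterator.next()) {
    System.out.println(test.substring(start, end));
}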

Outputs:

Please
 
enter
 
the
 
number
 
of
 
dependent
(
s
)
.

When I would expect:

Please
 
enter
 
the
 
number
 
of
 
dependent(s)
.

Is it possible to use a custom implementation of a break iterator so that a word with an "optional plural" is in fact treated as one word?
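
One possible workaround, sketched here as an assumption rather than a true custom BreakIterator, is to keep the standard word instance and merge its tokens afterwards: whenever a run of letters is immediately followed by "(", another run of letters, and ")", the four pieces are glued back into one word. The class name OptionalPluralTokenizer and the helpers tokenize and isLetters are hypothetical names chosen for illustration, and the sketch relies on the iterator splitting "dependent(s)" exactly as in the output shown above.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class OptionalPluralTokenizer {

    /** Splits text with the standard word BreakIterator, then re-joins word + "(" + letters + ")". */
    public static List<String> tokenize(String text) {
        BreakIterator iterator = BreakIterator.getWordInstance(Locale.US);
        iterator.setText(text);

        // Step 1: collect the raw tokens exactly as the iterator reports them.
        List<String> raw = new ArrayList<>();
        int start = iterator.first();
        for (int end = iterator.next(); end != BreakIterator.DONE; start = end, end = iterator.next()) {
            raw.add(text.substring(start, end));
        }

        // Step 2: merge the four-token pattern [letters, "(", letters, ")"] into one token.
        List<String> merged = new ArrayList<>();
        int i = 0;
        while (i < raw.size()) {
            boolean optionalPlural = i + 3 < raw.size()
                    && isLetters(raw.get(i))
                    && raw.get(i + 1).equals("(")
                    && isLetters(raw.get(i + 2))
                    && raw.get(i + 3).equals(")");
            if (optionalPlural) {
                merged.add(raw.get(i) + "(" + raw.get(i + 2) + ")");
                i += 4;
            } else {
                merged.add(raw.get(i));
                i++;
            }
        }
        return merged;
    }

    /** True when the token is non-empty and consists only of letters. */
    private static boolean isLetters(String s) {
        return !s.isEmpty() && s.chars().allMatch(Character::isLetter);
    }

    public static void main(String[] args) {
        for (String token : tokenize("Please enter the number of dependent(s).")) {
            System.out.println(token);
        }
    }
}
```

Because whitespace is itself a token, a parenthetical that follows a space, such as "call (now)", does not match the pattern and is left split. If the word boundaries themselves must change, a rule-based iterator such as the one in ICU4J may be worth investigating, but that is outside what this sketch covers.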
