I am trying to use XRegExp to test if a string is a valid word according to these criteria:
- The string begins with one or more Unicode letters, followed by
- an apostrophe (
'
) followed by one or more Unicode letters, repeated 0 or more times. - The string ends immediately after the matched pattern.
That is, it will match these terms
Hello can't Alah'u'u'v'oo O'reilly
but not these
eatin' 'sup 'til
I am trying this pattern,
^(\\p{L})+('(\\p{L})+)*$
but it won't match any words that contain apostrophes. What am I doing wrong?
EDIT: The code using the regex
var separateWords = function(text) {
var word = XRegExp("(\\p{L})+('(\\p{L})+)*$");
var splits = [];
for (var i = 0; i < text.length; i++) {
var item = text[i];
while (i + 1 < text.length && word.test(item + text[i + 1])) {
item += text[i + 1];
i++;
}
splits.push(item);
}
return splits;
};
Try this regex:
First it checks to ensure the first character is not an apostrophe. Then it either gets any number of word characters or apostrophes followed by anything other than an apostrophe, or it gets nothing (one-letter word).