XRegExp: negate expression


What is the negated form of the following XRegExp expression?

[\\p{Alphabetic}\\p{Nd}\\{Pc}\\p{M}]+

I used matchChain() with the above expression to extract the words from a sentence.

Now I want to use split() with the negated expression to get the same words, but with the delimiters between them included in the result.


There are 2 best solutions below

1
On

To negate \p{…}, use \P{…}. For example, the inverse of \p{L} is \P{L}.

Assuming you made a typo in your original regular expression, and \\{Pc} should be \\p{Pc}, this becomes:

[\\p{Alphabetic}\\p{Nd}\\p{Pc}\\p{M}]+

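Uppercasing each \\p{…} into \\P{…} inside the class does not negate it. A character class is the union of its members, and no character belongs to all four categories at once, so this matches every character:

[\\P{Alphabetic}\\P{Nd}\\P{Pc}\\P{M}]+

Instead, negate the whole class with ^:

[^\\p{Alphabetic}\\p{Nd}\\p{Pc}\\p{M}]+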
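To also keep the delimiters when splitting, one option is to wrap the negated class in a capturing group, since split() includes captured text in its result. The sketch below is an illustration under assumptions: it uses native JavaScript Unicode property escapes (the u flag, ES2018) in place of XRegExp, and the sample sentence and variable names are made up. XRegExp with its Unicode addons should accept the same class, but that is not verified here.

```javascript
// Assumption: native RegExp with the "u" flag stands in for XRegExp.
// Word characters: alphabetic, decimal digits, connector punctuation, marks.
const word = /[\p{Alphabetic}\p{Nd}\p{Pc}\p{M}]+/gu;

// Negated class, wrapped in a capturing group so that split() keeps
// the delimiters in the returned array.
const nonWord = /([^\p{Alphabetic}\p{Nd}\p{Pc}\p{M}]+)/u;

// Hypothetical sample input.
const sentence = "Hello, w\u00f6rld_1 -- ok!";

// Words only, as with the original (non-negated) expression.
const words = sentence.match(word);

// Words and delimiters in order; drop the empty string that split()
// produces when the input ends with a delimiter.
const parts = sentence.split(nonWord).filter((s) => s !== "");

console.log(words);
console.log(parts);
```

Joining parts back together reproduces the original sentence, which is a quick way to check that nothing was lost in the split.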
0
On

To negate a character class, add a ^ at the beginning of the class. In your example:

[^\\p{Alphabetic}\\p{Nd}\\p{Pc}\\p{M}]+

Note that \p{…} can be negated as \P{…} but it has some pitfalls:

[^\\p{Nd}] is the same as [\\P{Nd}]

but

[\\P{Nd}\\P{Pc}] // wrong

will match any character, because a decimal digit (Nd) is never connector punctuation (Pc) and vice versa, so every character satisfies at least one of the two negations.
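The difference can be checked on a few single characters. This is a minimal sketch assuming native JavaScript Unicode property escapes (u flag) rather than XRegExp. The sample characters and the names unionOfNegations and negatedClass are chosen for illustration only.

```javascript
// Assumption: native RegExp with the "u" flag stands in for XRegExp.
// Union of two negations: "not a digit" OR "not connector punctuation".
const unionOfNegations = /^[\P{Nd}\P{Pc}]$/u;

// Properly negated class: neither a digit nor connector punctuation.
const negatedClass = /^[^\p{Nd}\p{Pc}]$/u;

// Hypothetical samples: a digit, an underscore (Pc), a letter, a space.
const samples = ["7", "_", "a", " "];

const results = samples.map((ch) => ({
  ch,
  union: unionOfNegations.test(ch),
  negated: negatedClass.test(ch),
}));

console.log(results);
```

The union column is true for every sample, including "7" and "_", while the negated column is true only for "a" and the space.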