What is a non-word boundary in regex (\B), compared to a word boundary?
19.3k Views Asked by DarkLightA At
There are 2 best solutions below
The basic purpose of a non-word boundary is to create a regex that says:
  if we are at the beginning/end of a word char (\w = [a-zA-Z0-9_]), make sure the previous/next character is also a word char, e.g.: "a\B." ~ "a\w": "ab", "a4", "a_", ... but not "a ", "a."
  if we are at the beginning/end of a non-word char (\W = [^a-zA-Z0-9_]), make sure the previous/next character is also a non-word char, e.g.: "-\B." ~ "-\W": "-.", "- ", "--", ... but not "-a", "-1"
For a word boundary it's similar, but instead of making sure that the adjacent characters are of the same class (word char/non-word char), they need to differ, hence the name word boundary.
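The examples above can be checked directly. The following is a minimal sketch in JavaScript; the `cases` table and the `results` shape are illustrative names, not part of the original answer, and the third row is an added \b counterpart showing the "classes must differ" behaviour.

```javascript
// Each row: [regex, inputs expected to match, inputs expected not to match].
// The first two rows come from the answer; the third is an added \b contrast.
const cases = [
  [/a\B./, ["ab", "a4", "a_"], ["a ", "a."]],
  [/-\B./, ["-.", "- ", "--"], ["-a", "-1"]],
  [/a\b./, ["a ", "a."], ["ab", "a4", "a_"]],
];

// For every row, keep the inputs that behaved as expected.
const results = cases.map(([re, shouldMatch, shouldNotMatch]) => ({
  pattern: re.source,
  matched: shouldMatch.filter((s) => re.test(s)),
  rejected: shouldNotMatch.filter((s) => !re.test(s)),
}));

console.log(results);
```

If every expectation holds, `matched` and `rejected` in each row contain all of the inputs listed for that row.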
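A word boundary (\b) is a zero-width match that can match:
  between a word character (\w) and a non-word character (\W), or
  between a word character and the start or end of the string.
In JavaScript the definition of \w is [A-Za-z0-9_] and \W is anything else.
The negated version of \b, written \B, is a zero-width match where the above does not hold. Therefore it can match:
  between two word characters,
  between two non-word characters,
  between a non-word character and the start or end of the string, or
  in the empty string.
For example, if the string is "Hello, world!" then \b matches in the following places: before "H", between "o" and ",", between the space and "w", and between "d" and "!".
And \B matches those places where \b doesn't match: between each pair of adjacent letters inside "Hello" and inside "world", between "," and the space, and after "!" at the end of the string.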
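To see these positions for yourself, here is a small sketch that lists every index at which \b or \B matches. The helper name `positions` is hypothetical; it relies on the standard `String.prototype.matchAll`, which steps past zero-width matches on its own.

```javascript
// Collect every index in `text` at which the zero-width assertion holds.
// `assertion` is a regex source string such as "\\b" or "\\B".
function positions(assertion, text) {
  const re = new RegExp(assertion, "g"); // matchAll requires the g flag
  return Array.from(text.matchAll(re), (m) => m.index);
}

const text = "Hello, world!";
const wordBoundaries = positions("\\b", text);
const nonWordBoundaries = positions("\\B", text);

console.log(wordBoundaries);    // [ 0, 5, 7, 12 ]
console.log(nonWordBoundaries); // [ 1, 2, 3, 4, 6, 8, 9, 10, 11, 13 ]
```

An index i here means the gap just before the character at i, so 0 is the start of the string and 13 is the end. Together the two lists cover every position from 0 to 13 exactly once, which is what "negated version" means in practice.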