I'm new to Jison and I want to tokenize the Bangla digits ০-৯ as numbers. I've tried the regular expression below, but it's not working: (^[\u09E6-\u09EF])+("."[\u09E6-\u09EF])\b
When testing ৭+১, it shows: expected... 'NUMBER' GOT 'Invalid'. Expected result: NUMBER '+' NUMBER
Please help me out!! ❤️
Good question.
The problem is the \b word boundary assertion. For some reason, the JavaScript regular expression specification does not consider Bangla digits to be word characters. For \w and \b, only ASCII letters, ASCII digits, and the underscore count as word characters. Consequently, the position between a Bangla digit and a plus sign (which is certainly not a word character) is not considered a word boundary, and thus doesn't match the assertion.
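To see the effect outside of Jison, here is a small check in plain JavaScript. The patterns are simplified versions of the one in the question (integer part only), and the variable names are just for illustration:

```javascript
// Bangla digits ০-৯ occupy U+09E6 through U+09EF.
const withBoundary = /^[\u09E6-\u09EF]+\b/;
const withoutBoundary = /^[\u09E6-\u09EF]+/;

const input = "৭+১";

// \b needs a word character on exactly one side of the position.
// Neither "৭" nor "+" is a word character, so the assertion fails.
const failsWithBoundary = withBoundary.test(input); // false

// Without \b the leading digit matches as expected.
const matchesWithoutBoundary = withoutBoundary.test(input); // true

// The same pattern shape with ASCII digits does find a boundary before "+".
const asciiMatches = /^[0-9]+\b/.test("7+1"); // true

console.log(failsWithBoundary, matchesWithoutBoundary, asciiMatches);
```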
If you just drop the \b, it should work (although I would also drop the ^: Jison patterns are always anchored, so there's no need to insist).
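As a rough sketch of the result, the following plain JavaScript imitates a lexer with the \b and ^ removed. It is not Jison's actual implementation: the tokenize function and the rules table are hypothetical, the sticky flag stands in for Jison's anchoring, and I've assumed the fractional part is meant to be optional and may have several digits.

```javascript
// Hypothetical stand-in for the lexer rules; not Jison's real machinery.
const rules = [
  // Assumed NUMBER pattern: no ^ and no \b; the fraction is optional here.
  { type: "NUMBER", pattern: /[\u09E6-\u09EF]+(?:\.[\u09E6-\u09EF]+)?/y },
  { type: "+", pattern: /\+/y },
];

function tokenize(input) {
  const tokens = [];
  let pos = 0;
  while (pos < input.length) {
    let matched = false;
    for (const { type, pattern } of rules) {
      pattern.lastIndex = pos; // sticky flag: the match must start exactly at pos
      const m = pattern.exec(input);
      if (m) {
        tokens.push(type);
        pos += m[0].length;
        matched = true;
        break;
      }
    }
    if (!matched) {
      // No rule applies at this position; report it and skip one character.
      tokens.push("INVALID");
      pos += 1;
    }
  }
  return tokens;
}

console.log(tokenize("৭+১"));
```

With these assumed rules, ৭+১ comes out as NUMBER, '+', NUMBER, which is the result the question was after.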