I'm writing a parser for formulas and ran into unexpected behavior in the IDENT rule (see the grammar below). In theory, it should match one or more characters, each of which is an uppercase or lowercase letter, a digit, or '_'. In practice, a single lowercase letter or a single '_' is matched, but a single uppercase letter or an all-digit string is not.
fragment LOWER: 'a'..'z';
fragment UPPER: 'A'..'Z';
fragment LETTER: (UPPER | LOWER)+;
fragment DIGIT : '0'..'9';
IDENT : (LETTER | DIGIT | '_')+ ;
table
: IDENT '['! '@'? IDENT ']'!
;
I'm using the ANTLRWorks interpreter for testing. For the failing tests below, the tree is built up to the table name and '[', and then a NoViableAltException node appears.
Here are some tests:
Table[a] - works
Table[A] - NoViableAltException
Table[1] - NoViableAltException
Table[_] - works
Table[1a] - works
Table[a1] - works
Table[1A] - works
Table[A1] - NoViableAltException
Table[AA] - NoViableAltException
Table[AAA] - works
Table[11] - NoViableAltException
Table[1111] - NoViableAltException
Table[111222] - NoViableAltException
Table[111222a] - works
Table[111222A] - works
Table[A111222] - NoViableAltException
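To make the expectation concrete, here is a minimal Java sketch of what I think the rules should accept. It is not part of my grammar and does not use ANTLR: the regular expressions are only my assumed equivalents of IDENT and table, and the class and method names are made up for illustration. By this reading, every test input above should be accepted.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not generated by ANTLR: it only restates the intended rules as regexes.
class IdentExpectation {
    // Assumed equivalent of IDENT: one or more letters, digits, or '_'.
    private static final String IDENT_RE = "[A-Za-z0-9_]+";
    private static final Pattern IDENT = Pattern.compile(IDENT_RE);
    // Assumed equivalent of table: IDENT '[' '@'? IDENT ']'
    private static final Pattern TABLE =
            Pattern.compile(IDENT_RE + "\\[@?" + IDENT_RE + "\\]");

    static boolean isIdent(String s) {
        return IDENT.matcher(s).matches();
    }

    static boolean isTable(String s) {
        return TABLE.matcher(s).matches();
    }

    public static void main(String[] args) {
        // The same inputs as in the test list above.
        String[] inputs = {
            "Table[a]", "Table[A]", "Table[1]", "Table[_]",
            "Table[1a]", "Table[a1]", "Table[1A]", "Table[A1]",
            "Table[AA]", "Table[AAA]", "Table[11]", "Table[1111]",
            "Table[111222]", "Table[111222a]", "Table[111222A]", "Table[A111222]"
        };
        for (String input : inputs) {
            System.out.println(input + " -> " + isTable(input));
        }
    }
}
```

If the regex reading is right, then the inputs marked NoViableAltException above are the ones where the interpreter disagrees with what I intended.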
I'm new to ANTLR, and I can't work out why inputs that look so similar are handled differently.