Antlr4 parser ignoring lexer rules and generating implicit tokens


As a simple example, let's say I have the following.

parse_rule0 : parse_rule1 ',' parse_rule2 ';' ;

parse_rule1 : ... ;

parse_rule2: ... ;

PUNCTUATION : ',' | '.'| ';' | ':' ;

When ANTLR4 (specifically the ANTLR4 VS Code extension) generates tokens, it ignores my PUNCTUATION rule (just an example) and creates an implicit token instead, T__1 for example. I can't find any resources online on matching one specific token out of a more general lexer rule. It seems senseless to create a lexer rule for every possible literal you might want to match while parsing.

I've searched high and low for a solution to this problem. If I want to match a specific character that is already covered by a lexer rule, how do I stop ANTLR from generating an implicit token and make it use my lexer rules instead?

Best answer, by Ivan Kochurkin

Explicit use of token

parse_rule0 : parse_rule1 PUNCTUATION parse_rule2 PUNCTUATION ;

parse_rule1 : ... ;

parse_rule2: ... ;

PUNCTUATION : ',' | '.'| ';' | ':' ;

Lexical rule for every literal

parse_rule0 : parse_rule1 ',' parse_rule2 ';' ;

parse_rule1 : ... ;

parse_rule2: ... ;

COMMA: ',';
DOT: '.';
SEMI: ';';
COLON: ':';

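Unfortunately, ANTLR can only bind a literal used in a parser rule to a lexer rule that consists of exactly that literal. It cannot bind the literal to a more complex lexer rule, even one that contains nothing but alternatives, as PUNCTUATION does. So the first option matches any punctuation character in both positions, while the second matches ',' and ';' specifically.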
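
If you still want a general "any punctuation" category alongside the specific characters, one common workaround is to keep one lexer rule per literal and do the grouping in a parser rule rather than a lexer rule. This is a minimal sketch, assuming a combined grammar. The rule name punctuation is hypothetical, and parse_rule1 and parse_rule2 stay as in the question.

```antlr
// Specific tokens are matched where the exact character matters.
parse_rule0 : parse_rule1 COMMA parse_rule2 SEMI ;

// Hypothetical parser rule: accepts any single punctuation token.
// Use it wherever "any punctuation" is acceptable.
punctuation : COMMA | DOT | SEMI | COLON ;

// One lexer rule per literal, so no implicit tokens are generated.
COMMA : ',' ;
DOT   : '.' ;
SEMI  : ';' ;
COLON : ':' ;
```

Because each literal now has its own lexer rule, writing ',' or ';' directly in a parser rule also refers to COMMA or SEMI, and ANTLR does not create an implicit token for it.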