Antlr mismatched '>' for include macro


I started working with ANTLR a few days ago. I'd like to use it to parse #include directives in C. Only the includes are of interest to me; everything else is irrelevant. Here is the simple grammar file I wrote:

... parser part omitted...

INCLUDE : '#include';
INCLUDE_FILE_QUOTE:  '"'FILE_NAME'"';
INCLUDE_FILE_ANGLE:  '<'FILE_NAME'>';

fragment
FILE_NAME: ('a'..'z'|'A'..'Z'|'0'..'9'|'_'|'.'|' ')+;

MACROS: '#'('if' | 'ifdef' | 'define' | 'endif' | 'undef' | 'elif' | 'else' );
//MACROS: '#'('a'..'z'|'A'..'Z')+;

OPERATORS: ('+'|'-'|'*'|'/'|'='|'=='|'!='|'>'|'>='|'<'|'<='|'>>'|'<<'|'<<<'|'|'|'&'|','|';'|'.'|'->'|'#');

... other supporting tokens like ID, WS and COMMENT ...

This grammar produces an ambiguity when a statement like this is encountered:

(;i<listLength;i++)

output: mismatched character ';' expecting '>'

It seems the lexer starts matching INCLUDE_FILE_ANGLE at the '<' and then fails at the ';', instead of treating '<' and ';' as OPERATORS.

I've heard there's an operator called a syntactic predicate, but I'm not sure how to use it properly in this case.

How can I solve this problem in the way ANTLR encourages?

Answer:

It looks like there isn't much ANTLR activity here.

Anyway, I figured this out.

INCLUDE_MACRO: ('#include')=>'#include';
VERSION_MACRO: ('#version')=>'#version';
OTHER_MACRO:
     (
      ('#if')=>'#if'
     |('#ifndef')=>'#ifndef'
     |('#ifdef')=>'#ifdef'
     |('#else')=>'#else'
     |('#elif')=>'#elif'
     |('#endif')=>'#endif'
     );

This only solves the first half of the problem. The second half is that INCLUDE_FILE_ANGLE cannot be used to match the file name in the #include directive. The '<' FILE_NAME '>' pattern creates an ambiguity with the '<' and '>' operators, so it must either be broken down into basic lexer tokens or be guarded by more advanced context-aware checks. I'm not familiar with the latter technique, so I wrote this in the parser rule:

include_statement : 
    INCLUDE_MACRO include_file
    -> ^(INCLUDE_MACRO include_file);

include_file 
    : STRING
    | LEFT_ANGLE(INT|ID|OPERATORS)+RIGHT_ANGLE
    ;

This works, but it admittedly looks ugly. I hope experienced users can comment with a much better solution.
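One alternative that might avoid splitting the file name into LEFT_ANGLE ... RIGHT_ANGLE pieces is to lex the whole directive as a single token, so that the '<' FILE_NAME '>' pattern is only ever attempted after '#include' has already been matched. The following is an untested sketch for ANTLR 3, not a verified fix. The rule name INCLUDE_DIRECTIVE is made up for illustration, it would replace INCLUDE_MACRO and the INCLUDE_FILE_* rules rather than sit beside them, and FILE_NAME is copied from the question unchanged (so it still does not accept '/' in paths).

```antlr
// Untested sketch, assuming ANTLR 3. INCLUDE_DIRECTIVE is a hypothetical name.
// The whole directive is one token, so '<' ... '>' is only tried
// once '#include' has been seen; a bare '<' elsewhere stays an OPERATORS token.
INCLUDE_DIRECTIVE
    : '#include' (' ' | '\t')*
      ( '<' FILE_NAME '>'
      | '"' FILE_NAME '"'
      )
    ;

// Same fragment as in the question.
fragment
FILE_NAME
    : ('a'..'z' | 'A'..'Z' | '0'..'9' | '_' | '.' | ' ')+
    ;

// Parser side: the directive now arrives as a single token.
include_statement
    : INCLUDE_DIRECTIVE
    ;
```

With this layout the file name is no longer a separate token, so it would have to be extracted from the token text afterwards. An #include whose argument is neither a quoted nor an angle-bracketed name (for example a macro name) would not match this rule.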