How can I make the lexer backtrack based on a predicate in ANTLR 3 (or 4)?


I have the following (simplified) problem in ANTLR 3. I have a set of lexer rules for a grammar with special strings and regular strings. Both are single-quoted. Special strings conform to a pattern (for example, suppose they only contain letters). There is also a function, isSpecial, that determines whether a matched string is special. Whitespace is ignored.

Now suppose for example that isSpecial returns true only for string "foo". If I'm looking at "'foo' 'bar' '123'", the lexer should produce 1 special string token for foo, and then 2 regular string tokens.

I am trying to do something like this:

SpecialLiteral : { isSpecial(getText()) }?=> '\'' (Letter)+ '\'' ;
StringLiteral : '\'' ( ~('\'') )* '\'' ;

This doesn't work because getText() is not yet populated when the gating predicate runs, so an empty string is passed to isSpecial and "'foo'" becomes a StringLiteral.

If I change the gating predicate to a validating predicate, isSpecial receives the correct getText(), but when it returns false (for "'bar'", for example), the lexer doesn't backtrack and make "'bar'" a StringLiteral; it just fails.

Is it possible to solve this problem? If I were to edit the generated lexer manually, it seems like it should be doable: we know the position before mSpecialLiteral, so we could go back there and then proceed as we would after a failed gating predicate. How can I do this from the grammar, and what should be passed to isSpecial?

There is 1 best solution below


You could match both kinds of string in the same rule, check which kind it is after the whole literal has matched, and change the token type to SpecialLiteral when isSpecial(getText()) is true:

grammar T;

...

tokens {
  SpecialLiteral;
}

...

StringLiteral
 : '\'' ( ~( '\'' ) )* '\''
   {
     if(isSpecial(getText())) {
       $type = SpecialLiteral;
     }
   }
 ;
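
Since the question also mentions ANTLR 4, here is a rough sketch of how the same "match first, retype afterwards" idea might look there. It is untested against the asker's real grammar. The isSpecial body, the file parser rule, and the WS rule are placeholders invented for illustration; only the StringLiteral rule mirrors the answer above. Note that getText() returns the matched text including the surrounding quotes, so this placeholder strips them before comparing.

```antlr
grammar T;

// Declares the extra token type; no lexer rule matches it directly.
tokens { SpecialLiteral }

@lexer::members {
  // Hypothetical stand-in for the asker's isSpecial function.
  // The argument still carries the surrounding single quotes.
  private boolean isSpecial(String quoted) {
    String body = quoted.substring(1, quoted.length() - 1);
    return body.equals("foo");
  }
}

// Placeholder parser rule so the combined grammar has an entry point.
file : (SpecialLiteral | StringLiteral)* EOF ;

StringLiteral
 : '\'' ~'\''* '\''
   {
     // Runs after the whole literal has matched, so getText() is populated.
     if (isSpecial(getText())) {
       setType(SpecialLiteral);
     }
   }
 ;

// Placeholder whitespace rule; the question says whitespace is ignored.
WS : [ \t\r\n]+ -> skip ;
```

With this placeholder isSpecial, the input 'foo' 'bar' '123' should come out as one SpecialLiteral followed by two StringLiteral tokens, which is the behaviour the question asks for. The design point is the same as in the ANTLR 3 version: no backtracking is needed, because the decision is made after the match instead of before it.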