I'm trying to figure out how to remove the indirect left recursion from the logical keyword expressions within my Rust port of a Ruby parser (https://github.com/kenaniah/ruby-parser/blob/master/ruby-parser/src/parsers/expression/logical.rs). The grammar looks like:
E --> N | A | O | t
N --> n E
A --> E a E
O --> E o E
E = expression
A = keyword_and_expression
O = keyword_or_expression
N = keyword_not_expression
How would I go about transforming this to remove the recursion in A
and O
?
I think the problem here is not the indirect recursion but rather the ambiguity.
If it were just indirect recursion, you could simply substitute the right-hand sides of
N
,A
andO
, eliminating the indirect recursion:In order to get rid of the direct left recursion, you can left-factor:
and then remove the remaining left-recursion:
That has no left-recursion, direct or indirect, and every rule starts with a unique terminal. However, it's not LL(1), because there's a first/follow conflict.
That's really not surprising, because the original grammar was highly ambiguous, and left-recursion elimination doesn't eliminate ambiguity. The original grammar really only makes sense if accompanied by an operator precedence table.
That table indicates that
AND
andOR
are left-associative operators with the same precedence (a slightly unusual decision), whileNOT
is a unary operator with higher precedence. That, in turn, means that the BNF should have been written something like this:The only difference from the grammar in the OP is the removal of ambiguity, as indicated by the precedence table.
Again, the first step is to substitute non-terminals
A
andO
in order to make the left-recursion direct:This is essentially the same grammar as the grammar for arithmetic expressions (ignoring multiplication, since there's only one precedence level), and the left-recursion can be eliminated directly: