I would like to design an LL1 grammar for arithmetic equations and variable assignments. I began with this grammar:
I have a nonambiguous grammar for arithmetic expressions:
E → T E’
E’ → | + E
T → id T’
T’ → | * T
However, I'm not sure how to incorporate variable assignments into the E productions.
How I tried previously was:
stmt -> assignment SEMI | RETURN stmt_prime
| LBRACE stmt_list RBRACE
| IF LPAREN assignment RPAREN stmt stmt_prime_prime
| FOR LPAREN assignment SEMI assignment SEMI assignment RPAREN stmt |
stmt_prime -> SEMI | -> assignment SEMI
stmt_prime_prime -> NOELSE
| ELSE stmt
assignment -> ID typ ASSIGN expr | expr
expr -> TE*
E* -> + TE* | -TE* | epsilon
T -> FT*
T* -> * FT* | / FT* | epsilon
F -> (E) | int_literal | TRUE | FALSE
(I'm ignoring the
typ
part because I assume it got there by accident)The problem here is that both
ID ASSIGN expr
andexpr
could start with anID
(or at least they could ifT
containedID
as an option, which I assume was the intention), so this isn't LL(1). It is LL(2) though, so if you're fine with that, you can just add anandalso next_token = ASSIGN
to theif
condition in your parser and be done with it.If you do need it to be LL(1), you'll have to adjust the language allowed by your parser, I'm afraid (that is, there is no LL(1) grammar that matches exactly the same language as your current grammar). One easy way out would be to simply add a keyword like
SET
before assignment, though admittedly that's ugly.The other alternative would be to allow arbitrary expressions as the left operand of
=
, making your assignment rule:Which is LL(1). The downside of this is that it allows a lot of non-sensical code such as
1+2 := 42
, but you can fix that outside of the grammar. That is, your code for parsing assignments can simply callparse_exp
and then, if the next token is anASSIGN
and the expression returned byparse_exp
is not just an identifier, raise an error that the left-side of an assignment must be an identifier.