I have created a grammar, a stripped-down version of which is reproduced below:
(0) exp1: ternary;
(1) exp1: exp2;
(2) ternary: exp2 "?" exp1 ":" exp1;
(3) exp2: exp2 "+" exp3;
(4) exp2: exp3;
(5) exp3: maybe;
(6) exp3: "1";
(7) maybe: exp3 "?";
I believe this language is unambiguous, and should be LR-parsable. (Please let me know if I'm wrong!)
However, when I attempt to generate an LR(1) parser for this grammar, I get shift/reduce conflicts, because when the parser sees exp3
with lookahead ?
, it doesn't know whether to shift or reduce:
Conflicts in state 3:
Reduction using rule 4: exp2: exp3 · | "?"
Shift to state 6
Conflicts in state 9:
Reduction using rule 3: exp2: exp2 "+" exp3 · | "?"
Shift to state 6
Conflicts in state 13:
Reduction using rule 4: exp2: exp3 · | "?"
Shift to state 16
Conflicts in state 20:
Reduction using rule 4: exp2: exp3 · | "?"
Shift to state 23
Conflicts in state 25:
Reduction using rule 3: exp2: exp2 "+" exp3 · | "?"
Shift to state 23
Conflicts in state 28:
Reduction using rule 3: exp2: exp2 "+" exp3 · | "?"
Shift to state 16
Is there a reasonable way for me to make this language LR(1)-parsable (with no conflicts)?
Or are GLR (or maybe LR(2)?) my only realistic options for a language like this?
(Or am I even wrong in believing that the language is unambiguous in the first place?)
For reference, the ambiguous state machine I generated is the following (where ♦ is EOF):
State 0:
exp1: · ternary | {♦} → shift 1
ternary: · exp2 "?" exp1 ":" exp1 | {♦} → shift 2
exp2: · exp2 "+" exp3 | {"?", "+"} → shift 2
exp2: · exp3 | {"?", "+"} → shift 3
exp3: · maybe | {"?", "+"} → shift 4
exp3: · "1" | {"?", "+"} → shift 5
maybe: · exp3 "?" | {"?", "+"} → shift 3
State 1:
exp1: ternary · | {♦} → reduce 0
State 2:
ternary: exp2 · "?" exp1 ":" exp1 | {♦} → shift 7
exp2: exp2 · "+" exp3 | {"?", "+"} → shift 8
State 3:
exp2: exp3 · | {"+"} → reduce 4
exp2: exp3 · | {"?"} → reduce 4 shift 6
maybe: exp3 · "?" | {"?", "+"} → reduce 4 shift 6
State 4:
exp3: maybe · | {"?", "+"} → reduce 5
State 5:
exp3: "1" · | {"?", "+"} → reduce 6
State 6:
maybe: exp3 "?" · | {"?", "+"} → reduce 7
State 7:
exp1: · ternary | {":"} → shift 10
exp1: · exp2 | {":"} → shift 11
ternary: · exp2 "?" exp1 ":" exp1 | {":"} → shift 11
ternary: exp2 "?" · exp1 ":" exp1 | {♦} → shift 12
exp2: · exp2 "+" exp3 | {"?", ":", "+"} → shift 11
exp2: · exp3 | {"?", ":", "+"} → shift 13
exp3: · maybe | {"?", ":", "+"} → shift 14
exp3: · "1" | {"?", ":", "+"} → shift 15
maybe: · exp3 "?" | {"?", ":", "+"} → shift 13
State 8:
exp2: exp2 "+" · exp3 | {"?", "+"} → shift 9
exp3: · maybe | {"?", "+"} → shift 4
exp3: · "1" | {"?", "+"} → shift 5
maybe: · exp3 "?" | {"?", "+"} → shift 9
State 9:
exp2: exp2 "+" exp3 · | {"+"} → reduce 3
exp2: exp2 "+" exp3 · | {"?"} → reduce 3 shift 6
maybe: exp3 · "?" | {"?", "+"} → reduce 3 shift 6
State 10:
exp1: ternary · | {":"} → reduce 0
State 11:
exp1: exp2 · | {":"} → reduce 1
ternary: exp2 · "?" exp1 ":" exp1 | {":"} → shift 26
exp2: exp2 · "+" exp3 | {"?", ":", "+"} → shift 27
State 12:
ternary: exp2 "?" exp1 · ":" exp1 | {♦} → shift 17
State 13:
exp2: exp3 · | {":", "+"} → reduce 4
exp2: exp3 · | {"?"} → reduce 4 shift 16
maybe: exp3 · "?" | {"?", ":", "+"} → reduce 4 shift 16
State 14:
exp3: maybe · | {"?", ":", "+"} → reduce 5
State 15:
exp3: "1" · | {"?", ":", "+"} → reduce 6
State 16:
maybe: exp3 "?" · | {"?", ":", "+"} → reduce 7
State 17:
exp1: · ternary | {♦} → shift 1
exp1: · exp2 | {♦} → shift 18
ternary: · exp2 "?" exp1 ":" exp1 | {♦} → shift 18
ternary: exp2 "?" exp1 ":" · exp1 | {♦} → shift 19
exp2: · exp2 "+" exp3 | {♦, "?", "+"} → shift 18
exp2: · exp3 | {♦, "?", "+"} → shift 20
exp3: · maybe | {♦, "?", "+"} → shift 21
exp3: · "1" | {♦, "?", "+"} → shift 22
maybe: · exp3 "?" | {♦, "?", "+"} → shift 20
State 18:
exp1: exp2 · | {♦} → reduce 1
ternary: exp2 · "?" exp1 ":" exp1 | {♦} → shift 7
exp2: exp2 · "+" exp3 | {♦, "?", "+"} → shift 24
State 19:
ternary: exp2 "?" exp1 ":" exp1 · | {♦} → reduce 2
State 20:
exp2: exp3 · | {♦, "+"} → reduce 4
exp2: exp3 · | {"?"} → reduce 4 shift 23
maybe: exp3 · "?" | {♦, "?", "+"} → reduce 4 shift 23
State 21:
exp3: maybe · | {♦, "?", "+"} → reduce 5
State 22:
exp3: "1" · | {♦, "?", "+"} → reduce 6
State 23:
maybe: exp3 "?" · | {♦, "?", "+"} → reduce 7
State 24:
exp2: exp2 "+" · exp3 | {♦, "?", "+"} → shift 25
exp3: · maybe | {♦, "?", "+"} → shift 21
exp3: · "1" | {♦, "?", "+"} → shift 22
maybe: · exp3 "?" | {♦, "?", "+"} → shift 25
State 25:
exp2: exp2 "+" exp3 · | {♦, "+"} → reduce 3
exp2: exp2 "+" exp3 · | {"?"} → reduce 3 shift 23
maybe: exp3 · "?" | {♦, "?", "+"} → reduce 3 shift 23
State 26:
exp1: · ternary | {":"} → shift 10
exp1: · exp2 | {":"} → shift 11
ternary: · exp2 "?" exp1 ":" exp1 | {":"} → shift 11
ternary: exp2 "?" · exp1 ":" exp1 | {":"} → shift 29
exp2: · exp2 "+" exp3 | {"?", ":", "+"} → shift 11
exp2: · exp3 | {"?", ":", "+"} → shift 13
exp3: · maybe | {"?", ":", "+"} → shift 14
exp3: · "1" | {"?", ":", "+"} → shift 15
maybe: · exp3 "?" | {"?", ":", "+"} → shift 13
State 27:
exp2: exp2 "+" · exp3 | {"?", ":", "+"} → shift 28
exp3: · maybe | {"?", ":", "+"} → shift 14
exp3: · "1" | {"?", ":", "+"} → shift 15
maybe: · exp3 "?" | {"?", ":", "+"} → shift 28
State 28:
exp2: exp2 "+" exp3 · | {":", "+"} → reduce 3
exp2: exp2 "+" exp3 · | {"?"} → reduce 3 shift 16
maybe: exp3 · "?" | {"?", ":", "+"} → reduce 3 shift 16
State 29:
ternary: exp2 "?" exp1 · ":" exp1 | {":"} → shift 30
State 30:
exp1: · ternary | {":"} → shift 10
exp1: · exp2 | {":"} → shift 11
ternary: · exp2 "?" exp1 ":" exp1 | {":"} → shift 11
ternary: exp2 "?" exp1 ":" · exp1 | {":"} → shift 31
exp2: · exp2 "+" exp3 | {"?", ":", "+"} → shift 11
exp2: · exp3 | {"?", ":", "+"} → shift 13
exp3: · maybe | {"?", ":", "+"} → shift 14
exp3: · "1" | {"?", ":", "+"} → shift 15
maybe: · exp3 "?" | {"?", ":", "+"} → shift 13
State 31:
ternary: exp2 "?" exp1 ":" exp1 · | {":"} → reduce 2
I think this might be a precedence issue. The conflicts you're getting occur when the parser is looking at something like this:
At the time that the parser has seen
a + b ?
and is looking atc
, it can't decide whether it needs toReduce
b?
, so that it will parse an expression of the forma + (b?)
and then continue from there, orReduce
a + b
, so that it will parse an expression of the form(a + b) ? c : d
I think the challenge here is that in one case,
?
has very low precedence (when used as a ternary operator), and in another case it has very high precedence (when used as a unary operator). However, if you did assign the precedences this way, I think that the parser might be able to disambiguate these cases.Hope this helps!