I'm trying to wrap my head around parser theory, and I keep finding the same example in different sources. The grammar goes approximately like this (simplified):
E = T
E = E + T
T = 0..9
So supposedly a string 2 + 2
will be parsed as such ("|" separates the stack from the reminder)
|2 + 2 <-can't reduce, shift
2|+ 2 <-reduce by T = 0..9
T|+ 2 <-reduce by E = T
E|+ 2 <-can't reduce, shift
E +|2 <-can't reduce, shift
E + 2| <-reduce by T = 0..9
E + T| <-reduction by E = E + T here?
E| <-done
The question is, at E + T
step parser can apply two different reductions to the rightmost part of the stack: E = T
(resulting in E + E
) and E = E + T
(resulting in E
). And I can't find a clear and conscise explanation how it chooses one over the other.
What am I missing?
What are the possible states?
So we start in state 0. Shift a
2
. We are now in state 1. Transition to state 2. Shift a+
. We are now in state 3. We shift a2
. We are in state 4. We reduce to state 5. We reach the end of the stack and wind up with an expression tree looking like the following: