LR(1) table construction confusion

244 Views Asked by At

I'm confused on how to parse this grammar using LR(1):

S -> A
A -> A(A) | empty

I'm aware there is left recursion but I was told it isn't necessary to remove it for LR(1). My item sets look like this: (the comma separates the grammar from the lookaheads)

item set 0:
s -> .A, $
A -> .A(A), $(
A -> ., $(

item set 1:
S -> A., $
A -> A.(A), $(

Now here is where I get confused:

item set 2:

A -> A(.A), $(  
A -> .A(A), )(
A -> ., )(

item set 3:
A -> A(A.), $(
A -> A. (A), )(

item set 4:
A -> A(A)., $(

I was told to parse the string "(())". When I did this i realized state 4 needed to have a right parenthesis ")" as a lookahead, which makes sense. Now I traced this back and figured out that this right parenthesis should have originally came from the first statement in item set 2.

A -> A(.A), $()

This would cause the right parenthesis to be carried over to the next 2 states and my grammar would parse perfectly. However, what I'm confused about is WHY a right parenthesis is supposed to be here? Shouldn't the $( be carried over from item set 1. Where does this right parenthesis come from? Can somebody please explain. Thanks

1

There are 1 best solutions below

0
On

Let's start by supposing that you are attempting to build a (Canonical) LR(1) machine, and consider state (item set) 3:

item set 3:
A -> A(A.), $(
A -> A.(A), )(

There are two possible successors: GOTO(S3, ')') and GOTO(S3, '(')):

item set 4: GOTO(S3, ')')
A -> A(A)., $(

item set 5: GOTO(S3, '(')
A -> A(.A), )(
A -> .A(A), )(
A -> .    , )(

Note that item set 5 is not the same as your item set 2, because the lookahead for the first item differs.

This is precisely where the ) lookahead which is puzzling you comes from, as you can see when finish the LR(1) state machine by continuing to advance from state 5:

item set 6: GOTO(S5, A)
A -> A(A.), )(
A -> A.(A), )(

item set 7: GOTO(S6, ')')
A -> A(A)., )(

That's it, because GOTO(S6, '(') is item set 5.

Most likely, you're trying to construct (and parse with) an LALR(1) machine. In the LALR(1) machine, item sets with the same cores are merged. When we merge two item sets, the lookaheads for corresponding items are replaced with the union of the lookaheads. So we merge item sets 2 and 5:

  item set 2          item set 5          merged item set
A -> A(.A), $(      A -> A(.A), )(      A -> A(.A), $)(
A -> .A(A), )(      A -> .A(A), )(      A -> .A(A), )(
A -> .    , )(      A -> .    , )(      A -> .    , )(

Similarly, we merge item sets 3 and 6, and item sets 4 and 7

item set 3+6:
A -> A(A.), $)(
A -> A.(A), )(

item set 4+7:
A -> A(A)., $)(

(There's a much more efficient algorithm for constructing LALR(1) machines, which does not involve constructing the entire LR(1) machine and then merging states. But the result is exactly the same, and the merging algorithm is arguably easier to understand.)