Why can't a left-recursive, non-deterministic, or ambiguous grammar be LL(1)?

Question

3.3k Views Asked by DjaouadNM At 05 January 2019 at 13:14

I've learned from several sources that an LL(1) grammar is:

unambiguous,
not left-recursive,
and, deterministic (left-factorized).

What I can't fully understand is why the above is true for any LL(1) grammar. I know the LL(1) parsing table will have multiple entries at some cells, but what I really want to get is a formal and general (not with an example) proof of the following proposition(s):

A left-recursive (1), non-deterministic (2), or ambiguous (3) grammar is not LL(1).

intel_chris On 07 January 2019 at 13:09

The answer to these questions (and they are valid for LL(k) for any finite k) has to do with how the parsing stack works in an LL parser.

At the point where one is at the beginning of a non-terminal in a grammar, the parser must decide, by looking ahead only k tokens (1 in the LL(1) case), whether to push a specific rule onto the stack or to parse the text using other rules. So, let’s look at each of these cases and see how it impacts that decision.

Left-recursive. There are two left-recursive cases.

a. The left-recursion has no tokens in it after the recursion. A rule something like:

nonterm: nonterm;

Such a rule has no effect: no matter how many times you recurse, it doesn’t change what you are parsing.

b. The left-recursion has tokens in it after the recursion. A rule something like:

nonterm: nonterm “X”;

In this rule, you need to push nonterm rules onto the stack for as many Xs as follow the nonterm. You cannot determine how many Xs there are with only k tokens of lookahead. If you guess too small, you end up with Xs left over, and for any guess, there will be a case in the language with more than that many X tokens. If you guess too large, you end up with extra nonterm rules on the stack, and you don’t get to remove them. In either case, you are simply wrong.
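To make case (b) concrete, here is a minimal sketch of what a naive top-down parser does with the rule nonterm: nonterm "X"; when it never guesses at all and simply follows the rule. The function names are made up for illustration. Because the rule begins with nonterm itself, the function calls itself before it has consumed a single token, so the position never advances.

```python
def parse_nonterm(tokens, pos):
    """Naive recursive-descent function for the rule  nonterm: nonterm "X";"""
    # The right-hand side starts with nonterm, so we recurse first.
    # Nothing has been consumed yet, so pos is identical on every call.
    pos = parse_nonterm(tokens, pos)
    # This line is never reached: the call above never returns.
    if pos < len(tokens) and tokens[pos] == "X":
        return pos + 1
    raise SyntaxError("expected X")


def try_parse(tokens):
    """Report what happens instead of letting the interpreter crash."""
    try:
        parse_nonterm(tokens, 0)
        return "parsed"
    except RecursionError:
        return "infinite recursion"


print(try_parse(["X", "X", "X"]))  # infinite recursion
```

A table-driven LL parser has the same problem in a different shape: it has to decide up front how many copies of the rule to push, which is exactly the guess described above.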

Non-deterministic. A non-deterministic grammar has the same characteristics as a left-recursive one. It is non-deterministic whether you should push or not. Palindrome languages are typical non-deterministic examples, but not the only ones. In a palindrome language, you don’t know whether you should push another nonterminal onto the stack or use the token you are seeing to help you pop your way back up the stack. If you make the wrong choice, you again misparse the input.

Ambiguous. Again the problem is similar. In this case, there are two possible parses: one which pushes one nonterminal and successfully parses the input, and another which doesn’t (possibly pushing another non-terminal instead, either now or later in the parse). Either one will yield a correct parse. Now, in the ambiguous case, pushing the nonterminal will not necessarily cause a parsing error; you will simply choose one of the potential parses while ignoring the other. If your semantics require that the other parse be chosen, the problem will rear its head later. Note, of course, that most ambiguous grammars are also non-deterministic.

Now, if you look at those cases, you can see that if you could somehow both push and not push the nonterminal onto the stack, you could parse the input with the grammar. And, in the ambiguous case, you could produce a set of parses that matched the input. There are techniques that do that; I believe they are called GLL (generalized LL). The equivalent technique with an LR parser generator is called GLR. The resulting output is often considered a “parse forest” (or sometimes a parse dag, directed acyclic graph).
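The "both push and not push" idea can be sketched with plain exhaustive search. This is not the GLL algorithm itself (which, as far as I know, shares work between alternatives through a graph-structured stack); it is only a small illustration that trying every alternative yields a set of parses. The grammar S -> "a" S | "a" "a" S | ε is a hypothetical ambiguous grammar chosen for the example, and the function names are made up.

```python
def parse_S(tokens, pos):
    """Yield (tree, next_pos) for every way S can match starting at pos.

    Hypothetical ambiguous grammar:  S -> "a" S | "a" "a" S | ε
    """
    # Alternative 1: S -> "a" S
    if tokens[pos:pos + 1] == ["a"]:
        for sub, end in parse_S(tokens, pos + 1):
            yield ("a", sub), end
    # Alternative 2: S -> "a" "a" S
    if tokens[pos:pos + 2] == ["a", "a"]:
        for sub, end in parse_S(tokens, pos + 2):
            yield ("aa", sub), end
    # Alternative 3: S -> ε (matches without consuming anything)
    yield (), pos


def all_parses(tokens):
    """Keep only the parses that consume the whole input."""
    return [tree for tree, end in parse_S(tokens, 0) if end == len(tokens)]


forest = all_parses(["a", "a", "a"])
print(len(forest))  # 3
for tree in forest:
    print(tree)
```

An LL(1) parser has to commit to one of the three alternatives after seeing a single "a", whereas this sketch keeps all of them alive. Note that it would still loop forever on a left-recursive rule; real generalized parsers need extra machinery for that.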

[Note: I saw the above question first on Quora and this answer is copied from there.]

**DjaouadNM** · Accepted Answer · 2019-01-06T19:19:21.360000

I have done some more research, and I think I've found a solution for the 1st and 2nd questions. As for the 3rd one, I found an existing solution for it here on SO. The proof attempts are written below:

We start by writing the three rules of the definition of an LL(1) grammar:

For every production A -> α | β with α ≠ β:

1. FIRST(α) ∩ FIRST(β) = Ø.
2. If β =>* ε then FIRST(α) ∩ FOLLOW(A) = Ø (also, if α =>* ε then FIRST(β) ∩ FOLLOW(A) = Ø).
3. Including ε in rule (1) implies that at most one of α and β can derive ε.
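The three conditions above can be checked mechanically. Below is a rough sketch, not a reference implementation: the grammar encoding (a dict from nonterminal to a list of alternatives, each a tuple of symbols, with () standing for ε) and all function names are my own assumptions. It computes FIRST and FOLLOW by fixed-point iteration and reports which condition fails for which nonterminal.

```python
from itertools import combinations

EPS, END = "ε", "$"   # ε marker inside FIRST sets, end-of-input marker


def first_of(seq, first):
    """FIRST of a string of symbols, given FIRST sets of the nonterminals."""
    out = set()
    for sym in seq:
        sym_first = first.get(sym, {sym})    # a terminal's FIRST is itself
        out |= sym_first - {EPS}
        if EPS not in sym_first:
            return out
    out.add(EPS)                             # every symbol in seq can vanish
    return out


def first_sets(grammar):
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:                           # iterate until nothing grows
        changed = False
        for nt, alts in grammar.items():
            for alt in alts:
                new = first_of(alt, first)
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first


def follow_sets(grammar, start, first):
    follow = {nt: set() for nt in grammar}
    follow[start].add(END)
    changed = True
    while changed:
        changed = False
        for nt, alts in grammar.items():
            for alt in alts:
                for i, sym in enumerate(alt):
                    if sym not in grammar:   # only nonterminals have FOLLOW
                        continue
                    rest = first_of(alt[i + 1:], first)
                    new = rest - {EPS}
                    if EPS in rest:          # what follows sym can vanish
                        new |= follow[nt]
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow


def ll1_violations(grammar, start):
    """Return (nonterminal, condition number) pairs that break LL(1)."""
    first = first_sets(grammar)
    follow = follow_sets(grammar, start, first)
    problems = []
    for nt, alts in grammar.items():
        for a, b in combinations(alts, 2):
            fa, fb = first_of(a, first), first_of(b, first)
            if (fa - {EPS}) & (fb - {EPS}):              # first condition
                problems.append((nt, 1))
            if (EPS in fb and (fa - {EPS}) & follow[nt]) or \
               (EPS in fa and (fb - {EPS}) & follow[nt]):  # second condition
                problems.append((nt, 2))
            if EPS in fa and EPS in fb:                  # third condition
                problems.append((nt, 3))
    return problems


print(ll1_violations({"S": [("S", "a"), ("b",)]}, "S"))      # [('S', 1)]
print(ll1_violations({"A": [("a", "b"), ("a", "c")]}, "A"))  # [('A', 1)]
print(ll1_violations({"S": [("S", "a"), ()]}, "S"))          # [('S', 2)]
print(ll1_violations({"S": [("a", "S"), ()]}, "S"))          # []
```

The first and third grammars are left-recursive and the second is not left-factored; the last one, S -> a S | ε, satisfies all three conditions.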

Proposition 1: A non-factored grammar is not LL(1).

Proof:

If a grammar G is non-factored then there exists a production in G of the form:

A -> ωα1 | ωα2 | ... | ωαn

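(where αi is the i-th α, not the symbols α and i), with the αi pairwise distinct. Every alternative begins with ω, so any terminal in FIRST(ω) is also in FIRST(ωαi) for every i. Assuming ω derives at least one non-empty string, so that FIRST(ω) contains a terminal, we get: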

∩(i=1,..,n) FIRST(ωαi) ≠ Ø

which contradicts rule (1) of the definition, thus, a non-factored grammar is not LL(1). ∎

Proposition 2: A left-recursive grammar is not LL(1).

Proof:

If a grammar G is left-recursive (here, directly left-recursive) then there exists a production in G of the form:

S -> Sα | β

Three cases arise here:

1. If FIRST(β) ≠ {ε} then:

FIRST(β) ⊆ FIRST(S)

=> FIRST(β) ∩ FIRST(Sα) ≠ Ø

which contradicts rule (1) of the definition.

2. If FIRST(β) = {ε} then:

2.1. If ε ∈ FIRST(α) then:

ε ∈ FIRST(Sα)

which contradicts rule (3) of the definition.

2.2. If ε ∉ FIRST(α) then:

FIRST(α) ⊆ FIRST(S) (because β =>* ε)

=> FIRST(α) ⊆ FIRST(Sα) ........ (I)

we also know that:

FIRST(α) ⊆ FOLLOW(S) ........ (II)

by (I) and (II), we have:

FIRST(Sα) ∩ FOLLOW(S) ≠ Ø

and since β =>* ε, this contradicts rule (2) of the definition.

In every case we arrive at a contradiction, hence, a left-recursive grammar is not LL(1). ∎

Proposition 3: An ambiguous grammar is not LL(1).

Proof:

While the above proofs are mine, this one is not. It is by Kevin A. Naudé, and I got it from his answer, linked below:

https://stackoverflow.com/a/18969767/6275103
