Suppose I define a happy grammar
%name pf f
%tokentype { AB }
%error { parseError }
%token
a { A }
b { B }
%%
f :
a g a {}
| b {}
g :
b b {}
{
data AB = A | B deriving (Eq,Ord,Show)
parseError _ = error " bad "
}
If I compile this with
happy --glr
I am interested in generally in grammars with non-trivial ambiguities; however, this example demonstrates the bit that is confusing me.
I get a Haskell parser. I get a success only when the token stream is a b b a or b
However, I am much more interested in failure. I would like to fail very fast, and I seem to need to more tokens than I would think would be required.
For example, if I feed the token stream a,a,a ... it takes to the third a to fail. If I feed b b b, it takes to the third b to fail. Why the extra lookahead? When matching f, once I see two 'a's, there is nothing in the grammar that can match.