I am currently implementing a JavaScript/ECMAScript 5.1 parser with JavaCC and have problems with the ArrayLiteral production:
ArrayLiteral :
    [ Elision_opt ]
    [ ElementList ]
    [ ElementList , Elision_opt ]
ElementList :
    Elision_opt AssignmentExpression
    ElementList , Elision_opt AssignmentExpression
Elision :
    ,
    Elision ,
I have three questions, which I'll ask one by one.
This is the second one.
I have simplified this production to the following form:
ArrayLiteral:
    "[" ("," | AssignmentExpression ",") * AssignmentExpression ? "]"
Please see the first question on whether it is correct or not:
How to simplify JavaScript/ECMAScript array literal production?
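As a rough sanity check of the simplified form (not a proof), both productions can be abstracted to regular expressions over a two-letter alphabet and compared on all short inputs. In the sketch below, `e` stands for one complete AssignmentExpression and `,` for a comma. The class and method names are made up for illustration.

```java
import java.util.regex.Pattern;

class ArrayLiteralShapeCheck {
    // Simplified form: "[" ("," | AssignmentExpression ",")* AssignmentExpression? "]"
    static final Pattern SIMPLIFIED = Pattern.compile("\\[(?:,|e,)*e?\\]");

    // ElementList : Elision_opt e | ElementList , Elision_opt e
    static final String ELEMENT_LIST = ",*e(?:,,*e)*";

    // Spec form: [ Elision_opt ] | [ ElementList ] | [ ElementList , Elision_opt ]
    static final Pattern SPEC = Pattern.compile(
            "\\[(?:,*|" + ELEMENT_LIST + "|" + ELEMENT_LIST + ",,*)\\]");

    // Enumerates every string over {',', 'e'} up to maxLen, wrapped in brackets,
    // and reports whether both patterns accept exactly the same ones.
    static boolean agreeUpTo(int maxLen) {
        for (int len = 0; len <= maxLen; len++) {
            for (int bits = 0; bits < (1 << len); bits++) {
                StringBuilder sb = new StringBuilder("[");
                for (int i = 0; i < len; i++) {
                    sb.append(((bits >> i) & 1) == 0 ? ',' : 'e');
                }
                String s = sb.append(']').toString();
                if (SIMPLIFIED.matcher(s).matches() != SPEC.matcher(s).matches()) {
                    return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("patterns agree up to length 12: " + agreeUpTo(12));
    }
}
```

If the two patterns disagreed on any string up to the given length, `agreeUpTo` would return false, which would point at a mistake either in the simplification or in this abstraction of it.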
Now I have tried to implement it in JavaCC as follows:
void ArrayLiteral() :
{
}
{
    "["
    (
        ","
    |   AssignmentExpression()
        ","
    ) *
    (
        AssignmentExpression()
    ) ?
    "]"
}
JavaCC complains that the choice between "," and AssignmentExpression (that is, the tokens it can begin with) is ambiguous. Obviously, a LOOKAHEAD specification is required. I have spent a lot of time trying to figure the LOOKAHEADs out and tried different things, such as
LOOKAHEAD (AssignmentExpression() ",") in (...)*
LOOKAHEAD (AssignmentExpression() "]") in (...)?
and a few other variations, but I could not get rid of the JavaCC warning.
I fail to understand why this does not work:
void ArrayLiteral() :
{
}
{
    "["
    (
        LOOKAHEAD ("," | AssignmentExpression() ",")
        ","
    |   AssignmentExpression()
        ","
    ) *
    (
        LOOKAHEAD (AssignmentExpression() "]")
        AssignmentExpression()
    ) ?
    "]"
}
OK, AssignmentExpression() on its own is ambiguous, but the trailing "," or "]" in the LOOKAHEADs should make it clear which of the choices should be taken. Or am I mistaken here?
What would a correct LOOKAHEAD specification for this production look like?
Update
This did not work, unfortunately:
void ArrayLiteral() :
{
}
{
    "["
    (
        ","
    |
        LOOKAHEAD (AssignmentExpression() ",")
        AssignmentExpression()
        ","
    ) *
    (
        AssignmentExpression()
    ) ?
    "]"
}
Warning:
Warning: Choice conflict in (...)* construct at line 6, column 5.
Expansion nested within construct and expansion following construct
have common prefixes, one of which is: "function"
Consider using a lookahead of 2 or more for nested expansion.
Line 6 is ( before the first LOOKAHEAD. The common prefix "function" is simply one of the possible starts of AssignmentExpression.
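The common prefix named in the warning can be pictured with a hand-written recognizer. On a token such as "function", one token of lookahead cannot say whether the expression belongs to the loop or is the trailing element, so the sketch below parses the expression first and only then branches on "," versus "]". This is plain Java rather than JavaCC, a single `e` token is a hypothetical stand-in for AssignmentExpression, and all names are invented for illustration.

```java
import java.util.Arrays;
import java.util.List;

class ArrayLiteralSketch {
    private final List<String> tokens;
    private int pos = 0;

    ArrayLiteralSketch(List<String> tokens) {
        this.tokens = tokens;
    }

    private String peek() {
        return pos < tokens.size() ? tokens.get(pos) : "<EOF>";
    }

    private void expect(String t) {
        if (!peek().equals(t)) {
            throw new IllegalStateException("expected " + t + " but found " + peek());
        }
        pos++;
    }

    // Hypothetical stand-in: one "e" token plays the role of AssignmentExpression.
    private void assignmentExpression() {
        expect("e");
    }

    void arrayLiteral() {
        expect("[");
        while (!peek().equals("]")) {
            if (peek().equals(",")) {
                pos++;                  // a comma with no element before it (elision)
            } else {
                assignmentExpression(); // parsed once, whichever branch it belongs to
                if (peek().equals(",")) {
                    pos++;              // element followed by a separator: stay in the loop
                } else {
                    break;              // trailing element: only "]" may follow
                }
            }
        }
        expect("]");
    }

    // True if the whole token sequence is one array literal of the simplified shape.
    static boolean accepts(String... tokens) {
        ArrayLiteralSketch p = new ArrayLiteralSketch(Arrays.asList(tokens));
        try {
            p.arrayLiteral();
            return p.pos == tokens.length;
        } catch (IllegalStateException ex) {
            return false;
        }
    }
}
```

The decision that needed unbounded lookahead in the grammar above becomes a one-token decision here because it is postponed until after the shared prefix has been consumed. A left-factored grammar would follow the same idea, though whether JavaCC accepts a particular factoring without warnings would have to be tried.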
Here is yet another approach. It has the advantage of identifying which commas indicate undefined elements without using any semantic actions.