Is E4X specification wrong?


Does anybody know ECMAScript for XML (E4X; ECMA-357, 2nd edition) and has read its specification? It's deprecated, I know. Before I get to my question, let me point out a few things:

E4X allows expressions like <x/>, thus extending PrimaryExpression with an XMLInitialiser | XMLListInitialiser production.

However, they say:

The syntactic grammar for XML initialisers processes input elements produced by the lexical grammar goal symbols InputElementXMLTag and InputElementXMLContent. These input elements are described in section 8.3.

(and same for XML list.)

So, isn't XMLInitialiser supposed to also process InputElementRegExp? The specification tries to reject that. Please check these definitions:

XMLInitialiser :

  • XMLMarkup
  • XMLElement

XMLElement :

  • < XMLTagContent XMLWhitespace_opt />

and also,

InputElementXMLContent ::

  • <
  • and more...

They all make < available, but then how would one parse other expressions that need InputElementRegExp?


There are 2 best solutions below

AudioBubble On

This just gave me a headache. But on re-reading section 8, I found an example where

order = <{x}>{item}</{x}>;

is equivalent to

(table from section 8 of the specification; posted as an image and not reproduced here)

I believe the < production in the InputElementXMLContent was a mistake.

Michael Dyck On

So, isn't XMLInitialiser supposed to also process InputElementRegExp?

Yes, XMLInitialiser (or XMLListInitialiser) would consume XMLMarkup elements matched by InputElementRegExp.

The specification tries to reject that.

How so?

They all make < available, but then how would one parse other expressions that need InputElementRegExp? [In comment:] ... it'd cause a conflict for scanning an InputElementRegExp as well as an InputElementXMLContent, this is because the < symbol comes first in XMLInitialiser or XMLListInitialiser.

If you're saying that there's a conflict between InputElementRegExp and InputElementXMLContent because they both derive '<': No, that's not a conflict, because those two symbols are never used at the same point in the input. See the third paragraph in section 8:

"The InputElementXMLContent is used in those syntactic contexts where the literal contents of an XML element are permitted. The InputElementRegExp symbol is used in all other syntactic grammar contexts."

That is, in any given syntactic context, there should be exactly one lexical goal symbol (InputElementSomething) to use.
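To make the "one goal symbol per context" point concrete, here is a minimal sketch of a lexer whose caller passes in the goal symbol. The goal names come from section 8; the function name `nextInputElement`, the returned object shape, and the reduced set of productions are assumptions made purely for illustration (XMLMarkup, whitespace, comments and most tokens are left out).

```javascript
// Hypothetical sketch, not the spec's algorithm: the parser decides the goal,
// so the lexer never has to choose between two goals at the same position.
function nextInputElement(src, pos, goal) {
  const rest = src.slice(pos);
  let m;
  if (goal === "InputElementXMLContent") {
    // Inside element content: "</", "<", "{", or a run of literal text.
    if (rest.startsWith("</")) return { kind: "</", text: "</" };
    if (rest.startsWith("<")) return { kind: "<", text: "<" };
    if (rest.startsWith("{")) return { kind: "{", text: "{" };
    m = /^[^<{]+/.exec(rest);
    if (m) return { kind: "XMLText", text: m[0] };
  } else if (goal === "InputElementRegExp") {
    // Ordinary ECMAScript context (tiny subset of the real token set).
    m = /^\/(?:[^\/\\\n]|\\.)+\/[a-z]*/.exec(rest);
    if (m) return { kind: "RegularExpressionLiteral", text: m[0] };
    if (rest.startsWith("<")) return { kind: "Punctuator", text: "<" };
    m = /^[A-Za-z_$][\w$]*/.exec(rest);
    if (m) return { kind: "Identifier", text: m[0] };
  }
  return null; // nothing matched under this goal
}

// Same characters, different goal, different input element:
const asScript = nextInputElement("/ab/ text", 0, "InputElementRegExp");
const asContent = nextInputElement("/ab/ text", 0, "InputElementXMLContent");
console.log(asScript.kind, asContent.kind);
```

Both goals can derive `<` here, yet there is no ambiguity, because a given call only ever consults one branch.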

On the other hand, if you're saying that there's a conflict between XMLInitialiser and XMLListInitialiser because they both derive sentences beginning with <: Well, yes, that would be a problem for an LL(1) parser, but an LL(2) or LR(1) parser could handle it, I believe.
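As a rough illustration of why a second token of lookahead is enough, the sketch below picks a production from the first two input elements. It assumes, from my reading of the grammar, that an XML list initialiser opens with `<` immediately followed by `>` (as in `<></>`), and it takes an already-lexed array of strings; `chooseProduction` and that input format are hypothetical, and the XMLMarkup alternative is ignored.

```javascript
// Hypothetical two-token lookahead; `tokens` is an array of already-lexed strings.
function chooseProduction(tokens) {
  const [first, second] = tokens;
  if (first !== "<") return "neither";
  // One token ("<") cannot decide; the second one can.
  return second === ">" ? "XMLListInitialiser" : "XMLInitialiser";
}

console.log(chooseProduction(["<", ">", "</", ">"]));
console.log(chooseProduction(["<", "A", "/>"]));
```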

I believe the < production in the InputElementXMLContent was a mistake.

I don't think so. Consider the example <A><B/></A>. After <A>, the syntactic parser is expecting XMLElementContent or </, which means that the lexical parser must use InputElementXMLContent as the goal symbol. So the latter must be able to match < in order for this parse to succeed.
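The walk through <A><B/></A> can be traced with a toy recursive-descent parser that records which goal symbol it requested for each input element. This is only a sketch under heavy simplifications (tag names only; no attributes, no {expressions}, no whitespace, no XMLMarkup), and `traceGoals` and its helpers are made-up names, not anything from the specification.

```javascript
// Hypothetical, heavily simplified sketch of goal selection during parsing.
function traceGoals(src) {
  const trace = [];
  let pos = 0;

  // Scan one input element under the goal symbol the parser asks for.
  function scan(goal) {
    const rest = src.slice(pos);
    let m = null;
    if (goal === "InputElementXMLTag") {
      m = /^(\/>|>|[A-Za-z_][\w.-]*)/.exec(rest);
    } else if (goal === "InputElementXMLContent") {
      m = /^(<\/|<|[^<{]+)/.exec(rest);
    } else {
      m = /^</.exec(rest); // InputElementRegExp, reduced to the one token needed
    }
    if (!m) throw new SyntaxError(`no ${goal} at offset ${pos}`);
    pos += m[0].length;
    trace.push(`${goal}: ${m[0]}`);
    return m[0];
  }

  // Parses the rest of an element once its opening "<" has been consumed.
  function elementAfterLt() {
    const name = scan("InputElementXMLTag");
    const close = scan("InputElementXMLTag");
    if (close === "/>") return; // empty element such as <B/>
    if (close !== ">") throw new SyntaxError("expected > or />");
    for (;;) {
      // Element content: this is where "<" must be matchable.
      const t = scan("InputElementXMLContent");
      if (t === "</") break;
      if (t === "<") elementAfterLt(); // nested element; otherwise t is text
    }
    if (scan("InputElementXMLTag") !== name) throw new SyntaxError("mismatched end tag");
    if (scan("InputElementXMLTag") !== ">") throw new SyntaxError("expected >");
  }

  scan("InputElementRegExp"); // the initial "<" appears in ordinary expression context
  elementAfterLt();
  return trace;
}

const goalTrace = traceGoals("<A><B/></A>");
console.log(goalTrace.join("\n"));
```

In the trace, the `<` that opens `<B/>` is requested under InputElementXMLContent; if that goal could not match `<`, the `scan` call inside the content loop would have nothing to return and the parse would fail.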

[from comment:] XMLElement starts with a < terminal and may nest other elements, thus, it should be InputElementRegExp IMO.

Again, see section 8. The two sentences I quoted above indicate that, after seeing <A>, you must use InputElementXMLContent to get the next input element.