I am interested in TEI, a semantically rich, XML-based language. I believe that if it could be encoded in HTML in a round-trippable manner, it could be authored in web-based HTML editors, stored on HTML-based wikis (at least those that support the necessary semantic mechanisms), and so on.
I would like to know whether RDFa would work as a mechanism for fully representing an XML dialect (or multiple ones) within HTML5, with the standard being round-trippability and awareness of the hierarchical nature of XML elements (and its other critical aspects like attributes).
I know one might be able to overload data-* attributes, Microformats, or Microdata, but none of these options can fully represent an XML dialect, hierarchy included, while also being free of spec warnings that the mechanism is not intended for use by software independent of the site (a problem if, e.g., one wished to build a search engine that searches such embedded XML in a hierarchically aware manner).
If RDFa won't work, I think the best option might be data-* attributes, as one can easily do something like this to represent XML:
<div data-xml-ns="http://www.tei-c.org/ns/1.0"
data-xml-ns-html="http://www.w3.org/1999/xhtml" data-xml-element="div1"
data-xml-attribute-value="xml:id=myDiv1ID">Some TEI div1 content
and <div data-xml-element="div2">some div2 content</div></div>
(Not a good example of semantic richness, I know; it just shows the nature of the encoding.)
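To make the round-trip requirement concrete, here is a minimal Python sketch of how such data-* markup could be read back into an XML tree. The attribute names follow the example above, but their exact semantics are my own assumptions rather than anything defined by a spec: data-xml-ns sets the namespace for an element and its descendants, data-xml-attribute-value holds a single name=value pair, and the markup has one annotated root and no void elements. The class and function names are hypothetical.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

XML_NS = "http://www.w3.org/XML/1998/namespace"


class DataAttrToXml(HTMLParser):
    """Rebuild an XML tree from HTML annotated with data-xml-* attributes."""

    def __init__(self):
        super().__init__()
        self.xml_root = None
        # One (nearest XML element or None, namespace in scope) per open HTML tag.
        self.xml_stack = []

    def _current(self):
        return self.xml_stack[-1] if self.xml_stack else (None, None)

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        parent, ns = self._current()
        ns = attrs.get("data-xml-ns", ns)  # assumed to be inherited by children
        name = attrs.get("data-xml-element")
        if name is None:
            # Plain HTML wrapper: contributes no XML node of its own.
            self.xml_stack.append((parent, ns))
            return
        elem = ET.Element(f"{{{ns}}}{name}" if ns else name)
        pair = attrs.get("data-xml-attribute-value")
        if pair:
            key, _, value = pair.partition("=")
            if key.startswith("xml:"):
                key = f"{{{XML_NS}}}{key[4:]}"  # e.g. xml:id
            elem.set(key, value)
        if parent is None:
            self.xml_root = elem
        else:
            parent.append(elem)
        self.xml_stack.append((elem, ns))

    def handle_endtag(self, tag):
        if self.xml_stack:
            self.xml_stack.pop()

    def handle_data(self, data):
        current, _ = self._current()
        if current is None:
            return
        if len(current):
            # Text that follows a child element belongs in that child's tail.
            last = current[-1]
            last.tail = (last.tail or "") + data
        else:
            current.text = (current.text or "") + data


def html_to_xml(markup):
    parser = DataAttrToXml()
    parser.feed(markup)
    parser.close()
    return parser.xml_root


SAMPLE = (
    '<div data-xml-ns="http://www.tei-c.org/ns/1.0" data-xml-element="div1" '
    'data-xml-attribute-value="xml:id=myDiv1ID">Some TEI div1 content and '
    '<div data-xml-element="div2">some div2 content</div></div>'
)

root = html_to_xml(SAMPLE)
print(ET.tostring(root, encoding="unicode"))
```

Because the XML names live in attribute values rather than tag names, the HTML parser's lowercasing of tag and attribute names does not affect them, which matters for camelCase TEI element names.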
But again, I'd prefer to avoid the limitations placed on this mechanism as stated in the HTML spec:
"These attributes are not intended for use by software that is independent of the site that uses the attributes"
"these attributes are intended for use by the site's own scripts, and are not a generic extension mechanism for publicly-usable metadata."
If RDFa will work for this, I would appreciate an example of how, e.g., the example above might be encoded to preserve the hierarchical relationships, etc.
It should work, but it won't be "HTML5" per se. You can use something like DOMImplementation (exposed in browsers as document.implementation), which lets you create and manipulate a document as you wish, including defining your own document type if you want, to handle special tags and attributes as needed. DOMParser is another option, but it is not as capable as DOMImplementation. You should also be able to render either one inside an iframe, so that the iframe serves as a view while the parent page does the manipulation via regular HTML methods.
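As a rough illustration of the DOMImplementation idea outside the browser, Python's xml.dom.minidom exposes the same W3C DOM interface, so the TEI fragment from the question can be built as a real namespaced document. This is only a sketch: in a browser the corresponding entry point would be document.implementation.createDocument, and the function name build_tei_fragment is hypothetical.

```python
from xml.dom.minidom import getDOMImplementation

TEI_NS = "http://www.tei-c.org/ns/1.0"
XML_NS = "http://www.w3.org/XML/1998/namespace"


def build_tei_fragment():
    """Create a namespaced TEI document through the DOMImplementation interface."""
    impl = getDOMImplementation()
    # createDocument(namespaceURI, qualifiedName, doctype); no doctype here.
    doc = impl.createDocument(TEI_NS, "div1", None)
    div1 = doc.documentElement
    # minidom does not emit namespace declarations on its own when serializing,
    # so the default namespace is declared explicitly.
    div1.setAttribute("xmlns", TEI_NS)
    div1.setAttributeNS(XML_NS, "xml:id", "myDiv1ID")
    div1.appendChild(doc.createTextNode("Some TEI div1 content and "))

    div2 = doc.createElementNS(TEI_NS, "div2")
    div2.appendChild(doc.createTextNode("some div2 content"))
    div1.appendChild(div2)
    return doc


doc = build_tei_fragment()
print(doc.toxml())
```

The resulting document keeps the element hierarchy, namespace, and xml:id attribute as first-class DOM data instead of encoding them in HTML attributes.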