General overview
This question follows up on this other one, in order to complete it.
The goal is to convert a flat hierarchy of <h[1-6]> elements into a nested <ul>-<li> structure, in order to build a TOC.
For the moment, my stylesheet does this, BUT it does not handle gaps in the hierarchy at all, for example when a document jumps from <h2> to <h4> without an intermediate <h3> between them.
Current state
document.xml
<?xml version="1.0" encoding="UTF-8"?>
<document>
<h1>Lorem <i>arepo</i> ipsum dolor</h1>
<h2>Lorem ipsum dolor</h2>
<p>
Sed ut <i>perspiciatis</i> unde omnis iste natus error sit voluptatem accusantium doloremque laudantium, totam rem aperiam, eaque ipsa quae ab illo inventore veritatis et quasi architecto beatae vitae dicta sunt explicabo. Nemo enim ipsam voluptatem quia voluptas sit aspernatur aut odit aut fugit, sed quia consequuntur magni dolores eos qui ratione voluptatem sequi nesciunt. Neque porro quisquam est, qui dolorem ipsum quia dolor sit amet, consectetur, adipisci velit, sed quia non numquam eius modi tempora incidunt ut labore et dolore magnam aliquam quaerat voluptatem. Ut enim ad minima veniam, quis nostrum exercitationem ullam corporis suscipit laboriosam, nisi ut aliquid ex ea commodi consequatur? Quis autem vel eum iure reprehenderit qui in ea voluptate velit esse quam nihil molestiae consequatur, vel illum qui dolorem eum fugiat quo voluptas nulla pariatur?
</p>
<h1>sit amet et consectetur</h1>
<h2>Quia adipit</h2>
<h3>aliquam quaerat</h3>
<h6>-HERE-</h6>
<p>
Sed ut <i>perspiciatis</i> unde omnis iste natus error sit voluptatem accusantium doloremque laudantium, totam rem aperiam, eaque ipsa quae ab illo inventore veritatis et quasi architecto beatae vitae dicta sunt explicabo. Nemo enim ipsam voluptatem quia voluptas sit aspernatur aut odit aut fugit, sed quia consequuntur magni dolores eos qui ratione voluptatem sequi nesciunt. Neque porro quisquam est, qui dolorem ipsum quia dolor sit amet, consectetur, adipisci velit, sed quia non numquam eius modi tempora incidunt ut labore et dolore magnam aliquam quaerat voluptatem. Ut enim ad minima veniam, quis nostrum exercitationem ullam corporis suscipit laboriosam, nisi ut aliquid ex ea commodi consequatur? Quis autem vel eum iure reprehenderit qui in ea voluptate velit esse quam nihil molestiae consequatur, vel illum qui dolorem eum fugiat quo voluptas nulla pariatur?
</p>
<h2>Erit et nunquam</h2>
<h3>corporis suscipit</h3>
</document>
maketoc.xslt
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="html"/>
<xsl:key name="child" match="h2|h3|h4|h5|h6" use="generate-id(preceding-sibling::*[name()=concat('h', substring-after(name(current()), 'h') - 1)][1])"/>
<xsl:template match="/document">
<ul>
<xsl:apply-templates select="h1"/>
</ul>
</xsl:template>
<xsl:template match="h1|h2|h3|h4|h5|h6">
<li>
<span>
<xsl:copy-of select="node()" />
</span>
<xsl:variable name="childElements" select="key('child', generate-id())"/>
<xsl:if test="$childElements">
<ul>
<xsl:apply-templates select="$childElements"/>
</ul>
</xsl:if>
</li>
</xsl:template>
</xsl:stylesheet>
Current output
<ul>
<li><span>Lorem <i>arepo</i> ipsum dolor</span><ul>
<li><span>Lorem ipsum dolor</span></li>
</ul>
</li>
<li><span>sit amet et consectetur</span><ul>
<li><span>Quia adipit</span><ul>
<li><span>aliquam quaerat</span></li>
</ul>
</li>
<li><span>Erit et nunquam</span><ul>
<li><span>corporis suscipit</span></li>
</ul>
</li>
</ul>
</li>
</ul>
In this example, the elements are correctly nested according to their depth, except the node <h6>-HERE-</h6>, which is dropped because it is not preceded by an <h5> (and even if it were preceded by an <h5>, that <h5> would itself have to be preceded by an <h4> in our example). The document.xml moves directly from <h3> to <h6>.
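To see which headings fall through, here is a small diagnostic sketch (not part of maketoc.xslt, and only an illustration). It assumes the same flat layout as document.xml, with all headings as direct children of <document>, and it lists every heading that has no preceding sibling exactly one level up, which is the only parent the current key looks for.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/document">
    <xsl:for-each select="h2|h3|h4|h5|h6">
      <!-- Name of the only parent the current key accepts, e.g. 'h5' for an h6 -->
      <xsl:variable name="parentName"
          select="concat('h', substring-after(name(), 'h') - 1)"/>
      <!-- No such preceding sibling: the key value is generate-id() of an
           empty node-set, so the heading is never reached from any <li> -->
      <xsl:if test="not(preceding-sibling::*[name() = $parentName])">
        <xsl:value-of
            select="concat(name(), ' has no preceding ', $parentName, ': ', ., '&#10;')"/>
      </xsl:if>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Run against document.xml above, this should report only the <h6>-HERE-</h6> heading.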
Desired output
<ul>
<li><span>Lorem <i>arepo</i> ipsum dolor</span><ul>
<li><span>Lorem ipsum dolor</span></li>
</ul>
</li>
<li><span>sit amet et consectetur</span><ul>
<li><span>Quia adipit</span><ul>
<li><span>aliquam quaerat</span><ul>
<li><span>-HERE-</span></li> <!-- This is the relevant behavior -->
</ul>
</li>
</ul>
</li>
<li><span>Erit et nunquam</span><ul>
<li><span>corporis suscipit</span></li>
</ul>
</li>
</ul>
</li>
</ul>
The problem
So, as I said, the problem is that the hierarchical gaps are not processed: headings that follow a gap are silently ignored, BUT they should be included in the final TOC.
In my modest opinion, the problem comes from this line:
<xsl:key name="child" match="h2|h3|h4|h5|h6" use="generate-id(preceding-sibling::*[name()=concat('h', substring-after(name(current()), 'h') - 1)][1])"/>
Especially from this part, substring-after(name(current()), 'h') - 1, with the subtraction operator, because it explicitly matches only the heading exactly one level up (a lower number, so higher in the <h[1-6]> hierarchy in HTML). It would be useful if it were something like a <= comparison against that value, which could match the immediately preceding tag that is hierarchically higher.
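One possible way to express that idea, as a minimal sketch rather than a definitive fix: key each heading on the nearest preceding sibling heading whose level number is strictly lower, whatever the size of the gap. It assumes the flat layout of document.xml (all headings are siblings) and that a heading's parent is always that nearest higher-ranked heading. Only the key changes; the templates are the ones from maketoc.xslt.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>

  <!-- Assumption: the parent is the nearest preceding sibling heading with a
       strictly lower level number. In XPath 1.0 the < operator converts both
       operands to numbers, so '3' < '6' compares 3 and 6. -->
  <xsl:key name="child" match="h2|h3|h4|h5|h6"
      use="generate-id(preceding-sibling::*
             [self::h1 or self::h2 or self::h3 or self::h4 or self::h5]
             [substring(name(), 2) &lt; substring(name(current()), 2)][1])"/>

  <xsl:template match="/document">
    <ul>
      <xsl:apply-templates select="h1"/>
    </ul>
  </xsl:template>

  <xsl:template match="h1|h2|h3|h4|h5|h6">
    <li>
      <span>
        <xsl:copy-of select="node()"/>
      </span>
      <!-- Headings whose nearest higher-ranked predecessor is this one -->
      <xsl:variable name="childElements" select="key('child', generate-id())"/>
      <xsl:if test="$childElements">
        <ul>
          <xsl:apply-templates select="$childElements"/>
        </ul>
      </xsl:if>
    </li>
  </xsl:template>
</xsl:stylesheet>
```

With this key, <h6>-HERE-</h6> is keyed on the <h3> just before it, so it is nested under "aliquam quaerat". Because the [1] predicate is applied on the reverse preceding-sibling axis after the level filter, the lookup stops at the closest higher-ranked heading and does not reach back into an earlier section. One limitation of this sketch: the root template still selects only h1, so headings that appear before the first <h1> would not be reached.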
Related threads
- Flat structure with gaps to nested/hierarchical using XSLT 1.0. A solution was found there, but the structure is so different from mine that I can't adapt it easily.
- Converting flat hierarchy to nested hierarchy in XSLT depth
The question
How can I build a nested <ul>-<li> hierarchy of heading nodes which also handles the hierarchical gaps?
I think the solution could simply be: