I'm using HtmlAgilityPack to get a filtered DOM of <h2>
and <h3>
nodes and using Xpath 1.0 (from my Xpath 1.0 crash course this week) I need to get the number of <h3>
's (the number varies) that are between sibling <h2>
's as follows:
<div>
<h2>heading 1</h2>
<h3>sub 1.1</h3>
<h3>sub 1.2</h3>
<h2>heading 2</h2>
<h3>sub 2.1</h3>
<h2>heading 3</h2>
....
</div>
When I iterate (using C#) through the filtered nodes I want the exact number of <h3>
's that are after a <h2>
and before the next <h2>
. When I use the following I get all the <h3>
's as the result.
int countH3 = n.SelectNodes("./preceding-sibling::h2[2]/following-sibling::h2[3]/preceding-sibling::h3").Count(); //the [position] is set dynamically
For the node structure above would like the result of the code line to be:
countH3 = 1
but it is:
countH3 = 3
I've found many similar SO questions regarding "sibling nodes between sibling nodes" and have to thank @LarsH for his comment in another question that /preceding::h3
returns ALL <h3>'s
which helped explain the issue. I think I may need to use the Kayessian method of node-set intersection but get the "invalid token" error when I include the . |
union character as follows:
countH3 = n.SelectNodes("./h2[2]/following-sibling::h2[3]
[count(.|./h2[2]/following-sibling::h2[3]/preceding-sibling::h3)=
count(./h2[2]/following-sibling::h2[3]/preceding-sibling::h3)]").Count();
Any suggestions appreciated.