How can I get all the elements that follow a given one, for example:
<div id="exemple">
<h2 class="target">foo</h2>
<p>bla bla</p>
<ul>
<li>bar1</li>
<li>bar2</li>
<li>bar3</li>
</ul>
<h4>baz</h4>
<ul>
<li>lot</li>
</ul>
<div>of</div>
<p>possible</p>
<p>tags</p>
<a href="#">after</a>
</div>
I need to find <h2 class="target"> and collect every tag up to the next <h4>, excluding the <h4> itself AND all the tags after it. If there is no <h4>, I have to collect every tag up to the end of the parent (here: the end of the <div>).
The content is dynamic and unpredictable. The only rule is: we know there is a target and there is an <h4> (or the end of the parent element). I need to get all the tags between the two and exclude all the others.
With this example I need to get the following HTML:
<h2 class="target">foo</h2>
<p>bla bla</p>
<ul>
<li>bar1</li>
<li>bar2</li>
<li>bar3</li>
</ul>
So far I can get the target with: target = page.at('#exemple .target')
I know about the next_sibling method, but how can I test the tag type of the current node?
I am thinking of something like this to walk the node tree:
html = ''
while not target.is_a? 'h4'
html << target.inner_html
target = target.next_sibling
How can I do this?
You can subtract the ones you don't want from your nodeset:
It might make more sense to use XPath, but this is something I can write without googling.
Your idea of iterating over next_sibling can work too: