How to select text between two different tags in the DOM?

I have a <p> tag which contains text separated by <br> tags like this:

<p>
    <small>Some text here</small>
    This is the text that want to remove
    <br> another text here
    <br> more text here
    <br> also there are other tags like <em>this one</em>
</p>

I want to select everything after the first <br> tag up to the end of the <p>. I'm currently using the QueryPath library, but I only get the HTML elements and the text inside them; the text that is not wrapped in a tag is missing.

For example, this code returns only the <br> tags and the <em></em> tag:

$qp->find('div > p')->children('br')->eq(0)->nextAll();

So I tried to take the whole <p> tag and remove everything from the <small> tag up to the first <br> tag:

// remove the text after the small tag
$qp->branch('div > p')->children('small')->textAfter(''); // didn't work

// although when I return the textAfter I get the text
// so setting it to an empty string didn't work

// I can only remove the small tag
$qp->branch('div > p')->children('small')->remove();

The QueryPath library is a wrapper around PHP's native DOM extension, so any solution that uses the DOM extension will work.

There is 1 best solution below

The QueryPath methods used to select nodes (e.g. nextAll() or children()) return only element nodes, but the content between the <br/> elements consists of text nodes.
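
To see the difference between element nodes and text nodes, the following minimal sketch lists every child of a <p> with the native DOM extension. The shortened, single-line markup is an assumption made for illustration; it is not the exact document from the question.

```php
<?php
// Illustration only: shortened markup, not the exact document from the question.
$dom = new DOMDocument();
$dom->loadXML('<p><small>Some text here</small> remove me <br/> another text here <br/> more text here</p>');

$types = [];
foreach ($dom->documentElement->childNodes as $child) {
    // XML_TEXT_NODE marks bare text; every other child here is an element.
    $types[] = $child->nodeType === XML_TEXT_NODE ? 'text' : $child->nodeName;
}

echo implode(', ', $types), "\n";
// small, text, br, text, br, text
```

The bare text shows up as separate sibling nodes next to the <br/> elements, which is why an element-only traversal skips it.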

Use the nextSibling property of DOMNode to walk through text nodes as well.

Example (using native DOM):

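<?php
$dom = new DOMDocument();

// loadXML() needs well-formed XML, hence <br/> instead of <br>.
$dom->loadXML('<p>
    <small>Some text here</small>
    This is the text that want to remove
    <br/> another text here
    <br/> more text here
    <br/> also there are other tags like <em>this one</em>
</p>');

$text = '';
// Start at the first <br/> and walk every following sibling, text nodes included.
$node = $dom->getElementsByTagName('br')->item(0);
while ($node->nextSibling) {
    $node = $node->nextSibling;
    $text .= $node->textContent;
}
echo $text;
// Output (line breaks and indentation from the source are preserved; shown collapsed here):
// another text here more text here also there are other tags like this one
?>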
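
If the markup after the first <br> should be kept (for example the <em> tag) rather than flattened to plain text, one possible variation is to delete the nodes that come before the first <br> and serialize what is left. This is a sketch under assumptions: the single-line $html string and the variable names are made up for illustration, loadHTML() is used because it tolerates the unclosed <br> tags from the question, and removing the leading <br> itself is optional.

```php
<?php
// Sketch: drop everything before the first <br>, keep the rest of the markup.
// The single-line $html string is an assumption made for illustration.
$html = '<div><p><small>Some text here</small> This is the text that want to remove'
      . '<br> another text here<br> more text here'
      . '<br> also there are other tags like <em>this one</em></p></div>';

$dom = new DOMDocument();
$dom->loadHTML($html); // the HTML parser accepts unclosed <br> tags

$p  = $dom->getElementsByTagName('p')->item(0);
$br = $p->getElementsByTagName('br')->item(0);

// Remove every node (element or text) that precedes the first <br>.
while ($p->firstChild !== null && !$p->firstChild->isSameNode($br)) {
    $p->removeChild($p->firstChild);
}
$p->removeChild($br); // optional: drop the leading <br> as well

// Serialize only the <p> element, not the whole wrapped document.
$result = $dom->saveHTML($p);
echo $result, "\n";
```

Because the loop walks firstChild rather than an element-only list, the bare text after <small> is removed together with the <small> element.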