How to get HTML site contents and manipulate its tags in PHP?

I'm trying to write a script that fetches the HTML of a site and counts the <li> items within a particular <ul>.

<html>

<head>...</head>

<body>
    ...
    <ul class="the-list">
        ...
        <li>...</li>
        ...
    </ul>
    ...
</body>

</html>

Currently I get the contents via file_get_contents(), but then I need to find this particular <ul>, parse it, and loop over its <li>s. What's the best approach for doing that?

Thanks

There are 2 best solutions below

This can be done, but the <ul> tag must have an id for the PHP snippet below to work. Note that the <ul> in the question only has a class, so it would need an id added.

Load the HTML, get the target <ul> by its id, then get the <li> tags inside it.

Like:

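// $HTML holds the page source, e.g. the result of file_get_contents()
libxml_use_internal_errors(true); // silence warnings from malformed markup
$dom = new DOMDocument;
$dom->loadHTML($HTML);
// Requires <ul id="targetUlId">; getElementById() returns null if no such id exists
$allElements = $dom->getElementById('targetUlId')->getElementsByTagName('li');
echo $allElements->length;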

This will echo the number of <li> tags inside the target <ul>, including those in any nested lists.
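
Since the <ul> in the question has a class rather than an id, one alternative is to look it up with DOMXPath. This is only a minimal sketch: it assumes the class name the-list from the question, and the inline sample markup is made up to stand in for the page you would normally fetch with file_get_contents().

```php
<?php
// Hypothetical sample markup standing in for the fetched page.
$html = <<<HTML
<html><body>
<ul class="the-list"><li>one</li><li>two</li><li>three</li></ul>
<ul class="other"><li>ignored</li></ul>
</body></html>
HTML;

// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);
$dom = new DOMDocument;
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);

// Match any <ul> whose class attribute contains the token "the-list",
// then select only its direct <li> children.
$query = '//ul[contains(concat(" ", normalize-space(@class), " "), " the-list ")]/li';
$items = $xpath->query($query);

$count = $items->length;
echo $count, "\n";

// Loop over the matched items.
foreach ($items as $li) {
    echo trim($li->textContent), "\n";
}
```

The trailing /li counts direct children only. Use //li at the end of the query instead if items in nested lists should be counted too.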

Hope this helps

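Alternatively, you can use the Simple HTML DOM library:

http://simplehtmldom.sourceforge.net/

include 'simple_html_dom.php'; // file shipped with the library
$html = file_get_html('http://www.google.com/'); // replace with the target URL
$items = $html->find('ul.the-list li'); // only <li>s inside <ul class="the-list">
echo count($items) . '<br>';

foreach ($items as $element)
       echo $element->plaintext . '<br>';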