I can't grab specific URLs from a search results page


I opened a real-estate website and searched by city name. From the results I want to grab the URLs of the buildings in Osaka City, which are listed here: http://brillia.com/search/?area=27999. There are four of them.

I'm using this code to grab the URLs:

$allDivs = $parser->getElementsByTagName('div');
foreach ($allDivs as $div) {
    if ($div->getAttribute('class') == 'boxInfomation') {
        $allLinks = $div->getElementsByTagName('a');
        foreach ($allLinks as $a) {
            $linkler[] = $a->getAttribute('href');
        }
    }
}

But I can't grab just those. Instead of only the Osaka City page URLs, I get all of them. When I view the source of the Osaka page, it shows http://brillia.com/search/, which is why I'm grabbing all the other links as well.

How can I grab only the URLs listed at http://brillia.com/search/?area=27999?

Any ideas? Thank you.


There are 2 best solutions below

1
On

Can you do this with jQuery? If so, this grabs the href of each link:

$("div h3 a").each(function () {
    var link = $(this).attr("href");
    console.log(link);
});

Here is a jsfiddle test.

2
On

The parser relies on libxml to extract elements, but that page uses HTML5 heavily, omitting certain closing tags and so on, which isn't strict XML. libxml therefore struggles to "correct mistakes" by guessing where the missing tags should close, and returns wrong results.
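To see how the stock parser actually interprets a page, it can help to collect libxml's parse warnings and run the query on a small sample first. The following is only a sketch: the sample markup, the extra `clearfix` class, and the `$parseErrors` variable are assumptions for illustration and are not taken from the real page. It also matches `boxInfomation` as one class among several through XPath, since comparing the whole `class` attribute with `==` misses elements that carry more than one class.

```php
<?php
// Hypothetical sample markup; the structure of the real page is an assumption.
$html = <<<HTML
<!DOCTYPE html>
<html><body>
<div class="boxInfomation clearfix"><h3><a href="/osaka/a/">A</a></h3></div>
<div class="boxInfomation"><h3><a href="/osaka/b/">B</a></h3></div>
<div class="other"><a href="/elsewhere/">C</a></div>
</body></html>
HTML;

// Collect parse warnings instead of printing them, so they can be inspected.
libxml_use_internal_errors(true);
$parser = new DOMDocument();
$parser->loadHTML($html);
$parseErrors = libxml_get_errors(); // array of LibXMLError objects
libxml_clear_errors();

// Match "boxInfomation" as a whole word inside the class attribute.
$xpath = new DOMXPath($parser);
$query = "//div[contains(concat(' ', normalize-space(@class), ' '), ' boxInfomation ')]//a[@href]";

$linkler = [];
foreach ($xpath->query($query) as $a) {
    $linkler[] = $a->getAttribute('href');
}

print_r($linkler);
```

If `$parseErrors` turns out to be long for the real page, or the links end up under the wrong parent in the resulting tree, that suggests the parser is guessing at the structure, as described above.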

You need a parser with HTML5 support, such as HTML5DOMDocument, which extends DOMDocument and should have mostly the same interface.