Web scraping for dynamic content


I am trying to scrape information from a couple of sites (mega.nz, openload.co). The content is loaded dynamically, so the code I am currently using doesn't work:

    <?php

    require 'simple_html_dom.php';

    // Fetch the raw page source with cURL (redirects are not followed).
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, "https://openload.co/f/41I9Ak_QBxw/DPLA.mp4");
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, false);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);

    $response = curl_exec($ch);
    curl_close($ch);

    echo $response;

    // Parse the fetched markup and print every matching <img> element.
    $html = new simple_html_dom();
    $html->load($response);

    foreach ($html->find('img[id=imagedisplay]') as $img) {
        echo $img;
    }

    ?>

When I use it on openload (as in the example above), it redirects me to "https://oload.download/scraping/", where "/scraping" is the folder that contains my script.

Is there any JavaScript/jQuery framework (or PHP library) that I can use to scrape the content on the fly?


There is 1 best solution below

Mike B On

It's not suitable for large-scale scraping, but in the past, when I've needed to grab some basic data from a dynamic web page, I've found that Selenium works pretty well.

Depending on your stack of choice, I'd recommend looking into headless browsers. This way you can render a page in the background and parse the resulting HTML.
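
As a rough illustration of the headless-browser idea, here is a minimal PHP sketch. It is not a definitive implementation, and it rests on several assumptions: a Chrome or Chromium binary is installed locally (the name `chromium` is only a placeholder and differs between systems), PHP's DOM extension is enabled, and the target page finishes rendering before Chrome prints its DOM. The helper names `buildDumpDomCommand` and `findImageSrcById` are made up for this example. It uses Chrome's `--headless --dump-dom` flags rather than Selenium, and the built-in `DOMDocument` rather than simple_html_dom, to stay free of extra libraries.

```php
<?php
// Sketch: let headless Chrome render the page, then parse the resulting HTML.
// Assumption: a Chrome/Chromium binary is available; 'chromium' is a placeholder name.

/**
 * Build the shell command that makes headless Chrome print the rendered DOM.
 */
function buildDumpDomCommand(string $url, string $chromeBinary = 'chromium'): string
{
    return sprintf(
        '%s --headless --dump-dom %s',
        escapeshellarg($chromeBinary),
        escapeshellarg($url)
    );
}

/**
 * Return the src attribute of the <img> with the given id, or null if none is found.
 */
function findImageSrcById(string $html, string $id): ?string
{
    if ($html === '') {
        return null; // Nothing was rendered, so there is nothing to parse.
    }

    $doc = new DOMDocument();
    // Collect parser warnings internally; real-world markup is rarely well-formed.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($doc->getElementsByTagName('img') as $img) {
        if ($img->getAttribute('id') === $id) {
            return $img->getAttribute('src');
        }
    }

    return null;
}

// Example usage (needs a local Chrome/Chromium install, so it is left commented out):
// $html = (string) shell_exec(buildDumpDomCommand('https://example.com/'));
// var_dump(findImageSrcById($html, 'imagedisplay'));
```

The point of the split is that only the first step needs a browser. Once the rendered HTML is in a string, it can be parsed the same way as a static page. Sites that load content after user interaction or behind anti-bot checks may still need a full driver such as Selenium.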