I am scraping websites using the FriendsOfPHP/Goutte package. Everything works great. I'm scraping the sites for open graph tags like image, title, etc., when a user pastes a URL into an input.
The problem occurs when a user copies the URL from a mobile device. The URL is then a mobile URL, such as https://m.datpiff.com/tape/818948, and that page has no Open Graph tags.
When I open the same URL from a desktop with the subdomain m replaced by www, e.g. https://www.datpiff.com/tape/818948, it redirects me to http://www.datpiff.com/Chance-The-Rapper-Jeremih-Merry-Christmas-Lil-Mama-mixtape.818948.html, and this desktop URL does contain Open Graph tags.
Is there a way I can get my server to force or trick the receiving server to redirect all URLs to the desktop version, so that I can use the open graph tags? The receiving server is already redirecting to the proper URL, but only if I'm typing directly from a browser on a desktop.
Here's the code I'm using - it works great. I just need to be able to redirect the URL I'm scraping to the desktop version.
First I'm replacing the m with www in my JS like so:
fullurl = fullurl.replace('//m.', '//www.');
That converts https://m.datpiff.com/tape/818948 into https://www.datpiff.com/tape/818948
Then in my PHP code I'm using something like this:
use Goutte\Client;

$url_to_scrape = $urltoscrape;
$client = new Client();
// Request the page to scrape
$crawler = $client->request('GET', $url_to_scrape);
// Read the og:image meta tag and the page title
$opengraphImage = $crawler->filterXPath('//meta[@property="og:image"]')->attr('content');
$title = $crawler->filter('title')->text();
You can set your client to follow redirect responses (HTTP status 3XX + Location header); do this right after instantiating $client.
It doesn't redirect mobile links from a desktop browser, so you still need to replace m. with www.
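A minimal sketch of how the two steps might fit together on the PHP side. The helper names toDesktopUrl() and fetchOpenGraph() are made up for illustration and are not part of Goutte. followRedirects() is the method Goutte's client inherits from Symfony BrowserKit; as far as I know redirects are already followed by default, so the call mostly makes the intent explicit. The sketch assumes the mobile site always uses an "m." host prefix and that a "www." equivalent exists, which holds for the datpiff example but is not guaranteed elsewhere.

```php
<?php
// Sketch only: toDesktopUrl() and fetchOpenGraph() are hypothetical helpers.
use Goutte\Client;

/**
 * Swap a leading "m." host label for "www."; return any other URL unchanged.
 */
function toDesktopUrl(string $url): string
{
    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host) || strpos($host, 'm.') !== 0) {
        return $url; // no host, or not a mobile-style host
    }

    // Touch only the "//host" part so "m." inside the path is left alone.
    $pos = strpos($url, '//' . $host);
    if ($pos === false) {
        return $url; // e.g. URL with user info before the host
    }

    $desktopHost = 'www.' . substr($host, 2);
    return substr_replace($url, '//' . $desktopHost, $pos, strlen($host) + 2);
}

/**
 * Fetch the desktop page and read og:image and <title>, or null if absent.
 */
function fetchOpenGraph(string $url): array
{
    $client = new Client();
    $client->followRedirects(true); // follow 3XX + Location responses

    $crawler = $client->request('GET', toDesktopUrl($url));

    $image = $crawler->filterXPath('//meta[@property="og:image"]');
    $title = $crawler->filter('title');

    return [
        'image' => $image->count() ? $image->attr('content') : null,
        'title' => $title->count() ? $title->text() : null,
    ];
}
```

Doing the host swap in PHP rather than only in JS means the server does not depend on the browser having rewritten the URL. Checking count() before attr() avoids an exception on pages that have no og:image tag at all.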