Crawler / curl sees Edge Side Includes


I'm trying to retrieve a website via curl/wget, but instead of the real content that I see in the browser, I get ESI (Edge Side Includes) tags.

The URL is http://www.patagonia.com/home/?setCountryCode=US&setLocaleCode=en_US&setLocaleCodeSelect=en

<html xmlns="http://www.w3.org/1999/xhtml" class="no-js" lang="en"><head/><body onload="submitWait();true;"><esiU00003Aremove>

</esiU00003Aremove>



<esiU00003Acomment text=" ------------- begin html ---------- ">  

<esiU00003Acomment text=" --- CUSTOMIZE HEAD HERE --- ">

  <meta charset="utf-8"/>   <meta content="IE=edge,chrome=1" http-equiv="X-UA-Compatible"/>

    <title>Hang Tight! Routing to checkout...</title> ......

I already tried it via Postman, sending only the Accept and Connection headers, and I get normal HTML back. I'm not sure what is going on. Does anybody know which header to send, or what else to do, so that wget/curl gets the page correctly?


There is 1 best solution below


Some websites serve different content when they see curl's default User-Agent. Try sending a browser-like one:

curl -v -H 'User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Firefox/45.0' 'http://www.patagonia.com/home/?setCountryCode=US&setLocaleCode=en_US&setLocaleCodeSelect=en'
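Since the question also mentions wget, here is a minimal sketch of the same idea for both tools, plus a quick check for leftover ESI markup. It assumes the site really does vary its response on the User-Agent header, which is not confirmed here. The helper names (fetch_with_curl, fetch_with_wget, has_esi_tags) are hypothetical, and the -L flag (follow redirects) is an addition that may not be needed.

```shell
#!/bin/sh
# Assumption: the server varies its response on the User-Agent header.
# The UA string is the one from the command above; any browser-like UA may work.
UA='Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Firefox/45.0'
URL='http://www.patagonia.com/home/?setCountryCode=US&setLocaleCode=en_US&setLocaleCodeSelect=en'

# Hypothetical helper: fetch a URL with curl.
# -A sets the User-Agent, -L follows redirects, -s hides the progress meter.
fetch_with_curl() {
    curl -sL -A "$UA" "$1"
}

# Hypothetical helper: fetch a URL with wget.
# -U sets the User-Agent, -q is quiet, -O - writes the body to stdout.
fetch_with_wget() {
    wget -q -U "$UA" -O - "$1"
}

# Hypothetical helper: succeeds if stdin still contains unprocessed ESI
# markup, e.g. <esi:remove> or the escaped form <esiU00003Aremove>.
has_esi_tags() {
    grep -qi '<esi'
}
```

A possible way to use it: `fetch_with_curl "$URL" | has_esi_tags && echo "still getting ESI tags"`. If the tags persist with a browser-like User-Agent, the difference is probably in some other header or cookie that the browser sends.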