Here's the URL: https://www.grammarly.com
I'm trying to fetch HTTP headers by using the native get_headers() function:
$headers = get_headers('https://www.grammarly.com');
The result is
HTTP/1.1 400 Bad Request
Date: Fri, 27 Apr 2018 12:32:34 GMT
Content-Type: text/plain; charset=UTF-8
Content-Length: 52
Connection: close
But if I do the same with the curl command-line tool, the result is different:
curl -sI https://www.grammarly.com/
HTTP/1.1 200 OK
Date: Fri, 27 Apr 2018 12:54:47 GMT
Content-Type: text/html; charset=UTF-8
Content-Length: 25130
Connection: keep-alive
What is the reason for this difference in responses? Is it some kind of poorly implemented security feature on Grammarly's server-side or something else?
It is because get_headers() uses the default stream context, which means that almost no HTTP headers are sent with the request, and many remote servers are fussy about that. The missing header most likely to cause issues is User-Agent. You can set it manually before calling get_headers() by using stream_context_set_default(). Here's an example that works for me:
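A minimal sketch of that approach follows. The User-Agent string is an arbitrary placeholder chosen for illustration, not a value the server is known to require, and whether the server answers with 200 depends on its current behavior.

```php
<?php
// Placeholder User-Agent (assumption): any reasonable, non-empty value
// is expected to work; this exact string is not required by the server.
$userAgent = 'Mozilla/5.0 (compatible; HeaderCheck/1.0)';

// Set options on the default stream context. get_headers() falls back to
// this context when no explicit context is passed to it.
stream_context_set_default([
    'http' => [
        'user_agent' => $userAgent,
    ],
]);

$headers = get_headers('https://www.grammarly.com');

if ($headers === false) {
    // get_headers() returns false on failure (e.g. network error).
    echo "Request failed\n";
} else {
    print_r($headers);
}
```

Note that stream_context_set_default() affects every later stream call in the script that relies on the default context. If that is too broad, get_headers() also accepts a context as its third argument on PHP 7.1 and later, so a context built with stream_context_create() can be passed for this one call instead.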