URL with # -> file_get_contents/sockets cuts URL after the #


I have looked for questions that may already have my answer, but I didn't find that specific problem.

When I try to get the content of a file with a '#' in the URL, it cuts the part after the #.

For example:

I try to get the content of http://steamcommunity.com/id/Schwabba/inventory/#730, but whether I download it via a socket or file_get_contents, all I get is the content of http://steamcommunity.com/id/Schwabba/inventory/.

Does anyone know how to fix this problem?

Thanks.


There are 2 best solutions below

URL fragments (the part of the URL after the hash) are not sent over HTTP; it is up to the browser to make sense of them. Usually they are read by JavaScript running on the page, as in this case, which then makes further AJAX calls to fetch the rest of the page.
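As a minimal sketch of the point above, the snippet below separates the part of a URL that is actually requested from the fragment that stays on the client. The helper name `splitFragment` is hypothetical and only for illustration; `strpos` and `substr` are standard PHP functions.

```php
<?php
// Hypothetical helper: split a URL into the part sent to the server
// and the fragment that only the client ever sees.
function splitFragment(string $url): array
{
    $pos = strpos($url, '#');
    if ($pos === false) {
        // No fragment present: the whole URL is requested.
        return [$url, null];
    }
    return [substr($url, 0, $pos), substr($url, $pos + 1)];
}

[$requestUrl, $fragment] = splitFragment('http://steamcommunity.com/id/Schwabba/inventory/#730');

echo $requestUrl, "\n"; // http://steamcommunity.com/id/Schwabba/inventory/
echo $fragment, "\n";   // 730
```

Only `$requestUrl` is something a server can respond to; what `$fragment` means has to be worked out by the client after the download.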


An unescaped # is the delimiter between the rest of the URL and the fragment. The fragment is not part of what is sent to the server, so it is not included in HTTP requests. A fragment only has meaning to the client, not the server. For example, when you type http://steamcommunity.com/id/Schwabba/inventory/#730 into a web browser, it requests http://steamcommunity.com/id/Schwabba/inventory/ and renders the result, and if the result is HTML, the browser jumps to the element whose id attribute is 730 (or an <a> tag whose name attribute is 730).

So it makes sense that file_get_contents() ignores the fragment: it is supposed to. You have to decide what to do with the fragment after you have downloaded the file. How a fragment is interpreted depends on the type of content being downloaded.
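For the plain HTML case, handling the fragment after the download could look roughly like the sketch below. It assumes the fragment names an element by id (or an <a> by name), that the dom extension is available, and it uses an inline HTML string in place of a real download. The function name `findFragmentTarget` is hypothetical. It would not help with pages that use the fragment from JavaScript to load more data, which appears to be the case for the Steam inventory page.

```php
<?php
// Hypothetical helper: find the element a fragment points at in downloaded HTML.
function findFragmentTarget(string $html, string $fragment): ?DOMElement
{
    $doc = new DOMDocument();

    // Real-world markup is often imperfect; collect parser warnings quietly.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($doc->getElementsByTagName('*') as $element) {
        // Any element whose id matches the fragment is a target.
        if ($element->getAttribute('id') === $fragment) {
            return $element;
        }
        // Older pages may use <a name="..."> anchors instead.
        if ($element->tagName === 'a' && $element->getAttribute('name') === $fragment) {
            return $element;
        }
    }
    return null;
}

// Stand-in for the body returned by file_get_contents() on the fragment-less URL.
$html = '<html><body><h2 id="intro">Intro</h2><p id="730">Target</p></body></html>';

$target = findFragmentTarget($html, '730');
echo $target !== null ? $target->textContent : 'not found', "\n"; // Target
```

The scan over all elements avoids building an XPath expression from the fragment, so no quoting of the fragment text is needed.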