Application Request Routing: Why do some sites return HTTP 404 and some don't?

I want to add a load balancer to an existing ASP.NET project using Application Request Routing (ARR). So I familiarized myself with the concepts and created a local test setup:

  • IIS locally running on Windows 10:
    • Installed Application Request Routing 3.0 with the Web Platform Installer
    • Created server farm with following servers:
      1. <test-server-name>.de (Windows Server 2012 R2; hosts the ASP.NET project)
      2. www.google.com (just to see whether load balancing and URL rewriting work, because I don't have two test servers available)

URL Rewrite rule: [screenshot of the URL Rewrite rule]

After requesting localhost multiple times in any browser, I can see that load balancing (weighted round robin) is working fine: it alternates between servers 1 and 2.

The problem I'm facing is a 404 error on both websites.

I already tried the following:

  • Installing and enabling Failed Request Tracing Rules (on the local IIS): URL rewriting is working properly, I think.
    Failed Request Log for www.google.com: on Google Drive (unzip and open the XML in e.g. IE for a better view)

  • Creating the server farm without automatic creation of URL Rewrite rules (selecting No and creating my own URL Rewrite rule)

  • Changing the "Managed Pipeline Mode" setting of the Application Pool from Integrated to Classic

  • Health checks on other websites: I have absolutely no clue why it works for Git websites and why Facebook returns a 400 error code.

  • Enabling/disabling the proxy (IIS Manager -> Application Request Routing Cache -> Server Proxy Settings...)

I don't know what I could do next, so I appreciate any help. Thanks.

Best answer

The answer can be found here: https://forums.iis.net/t/1238739.aspx?Why+some+sites+return+HTTP+404+some+don+t+

Some websites simply don't accept localhost as a hostname, which is why a request that arrives with the hostname localhost can't be served (error 404), e.g. on google.com.

Detailed answer, in case the link above stops working:

That is not an effective test.

What you are doing is sending the hostname of your request on to the third-party servers, like Google.

So if your request is, say, http://example.com, you are sending that hostname to, say, www.google.com, and the Google servers will likely reject it, as you can see.

Web server admins generally don't accept traffic for domains they do not host.

If you sent a request to my server's IP with mysite.com, I too would likely reject it. (Things get complex if you have wildcard sites and you allow all traffic through.)

But simply seeing that 404 page from Google means your request hit their server, which implies ARR is working.
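The host-header check described above can be reproduced locally without involving any third-party server. The following is only a minimal sketch in Python, not how Google or IIS actually implement it: the hostname `www.example.test`, the `HOSTED_NAMES` set and all function names are made up for illustration. A tiny server answers 200 only for names it "hosts" and 404 for everything else, and a client connects to the same address twice while presenting different Host headers.

```python
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
import http.client
import threading

# Hypothetical set of hostnames this demo server is configured to serve.
HOSTED_NAMES = {"www.example.test"}


def status_for_host(host_header: str) -> int:
    """Return 200 for a hosted name, 404 for anything else (e.g. localhost)."""
    name = host_header.split(":")[0].lower()  # drop an optional :port suffix
    return 200 if name in HOSTED_NAMES else 404


class VirtualHostHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # The decision is based only on the Host header, not on the IP address.
        status = status_for_host(self.headers.get("Host", ""))
        body = b"ok\n" if status == 200 else b"unknown host\n"
        self.send_response(status)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # keep the demo output quiet
        pass


def fetch_status(port: int, host_header: str) -> int:
    """Connect to 127.0.0.1:port but present host_header as the Host."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        conn.request("GET", "/", headers={"Host": host_header})
        return conn.getresponse().status
    finally:
        conn.close()


def demo() -> dict:
    # Port 0 lets the operating system pick any free port.
    server = ThreadingHTTPServer(("127.0.0.1", 0), VirtualHostHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        port = server.server_address[1]
        return {name: fetch_status(port, name)
                for name in ("localhost", "www.example.test")}
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(demo())
```

Both requests reach the same server, yet only the one carrying a hosted name succeeds. This mirrors the situation in the question: the 404 shows that the request arrived, but with a hostname the target does not serve.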

If you really want to test it this way, add an entry to your local hosts file so that www.google.com resolves to your server's IP. Set up a site with www.google.com as the host header, and then you should see the correct info hitting Google. But there is no accounting for what third-party admins do on their side.
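On Windows the hosts file is `C:\Windows\System32\drivers\etc\hosts`, and an entry has the form `<your-server-ip> www.google.com`. Before testing through the browser, it may help to confirm that the override is actually in effect. The helper below is a small sketch for that check; `resolves_to` is a made-up name, and `192.0.2.10` is only a placeholder address from the documentation range, not a real server.

```python
import socket
from typing import Callable


def resolves_to(hostname: str, expected_ip: str,
                resolver: Callable[[str], str] = socket.gethostbyname) -> bool:
    """Return True if hostname currently resolves to expected_ip on this machine."""
    try:
        return resolver(hostname) == expected_ip
    except OSError:  # socket.gaierror (lookup failure) is a subclass of OSError
        return False


if __name__ == "__main__":
    # Placeholder IP: replace it with the address of the server running ARR.
    print(resolves_to("www.google.com", "192.0.2.10"))
```

The `resolver` parameter exists only so the function can be exercised without touching the real system resolver; in normal use the default `socket.gethostbyname` is enough.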