Python urllib2.urlopen returns an HTTP error 503


Here is my code snippet. It worked until three days ago and now fails. I am running Python 2.6.5 on Ubuntu 10.04.4 LTS.

#!/usr/bin/env python
import urllib2 as ur
...
webpage = []

site = "http://www.gametracker.com/server_info/94.250.218.247:25200/top_players/"
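hdr = {'User-Agent': 'Mozilla/5.0'}  # send a browser-like User-Agent
req = ur.Request(site, headers=hdr)
data = ur.urlopen(req)  # this call raises the HTTPError shown below
for line in data:
    # split each line of the response on commas
    line = line.split(",")
    webpage.append(line)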
...

Here is the error message that is returned:

Traceback (most recent call last):
  File "read_top5.py", line 21, in <module>
    data = ur.urlopen(req)
  File "/usr/lib/python2.6/urllib2.py", line 126, in urlopen
    return _opener.open(url, data, timeout)
  File "/usr/lib/python2.6/urllib2.py", line 397, in open
    response = meth(req, response)
  File "/usr/lib/python2.6/urllib2.py", line 510, in http_response
    'http', request, response, code, msg, hdrs)
  File "/usr/lib/python2.6/urllib2.py", line 435, in error
    return self._call_chain(*args)
  File "/usr/lib/python2.6/urllib2.py", line 369, in _call_chain
    result = func(*args)
  File "/usr/lib/python2.6/urllib2.py", line 518, in http_error_default
    raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
urllib2.HTTPError: HTTP Error 503: Service Temporarily Unavailable

There are 2 answers below.

Best answer:

The service is not currently working for non-browser clients. The same request made with curl:

curl -i "http://www.gametracker.com/server_info/94.250.218.247:25200/top_players/"

also returns a 503:

HTTP/1.1 503 Service Temporarily Unavailable
Date: Mon, 08 Dec 2014 09:37:17 GMT
Content-Type: text/html; charset=UTF-8
Server: cloudflare-nginx

The service is using CloudFlare, which provides a form of DDoS protection that requires you to use a full web browser to connect.

Although you could likely work around it, by choosing to enable this protection the site operators are declaring that they don't want you to connect using a script.

This is not a programming problem; you'll need to determine why the service is not available to scripts.
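
The same check can be made from Python instead of curl. urllib2 raises HTTPError for the 503, but the exception object also carries the status code, the headers, and the body, so they can be inspected. The following is a minimal sketch, not a fix: summarize_http_error and fetch_or_summarize are hypothetical helper names, and the import fallback is there only so the sketch also runs on Python 3.

```python
try:                                   # Python 2, as in the question
    from urllib2 import HTTPError, Request, urlopen
except ImportError:                    # Python 3 equivalent
    from urllib.error import HTTPError
    from urllib.request import Request, urlopen


def summarize_http_error(err, max_body=200):
    """Return (status code, Server header, first bytes of body) for an HTTPError."""
    server = err.headers.get("Server", "")
    body = err.read(max_body)          # HTTPError is also a file-like response
    return err.code, server, body


def fetch_or_summarize(url):
    """Hypothetical helper: return the page body, or a summary of the HTTP error."""
    req = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    try:
        return urlopen(req).read()
    except HTTPError as err:
        return summarize_http_error(err)
```

Calling fetch_or_summarize with the URL from the question would be expected to show the same 503 status and cloudflare-nginx Server header that curl reports, for as long as the site keeps this protection enabled.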

Second answer:

This is just something the site does. It appears to be part of some kind of anti-DDoS system. Why it uses a 503 status for this is perplexing, but the response is definitely coming from the site itself and not from anything in your code.

I tried the curl command Joe has above, and this is the response I get back:

HTTP/1.1 503 Service Temporarily Unavailable
Date: Mon, 08 Dec 2014 09:47:41 GMT
Content-Type: text/html; charset=UTF-8
Transfer-Encoding: chunked
Connection: keep-alive
Set-Cookie: __cfduid=d32f001037fafc1363bf86d29be0baf921418032061; expires=Tue, 08-Dec-15 09:47:41 GMT; path=/; domain=.gametracker.com; HttpOnly
X-Frame-Options: SAMEORIGIN
Cache-Control: no-cache
Server: cloudflare-nginx
CF-RAY: 19580b02d7c70f21-IAD

<!DOCTYPE HTML>
<html lang="en-US">
<head>
  <meta charset="UTF-8" />
  <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
  <meta http-equiv="X-UA-Compatible" content="IE=Edge,chrome=1" />
  <meta name="robots" content="noindex, nofollow" />
  <meta name="viewport" content="width=device-width, initial-scale=1, maximum-scale=1" />
  <title>Just a moment...</title>
  <style type="text/css">
    html, body {width: 100%; height: 100%; margin: 0; padding: 0;}
    body {background-color: #ffffff; font-family: Helvetica, Arial, sans-serif; font-size: 100%;}
    h1 {font-size: 1.5em; color: #404040; text-align: center;}
    p {font-size: 1em; color: #404040; text-align: center; margin: 10px 0 0 0;}
    #spinner {margin: 0 auto 30px auto; display: block;}
    .attribution {margin-top: 20px;}
  </style>

    <script type="text/javascript">
  //<![CDATA[
  (function(){
    var a = function() {try{return !!window.addEventListener} catch(e) {return !1} },
    b = function(b, c) {a() ? document.addEventListener("DOMContentLoaded", b, c) : document.attachEvent("onreadystatechange", b)};
    b(function(){
      var a = document.getElementById('cf-content');a.style.display = 'block';
      setTimeout(function(){
        var t,r,a,f, sdDUenl={"xRvHG":+((!+[]+!![]+[])+(!+[]+!![]+!![]+!![]+!![]+!![]+!![]+!![]))};
        t = document.createElement('div');
        t.innerHTML="<a href='/'>x</a>";
        t = t.firstChild.href;r = t.match(/https?:\/\//)[0];
        t = t.substr(r.length); t = t.substr(0,t.length-1);
        a = document.getElementById('jschl-answer');
        f = document.getElementById('challenge-form');
        ;sdDUenl.xRvHG*=+((!+[]+!![]+!![]+[])+(!+[]+!![]+!![]+!![]));sdDUenl.xRvHG-=+((!+[]+!![]+[])+(!+[]+!![]+!![]+!![]+!![]));sdDUenl.xRvHG+=+((!+[]+!![]+!![]+[])+(!+[]+!![]+!![]+!![]));sdDUenl.xRvHG*=+((!+[]+!![]+!![]+!![]+[])+(!+[]+!![]+!![]+!![]+!![]+!![]));sdDUenl.xRvHG-=+((!+[]+!![]+!![]+!![]+!![]+[])+(+[]));sdDUenl.xRvHG-=+((!+[]+!![]+[])+(!+[]+!![]+!![]+!![]+!![]));sdDUenl.xRvHG*=+((!+[]+!![]+[])+(!+[]+!![]+!![]+!![]));sdDUenl.xRvHG-=+((!+[]+!![]+!![]+!![]+[])+(!+[]+!![]+!![]));sdDUenl.xRvHG*=+((+!![]+[])+(!+[]+!![]+!![]+!![]+!![]+!![]+!![]));sdDUenl.xRvHG+=+((!+[]+!![]+!![]+!![]+[])+(!+[]+!![]+!![]+!![]+!![]+!![]+!![]+!![]));a.value = parseInt(sdDUenl.xRvHG, 10) + t.length;
        f.submit();
      }, 5850);
    }, false);
  })();
  //]]>
</script>


</head>
<body>
  <table width="100%" height="100%" cellpadding="20">
    <tr>
      <td align="center" valign="middle">
          <div class="cf-browser-verification cf-im-under-attack">
  <noscript><h1 data-translate="turn_on_js" style="color:#bd2426;">Please turn JavaScript on and reload the page.</h1></noscript>
  <div id="cf-content" style="display:none">
    <img id="spinner" src="/cdn-cgi/images/spinner-2013.gif" />
    <h1><span data-translate="checking_browser">Checking your browser before accessing</span> gametracker.com.</h1>
    <p data-translate="process_is_automatic">This process is automatic. Your browser will redirect to your requested content shortly.</p>
    <p data-translate="allow_5_secs">Please allow up to 5 seconds&hellip;</p>
  </div>
  <form id="challenge-form" action="/cdn-cgi/l/chk_jschl" method="get">
    <input type="hidden" name="jschl_vc" value="3cecd7cab5d69708a3b1081e462824d0"/>
    <input type="hidden" id="jschl-answer" name="jschl_answer"/>
  </form>
</div>


          <div class="attribution"><a href="http://www.cloudflare.com/" target="_blank" style="font-size: 12px;">DDoS protection by CloudFlare</a></div>
      </td>
    </tr>
  </table>
</body>
</html>

Note that the response has a body even though the status code is 503. This is consistent with what I saw when visiting the page in a browser: first I was sent to the "anti-DDoS" page shown in the response above, and then I was automatically redirected to the page requested in the URL. The redirect is done in JavaScript: the script in the page computes a value, stores it in the hidden jschl_answer field, and submits the challenge form after a delay of about 5.85 seconds. This explains why the request doesn't behave as you expect outside your browser; a Python web request won't execute JavaScript, so it never gets past the challenge page.
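
A script can at least recognise this situation and report it clearly instead of failing with a bare 503. The sketch below is a heuristic that relies only on markers visible in the response above (the 503 status, the cloudflare-nginx Server header, and the challenge form names); looks_like_cloudflare_challenge is a hypothetical helper name, and the markers are an assumption that may stop matching if CloudFlare changes its page.

```python
def looks_like_cloudflare_challenge(status, headers, body):
    """Heuristic: does this response look like the challenge page shown above?

    status  -- integer HTTP status code
    headers -- mapping with a .get() method (a dict or an HTTP message object)
    body    -- response body as text
    """
    server = headers.get("Server", "").lower()
    # Identifiers taken from the HTML of the challenge page in this answer.
    markers = ("jschl_vc", "challenge-form", "cf-browser-verification")
    return (
        status == 503
        and "cloudflare" in server
        and any(marker in body for marker in markers)
    )
```

When the 503 is caught as an HTTPError, the status, headers, and decoded body of that exception can be passed to this function to tell the challenge page apart from a real outage.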

So it's definitely the service. You'll have to consult the people who made it to find out why and how they expect you to deal with it. You may want to look into whether they have a different endpoint for API calls, or the endpoint might respond differently if you set the Accept header. (application/json can be used to indicate you want JSON back.)
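
Setting the Accept header with urllib2 could look like the sketch below. build_json_request is a hypothetical helper name, and whether the site answers such a request any differently is an assumption that would need to be confirmed with its operators; the import fallback is there only so the sketch also runs on Python 3.

```python
try:
    from urllib2 import Request          # Python 2, as in the question
except ImportError:
    from urllib.request import Request   # Python 3 equivalent


def build_json_request(url, user_agent="Mozilla/5.0"):
    """Build a request that asks for JSON via the Accept header."""
    headers = {
        "User-Agent": user_agent,
        "Accept": "application/json",    # signals that JSON is preferred
    }
    return Request(url, headers=headers)
```

The returned object can be passed to urlopen exactly like the req in the question.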