Getting errors while using Python urllib


I'm having trouble using Python's urllib.

Here is the code I have tried:

import urllib
s = urllib.urlopen("https://www.mci.ir/web/guest/login")

And here is the error I am seeing:

Traceback (most recent call last):
  File "<pyshell#3>", line 1, in <module>
    s = urllib.urlopen("https://www.mci.ir/web/guest/login")
  File "C:\Python27\lib\urllib.py", line 86, in urlopen
    return opener.open(url)
  File "C:\Python27\lib\urllib.py", line 207, in open
    return getattr(self, name)(url)
  File "C:\Python27\lib\urllib.py", line 450, in open_https
    return self.http_error(url, fp, errcode, errmsg, headers)
  File "C:\Python27\lib\urllib.py", line 371, in http_error
    result = method(url, fp, errcode, errmsg, headers)
  File "C:\Python27\lib\urllib.py", line 634, in http_error_302
    data)
  File "C:\Python27\lib\urllib.py", line 660, in redirect_internal
    return self.open(newurl)
  File "C:\Python27\lib\urllib.py", line 207, in open
    return getattr(self, name)(url)
  File "C:\Python27\lib\urllib.py", line 436, in open_https
    h.endheaders(data)
  File "C:\Python27\lib\httplib.py", line 954, in endheaders
    self._send_output(message_body)
  File "C:\Python27\lib\httplib.py", line 814, in _send_output
    self.send(msg)
  File "C:\Python27\lib\httplib.py", line 776, in send
    self.connect()
  File "C:\Python27\lib\httplib.py", line 1161, in connect
    self.sock = ssl.wrap_socket(sock, self.key_file, self.cert_file)
  File "C:\Python27\lib\ssl.py", line 381, in wrap_socket
    ciphers=ciphers)
  File "C:\Python27\lib\ssl.py", line 143, in __init__
    self.do_handshake()
  File "C:\Python27\lib\ssl.py", line 305, in do_handshake
    self._sslobj.do_handshake()
IOError: [Errno socket error] [Errno 8] _ssl.c:504: EOF occurred in violation of protocol

There are 2 best solutions below

0
On BEST ANSWER

The remote server does not seem to like the default User-Agent header sent by urllib.urlopen() and urllib2.urlopen() (Python 2) or by urllib.request.urlopen() (Python 3), and it closes the connection.
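To see which User-Agent the standard library sends by default, you can inspect the default opener. This is a minimal Python 3 sketch; the helper name default_user_agent is made up for illustration and is not part of urllib.

```python
import urllib.request


def default_user_agent():
    """Return the User-Agent value that urllib's default opener attaches."""
    opener = urllib.request.build_opener()
    # opener.addheaders is a list of (name, value) tuples added to each request.
    headers = dict(opener.addheaders)
    return headers.get("User-agent")


# Prints a string of the form "Python-urllib/<version>".
print(default_user_agent())
```

No network access is needed for this check, so it is safe to run even if the server is currently refusing your connections.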

Issuing a request with the requests package does work:

>>> import requests
>>> r = requests.get('https://www.mci.ir/web/guest/login')
>>> r
<Response [200]>

Setting the User-Agent to the one used by urllib/urllib2 reproduces the error:

>>> r = requests.get('https://www.mci.ir/web/guest/login', headers={'User-Agent': 'Python-urllib/2.7'})
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/mhawke/virtualenvs/py2/lib/python2.7/site-packages/requests/api.py", line 69, in get
    return request('get', url, params=params, **kwargs)
  File "/home/mhawke/virtualenvs/py2/lib/python2.7/site-packages/requests/api.py", line 50, in request
    response = session.request(method=method, url=url, **kwargs)
  File "/home/mhawke/virtualenvs/py2/lib/python2.7/site-packages/requests/sessions.py", line 465, in request
    resp = self.send(prep, **send_kwargs)
  File "/home/mhawke/virtualenvs/py2/lib/python2.7/site-packages/requests/sessions.py", line 594, in send
    history = [resp for resp in gen] if allow_redirects else []
  File "/home/mhawke/virtualenvs/py2/lib/python2.7/site-packages/requests/sessions.py", line 196, in resolve_redirects
    **adapter_kwargs
  File "/home/mhawke/virtualenvs/py2/lib/python2.7/site-packages/requests/sessions.py", line 573, in send
    r = adapter.send(request, **kwargs)
  File "/home/mhawke/virtualenvs/py2/lib/python2.7/site-packages/requests/adapters.py", line 431, in send
    raise SSLError(e, request=request)
requests.exceptions.SSLError: EOF occurred in violation of protocol (_ssl.c:590)

My advice is to use requests, as it is a much better library. However, if you must use the standard library, use urllib2 and set a User-Agent header that the remote server accepts:

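import urllib2

# Build the request explicitly so a custom header can be attached.
req = urllib2.Request('https://www.mci.ir/web/guest/login')
# Replace the default "Python-urllib/2.7" User-Agent with a browser-like one.
req.add_header('User-Agent', 'Mozilla/5.0 (X11; Fedora; Linux x86_64; rv:38.0) Gecko/20100101 Firefox/38.0')
r = urllib2.urlopen(req)
html = r.read()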
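For Python 3, where urllib2 no longer exists, roughly the same approach can be written with urllib.request. This is only a sketch: the names BROWSER_UA, build_request and fetch are made up for illustration, and the 10-second timeout is an arbitrary assumption.

```python
import urllib.request

# Same browser-like User-Agent string as in the urllib2 example.
BROWSER_UA = ("Mozilla/5.0 (X11; Fedora; Linux x86_64; rv:38.0) "
              "Gecko/20100101 Firefox/38.0")


def build_request(url, user_agent=BROWSER_UA):
    """Create a Request that carries a custom User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch(url, timeout=10):
    """Open the URL with the custom User-Agent and return the body as bytes."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as resp:
        return resp.read()
```

Calling fetch('https://www.mci.ir/web/guest/login') would then return the page body, assuming the server accepts that User-Agent and is not currently blocking your IP address.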

One other thing worth noting: once the remote server receives a request that it doesn't like (e.g. one with an unaccepted user agent), it appears to block requests from the originating IP address until some period has passed with no requests (or possibly for a random period).
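If you do get blocked while experimenting, one way to cope is to pause between attempts instead of retrying immediately. The sketch below is Python 3 and entirely hypothetical: fetch_with_cooldown is an invented helper, the 60-second cooldown is a guess (the server's actual block duration is unknown), and the fetch argument is any callable that takes a URL and returns the response body.

```python
import time


def fetch_with_cooldown(fetch, url, attempts=3, cooldown=60.0, sleep=time.sleep):
    """Call fetch(url), pausing for `cooldown` seconds after each failure."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return fetch(url)
        except OSError as exc:
            # In Python 3, urllib.error.URLError and ssl.SSLError both
            # derive from OSError, so handshake failures land here.
            last_error = exc
            if attempt < attempts - 1:
                sleep(cooldown)  # stay quiet so the server-side block can lapse
    raise last_error
```

The sleep parameter is injectable only so the helper can be exercised without actually waiting; in normal use you would leave it at its default.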

0
On

I was facing the same problem and fixed it by switching to Python 3.

  File "/usr/lib/python2.7/ssl.py", line 830, in do_handshake
    self._sslobj.do_handshake()
IOError: [Errno socket error] EOF occurred in violation of protocol (_ssl.c:590)