I'm having trouble using Python's urllib.
Here is the code I have tried:
import urllib
s = urllib.urlopen("https://www.mci.ir/web/guest/login")
And here is the error I am seeing:
Traceback (most recent call last):
  File "<pyshell#3>", line 1, in <module>
    s = urllib.urlopen("https://www.mci.ir/web/guest/login")
  File "C:\Python27\lib\urllib.py", line 86, in urlopen
    return opener.open(url)
  File "C:\Python27\lib\urllib.py", line 207, in open
    return getattr(self, name)(url)
  File "C:\Python27\lib\urllib.py", line 450, in open_https
    return self.http_error(url, fp, errcode, errmsg, headers)
  File "C:\Python27\lib\urllib.py", line 371, in http_error
    result = method(url, fp, errcode, errmsg, headers)
  File "C:\Python27\lib\urllib.py", line 634, in http_error_302
    data)
  File "C:\Python27\lib\urllib.py", line 660, in redirect_internal
    return self.open(newurl)
  File "C:\Python27\lib\urllib.py", line 207, in open
    return getattr(self, name)(url)
  File "C:\Python27\lib\urllib.py", line 436, in open_https
    h.endheaders(data)
  File "C:\Python27\lib\httplib.py", line 954, in endheaders
    self._send_output(message_body)
  File "C:\Python27\lib\httplib.py", line 814, in _send_output
    self.send(msg)
  File "C:\Python27\lib\httplib.py", line 776, in send
    self.connect()
  File "C:\Python27\lib\httplib.py", line 1161, in connect
    self.sock = ssl.wrap_socket(sock, self.key_file, self.cert_file)
  File "C:\Python27\lib\ssl.py", line 381, in wrap_socket
    ciphers=ciphers)
  File "C:\Python27\lib\ssl.py", line 143, in __init__
    self.do_handshake()
  File "C:\Python27\lib\ssl.py", line 305, in do_handshake
    self._sslobj.do_handshake()
IOError: [Errno socket error] [Errno 8] _ssl.c:504: EOF occurred in violation of protocol
The remote server does not seem to like the User-Agent header sent by urllib.urlopen() and urllib2.urlopen() (Python 2), or by urllib.request.urlopen() (Python 3). It closes the connection, which surfaces as the EOF error in the traceback.
Issuing the same request with the requests package does work. Setting the User-Agent in requests to the value used by urllib/urllib2 is a way to check that the header is what the server objects to.
My advice is to use requests, as it is a much better library. However, if you must use the standard library, use urllib2 and set a User-Agent header that is acceptable to the remote server.
One other thing worth noting: once the remote server receives a request that it doesn't like (e.g. one with an unaccepted user agent), it blocks requests from the originating IP address until there has been a period of time with no requests (or the block might last for a random period).
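As a rough sketch of the standard-library route described above, the snippet below builds a request with an explicit User-Agent header. The helper names build_request and fetch are made up for illustration, and the value "Mozilla/5.0" is only an assumption: the answer does not say which User-Agent values this server accepts, so you may need to try a different one. The import falls back to urllib2 so the same code should run on Python 2 and Python 3.

```python
try:
    from urllib.request import Request, urlopen  # Python 3
except ImportError:
    from urllib2 import Request, urlopen  # Python 2

URL = "https://www.mci.ir/web/guest/login"

# Assumption: a browser-like value. The server's accepted values are unknown.
USER_AGENT = "Mozilla/5.0"


def build_request(url, user_agent=USER_AGENT):
    """Return a Request that carries an explicit User-Agent header."""
    return Request(url, headers={"User-Agent": user_agent})


def fetch(url, user_agent=USER_AGENT):
    """Open the URL with the custom User-Agent and return the response body."""
    response = urlopen(build_request(url, user_agent))
    try:
        return response.read()
    finally:
        response.close()
```

Calling fetch(URL) performs the actual request. With requests, the equivalent is to pass headers={"User-Agent": ...} to requests.get(). If the server has already blocked your IP address after an earlier rejected request, this may keep failing until the block expires.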