Scrapy and Tor/Privoxy unable to crawl [Connection refused 61]


I have the following code, referenced from here, in my middlewares.py. With it I'm trying to change my IP through Tor on every request:

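import random

from stem import Signal
from stem.control import Controller


def _set_new_ip():
    # Ask Tor for a new identity (NEWNYM) over its control port.
    with Controller.from_port(port=9051) as controller:
        controller.authenticate(password='tor_password')
        controller.signal(Signal.NEWNYM)

class RandomUserAgentMiddleware(object):
    def process_request(self, request, spider):
        # USER_AGENT_LIST is a custom (non-built-in) setting, read via the spider's settings.
        ua = random.choice(spider.settings.get('USER_AGENT_LIST'))
        if ua:
            request.headers.setdefault('User-Agent', ua)

class ProxyMiddleware(object):
    def process_request(self, request, spider):
        # Request a new Tor identity, then route the request through Privoxy.
        _set_new_ip()
        request.meta['proxy'] = 'http://127.0.0.1:8118'
        spider.log('Proxy : %s' % request.meta['proxy'])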

But when I try to start the crawl in Scrapy, it keeps returning the following:

2017-09-10 22:36:44 [scrapy.extensions.telnet] DEBUG: Telnet console listening on 127.0.0.1:6023
2017-09-10 22:36:44 [stem] DEBUG: GETCONF __owningcontrollerprocess (runtime: 0.0004)
2017-09-10 22:36:44 [stem] INFO: Error while receiving a control message (SocketClosed): empty socket content
2017-09-10 22:36:44 [IT] DEBUG: Proxy : http://127.0.0.1:8118
2017-09-10 22:36:44 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET https://www.jobstreet.com.sg/en/job-search/job-vacancy.php?ojs=10&key=information%20technology> (failed 1 times): Connection was refused by other side: 61: Connection refused.
2017-09-10 22:36:44 [stem] DEBUG: GETCONF __owningcontrollerprocess (runtime: 0.0003)
2017-09-10 22:36:44 [stem] INFO: Error while receiving a control message (SocketClosed): empty socket content
2017-09-10 22:36:44 [IT] DEBUG: Proxy : http://127.0.0.1:8118
2017-09-10 22:36:52 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET https://www.jobstreet.com.sg/en/job-search/job-vacancy.php?ojs=10&key=information%20technology> (failed 2 times): Connection was refused by other side: 61: Connection refused.
2017-09-10 22:36:52 [stem] DEBUG: GETCONF __owningcontrollerprocess (runtime: 0.0004)
2017-09-10 22:36:52 [stem] INFO: Error while receiving a control message (SocketClosed): empty socket content
2017-09-10 22:36:52 [IT] DEBUG: Proxy : http://127.0.0.1:8118
2017-09-10 22:36:56 [scrapy.downloadermiddlewares.retry] DEBUG: Gave up retrying <GET https://www.jobstreet.com.sg/en/job-search/job-vacancy.php?ojs=10&key=information%20technology> (failed 3 times): Connection was refused by other side: 61: Connection refused.
2017-09-10 22:36:56 [scrapy.core.scraper] ERROR: Error downloading <GET https://www.jobstreet.com.sg/en/job-search/job-vacancy.php?ojs=10&key=information%20technology>: Connection was refused by other side: 61: Connection refused.
2017-09-10 22:36:56 [scrapy.core.engine] INFO: Closing spider (finished)
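
Since error 61 is "connection refused", which usually means nothing is accepting connections on the target port, one check that might help narrow this down is whether the two local ports used above are reachable at all. Below is a minimal sketch using only the standard library. It assumes Privoxy is expected on 127.0.0.1:8118 and the Tor control port on 127.0.0.1:9051, as in the middleware. The helper name `port_is_open` is made up for illustration and is not part of Scrapy or stem.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection raises OSError (e.g. ConnectionRefusedError)
        # when nothing is listening or the host cannot be reached.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Ports assumed from the middleware above: Privoxy (8118), Tor control (9051).
    for name, port in (("Privoxy", 8118), ("Tor control", 9051)):
        state = "open" if port_is_open("127.0.0.1", port) else "refused or unreachable"
        print("%s port %d: %s" % (name, port, state))
```

If the Privoxy port is reported as refused, the service is probably not running or is listening on a different address. This only tests TCP reachability, not whether Privoxy is actually forwarding to Tor.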