How to hide the continuous hit rates (Refresh) to a website

177 Views Asked by sam mathew At 11 June 2018 at 23:16

I have developed Python (Requests) and Java code to scrape data from a website. It works by continuously refreshing the site to check for new data.
However, the website recently identified my scraper as an automated service and locked my account. Is there any way to hide these refreshes so that I can keep getting new data without the account being locked?

There is 1 best solution below

Nicolò Gasparini On 14 June 2018 at 19:38

It depends on the website. In any case, a scraper only simulates user behavior, so it can still be detected and blocked.
If the website detects timed tasks, one solution might be to randomize your application's refresh interval.
If the website presents a CAPTCHA, there is no easy solution.
If the website just counts visits from a particular IP address, you might set up a dynamic proxy server so that requests appear to come from other IPs.
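
As a rough illustration of the randomized refresh idea, here is a minimal Python sketch. The names `next_delay` and `poll`, the `jitter_fraction` parameter, and the 50% default are assumptions made for this example; they are not from the question, and whether this is enough for any particular site is not something the sketch can promise.

```python
import random
import time


def next_delay(base_seconds, jitter_fraction=0.5, rng=random):
    """Return base_seconds varied randomly by up to +/- jitter_fraction."""
    low = base_seconds * (1 - jitter_fraction)
    high = base_seconds * (1 + jitter_fraction)
    return rng.uniform(low, high)


def poll(fetch, base_seconds, rounds, sleep=time.sleep, rng=random):
    """Call fetch() `rounds` times, waiting a randomized interval between calls.

    `fetch` is a hypothetical caller-supplied function that performs one refresh.
    """
    results = []
    for i in range(rounds):
        results.append(fetch())
        if i < rounds - 1:  # no need to wait after the final request
            sleep(next_delay(base_seconds, rng=rng))
    return results
```

With Requests, `fetch` could be something like `lambda: requests.get(url, timeout=10)`, where `url` stands for the page being refreshed. A larger `base_seconds` also lowers the overall hit rate, which may matter as much as the randomization itself.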
