How to hide continuous hits (refreshes) to a website


I have written Python (Requests) and Java code to scrape data from a website. It works by continuously refreshing the page to pick up new data.
However, the website recently identified my scraper as an automated service and locked my account. Is there any way to hide these refreshes so that I can get new data without the account being locked?


There is 1 best solution below


It depends on the website. In any case, a scraper only simulates user behavior, so it can still be detected and blocked.
If the website detects timed tasks, one solution might be to randomize your application's refresh interval.
If the website presents a CAPTCHA, there is no easy solution.
If the website just counts visits from a particular IP address, you might set up a dynamic proxy server so that requests appear to come from other IPs.
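
As a rough illustration of the randomized refresh idea, here is a minimal Python sketch. The names `next_delay` and `poll`, the 50% jitter default, and the uniform distribution are all assumptions made for this example, not something the website or the Requests library prescribes. The fetch step is passed in as a callable so the timing logic stays independent of how the page is actually requested.

```python
import random
import time
from typing import Callable, Optional


def next_delay(base_seconds: float, jitter: float = 0.5,
               rng: Optional[random.Random] = None) -> float:
    """Return a delay drawn uniformly from base*(1-jitter) to base*(1+jitter)."""
    source = rng or random  # fall back to the module-level generator
    low = base_seconds * (1.0 - jitter)
    high = base_seconds * (1.0 + jitter)
    return source.uniform(low, high)


def poll(fetch: Callable[[], object], rounds: int, base_seconds: float,
         jitter: float = 0.5, sleep: Callable[[float], None] = time.sleep) -> list:
    """Call fetch() `rounds` times, sleeping a randomized interval between calls."""
    results = []
    for i in range(rounds):
        results.append(fetch())
        if i < rounds - 1:  # no need to wait after the last request
            sleep(next_delay(base_seconds, jitter))
    return results
```

With Requests, the fetch step might look like `poll(lambda: requests.get(url, proxies=proxies, timeout=10), rounds=5, base_seconds=60)`, where `url` and the `proxies` dictionary are placeholders you would supply yourself. The optional `proxies` argument is where a proxy server, as mentioned in the last point above, would be plugged in.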