I am running parsechecker on the URL https://www.nicobuyscars.com and get the following output:
Fetch failed with protocol status: exception(16), lastModified=0: Http code=403, url=https://www.nicobuyscars.com
What is the issue, and how can I solve it? I tried changing the agent name, but it did not work. Please help.
It looks like the server is blocking requests based on the User-Agent request header. The 403 is reproducible using another HTTP client (wget).
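For anyone who wants to run the same kind of check without wget, here is a minimal Python sketch that requests a URL with a chosen User-Agent value and reports the HTTP status code. The function names and the agent string used in the usage note are hypothetical, not taken from Nutch or from the original wget run, and this is meant for diagnosis only.

```python
import urllib.error
import urllib.request


def build_request(url, user_agent):
    """Build a GET request that carries the given User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch_status(url, user_agent, timeout=10.0):
    """Return the HTTP status code the server answers with.

    urllib raises HTTPError for 4xx/5xx responses, so the code is read
    from the exception in that case (e.g. 403 when the server refuses).
    """
    try:
        with urllib.request.urlopen(build_request(url, user_agent), timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code
```

Calling fetch_status("https://www.nicobuyscars.com", "MyNutchCrawler/1.0") and then again with a different agent string shows whether the status code depends on the header. If it does, the block is on the server side and not a Nutch bug.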
In any case, use polite settings for Nutch: a large fetcher.server.delay, keep respecting robots.txt, etc. It's very likely that the server implements other heuristics to detect and block bots.
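To make "polite" concrete, here is a small Python sketch of the two ideas mentioned: checking robots.txt before fetching, and never waiting less than the delay the site asks for. It is a generic illustration using the standard library, not Nutch's implementation. The sample robots.txt, the agent name, and the helper names are assumptions made up for the example.

```python
import urllib.robotparser

# Hypothetical robots.txt content, used only for illustration.
SAMPLE_ROBOTS = """\
User-agent: *
Crawl-delay: 10
Disallow: /private/
"""


def load_rules(robots_txt):
    """Parse robots.txt text into a RobotFileParser."""
    parser = urllib.robotparser.RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def effective_delay(parser, agent, configured_delay):
    """Use the larger of the configured delay and the site's Crawl-delay."""
    requested = parser.crawl_delay(agent)  # None if the site sets no delay
    return max(configured_delay, float(requested or 0))
```

With these rules, load_rules(SAMPLE_ROBOTS).can_fetch("MyNutchCrawler", url) tells the crawler whether a URL is allowed, and effective_delay gives the number of seconds to wait between requests to that host.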