I'm trying to run a fairly simple scraper, but I keep getting the error in the title. I want to scrape around 64,000 pages, but every run fails with a "no such file or directory" error from the request queue. Setting waitForAllRequestsToBeAdded to true doesn't fix it, and I get the same error even when I process only 1,000 pages. Changing the crawler options doesn't seem to help either.
The full error I'm getting is:
ERROR PlaywrightCrawler:AutoscaledPool: runTaskFunction failed.
Error: ENOENT: no such file or directory, open '/project/storage/request_queues/default/ehieKpeBY6Mf39n.json'
This is how I'm setting up and running the crawler:

import { PlaywrightCrawler, Configuration, Request } from 'crawlee';

// Crawler options: short timeouts, a few retries, 20 parallel browser pages.
const opts = {
    navigationTimeoutSecs: 3,
    requestHandlerTimeoutSecs: 3,
    maxRequestRetries: 6,
    maxConcurrency: 20,
};

const config = new Configuration({
    memoryMbytes: 8000,
});

const crawler = new PlaywrightCrawler(opts, config);
crawler.router.addDefaultHandler(handlePage);

// Build one Request per input record, carrying the record along as userData.
const requests = data.map((d) => new Request({ url: d.url, userData: d }));

await crawler.run(requests);
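Since the failures happen even at smaller counts, I've also considered feeding the queue in batches instead of passing everything to run() at once. A rough sketch of what I mean is below; the chunk helper and the batch size of 1,000 are just illustrative, and the commented-out loop assumes Crawlee's crawler.addRequests(), which accepts a waitForAllRequestsToBeAdded option:

```javascript
// Illustrative helper (not part of Crawlee): split an array into fixed-size
// batches so the request queue can be fed incrementally instead of all at once.
function chunk(items, size) {
    const batches = [];
    for (let i = 0; i < items.length; i += size) {
        batches.push(items.slice(i, i + size));
    }
    return batches;
}

// Quick sanity check: 10 dummy requests in batches of 4 give sizes 4, 4, 2.
const dummy = Array.from({ length: 10 }, (_, i) => ({ url: `https://example.com/${i}` }));
const sizes = chunk(dummy, 4).map((b) => b.length);
console.log(sizes.join(',')); // prints "4,4,2"

// In the real run, each batch would then be enqueued before crawling, e.g.:
// for (const batch of chunk(requests, 1000)) {
//     await crawler.addRequests(batch, { waitForAllRequestsToBeAdded: true });
// }
// await crawler.run();
```

I haven't been able to confirm whether batching like this avoids the ENOENT, since the error shows up even with small inputs.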
How can I get the crawler to run without the request queue failing?