I have a basic Playwright scraping implementation, as follows:
    const scrape = async (browser, site) => {
      const context = await browser.newContext();
      const page = await context.newPage();
      await page.goto(site, { timeout: 300_000 });
      await page.waitForLoadState('load');
      // read page text, etc
      page && (await page.close());
      context && (await context.close());
      browser && (await browser.close());
    }
    const sites = ['https://google.com', 'https://bing.com'];
    for (const site of sites) {
      const browser = await playwright.chromium.connectOverCDP('my url');
      await scrape(browser, site);
    }
When I do this with only 1 site, everything works fine. When I add 2 or more sites to the list, then towards the end of any run past the first, I start to see Target Closed errors, e.g.
Error: page.goto: Target page, context or browser has been closed
Am I misunderstanding how to open/close these resources? My CDP URL points to a load balancer that distributes these ws:// requests, so I need to open a new CDP connection for each site and run the Playwright steps concurrently, isolated per site in the array. Is this possible? I suspect I am accidentally reusing a reference somewhere.
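To make the requirement concrete, here is a minimal sketch of the shape I am aiming for. It is not a confirmed fix for the error above. The names scrapeSite and scrapeAll are hypothetical, and connect is an assumed factory that I would pass in as `() => playwright.chromium.connectOverCDP('my url')`. Using page.title() is only a stand-in for the real "read page text" step.

```javascript
// Hypothetical helper: scrape one site over its own CDP connection.
// `connect` is an assumed factory returning a Playwright Browser,
// e.g. () => playwright.chromium.connectOverCDP('my url').
const scrapeSite = async (connect, site) => {
  const browser = await connect();
  try {
    const context = await browser.newContext();
    const page = await context.newPage();
    await page.goto(site, { timeout: 300_000 });
    await page.waitForLoadState('load');
    // Stand-in for "read page text, etc"
    return await page.title();
  } finally {
    // For a connected browser, close() clears the contexts created
    // through this connection and disconnects, even if a step above threw.
    await browser.close();
  }
};

// Hypothetical helper: start every site at once, one connection each.
// allSettled reports each site's outcome separately, so one failure
// does not reject the whole batch.
const scrapeAll = (connect, sites) =>
  Promise.allSettled(sites.map((site) => scrapeSite(connect, site)));
```

The idea is that each call owns the browser it opened and nothing is shared between sites, so closing one connection should not affect another. Each entry in the result has a status of 'fulfilled' or 'rejected', which would let me see which site failed.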
Any insight appreciated, thanks.