Concurrent CDP Sessions in Playwright

I have a basic Playwright scrape implementation, as follows:

const scrape = async (browser, site) => {
  const context = await browser.newContext();
  const page = await context.newPage();
  await page.goto(site, { timeout: 300_000 });
  await page.waitForLoadState('load');
  // read page text, etc
  page && (await page.close());
  context && (await context.close());
  browser && (await browser.close());
}

const sites = ['https://google.com', 'https://bing.com'];
for (const site of sites) {
  const browser = await playwright.chromium.connectOverCDP('my url');
  await scrape(browser, site);
}

When I do this with only one site, everything works fine. When I add two or more sites to the list, I start to see Target Closed errors towards the end of every site past the first, e.g. Error: page.goto: Target page, context or browser has been closed

Am I misunderstanding how to open/close these resources? My CDP URL is a load balancer that distributes these ws:// requests, so my requirement is to open a new CDP connection for each site and run the Playwright steps concurrently, in isolation, for each site in the array. Is this possible? I feel I am stupidly reusing some reference or something here.

Any insight appreciated, thanks.
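
For reference, the requirement described in the question (one fresh connection per site, all sites processed concurrently, each closing only its own connection) could be sketched roughly as below. This is a minimal sketch under stated assumptions, not a confirmed fix for the Target Closed error: the helper names scrapeSite and scrapeAll and the injected connect function are hypothetical, and whether closing one connection affects the others depends on how the endpoint behind the load balancer handles disconnects, which the question does not state.

```javascript
// Hypothetical helper: scrape one site over its own CDP connection.
// `connect` is an assumed, caller-supplied function that opens a NEW
// connection each time it is called.
async function scrapeSite(connect, site) {
  const browser = await connect();
  try {
    const context = await browser.newContext();
    const page = await context.newPage();
    await page.goto(site, { timeout: 300_000 });
    await page.waitForLoadState('load');
    // Stand-in for "read page text, etc" from the question.
    return await page.innerText('body');
  } finally {
    // Close only this site's connection, even if a step above threw.
    await browser.close();
  }
}

// Hypothetical helper: start every site at once and wait for all of them.
// Promise.allSettled keeps one failing site from rejecting the whole batch.
async function scrapeAll(connect, sites) {
  return Promise.allSettled(sites.map((site) => scrapeSite(connect, site)));
}

// Possible usage with the values from the question:
//   const { chromium } = require('playwright');
//   const sites = ['https://google.com', 'https://bing.com'];
//   scrapeAll(() => chromium.connectOverCDP('my url'), sites)
//     .then((results) => console.log(results));
```

Passing connect in as a function, rather than a shared browser object, is a deliberate choice in this sketch: it makes it hard to accidentally reuse one connection across sites, and each entry in the result array reports "fulfilled" or "rejected" for its own site.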
