I recently added crawlee 3.0.2 to a project that has worked fine for months using a standard puppeteer setup (both with and without puppeteer-extra/stealth).
Everything runs fine when debugging and when running through npm start in VS Code, but once compiled I hit an issue:
Unhandled Exception - Error: Cannot find module 'puppeteer'
I've traced the issue and the error appears to come from:
@crawlee\puppeteer\internals\puppeteer-launcher.js - line 18:
const { launcher = browser_1.BrowserLauncher.requireLauncherOrThrow('puppeteer', 'apify/actor-node-puppeteer-chrome'), ...browserLauncherOptions }
I'm not sure how to get crawlee to see the puppeteer module installed in the project.
I'm not sure if this is relevant/related, but at one point I was trying to use WebpackObfuscator and unpacked puppeteer from the compiled ASAR during builds, with a line in the main package.json:
"build": {
  "asar": true,
  "asarUnpack": [
    "**\\*.{node,dll}",
    "node_modules/puppeteer/.local-chromium/**/*"
  ],
and
  "extraResources": [
    "./assets/**",
    "node_modules/puppeteer"
  ],
I gave up on the obfuscator but found that passing the executablePath for puppeteer worked well, so I left it as-is. (I can't remember now whether the obfuscator was the only reason I did this in the first place; I think it was so that I could distribute with a packaged Chromium.)
A couple of odd discrepancies I've noticed (I'm still fairly new to Electron/Node, so please forgive my ignorance):
The package.json for crawlee lists puppeteer as an optional peer dependency:
"peerDependencies": {
  "playwright": "^1.21.1",
  "puppeteer": ">= 9.x <= 14.x"
},
"peerDependenciesMeta": {
  "playwright": {
    "optional": true
  },
  "puppeteer": {
    "optional": true
  }
}
I was using a standalone puppeteer install well before adding crawlee, and I believe it was newer than 14.x, but it still ran fine at that version while building and testing.
I tried downgrading to puppeteer v14, which updated the main project package.json to
"puppeteer": "^14.4.1",
But that didn't resolve the issue either. However, running npm view puppeteer version shows 16.1.0, regardless of what package.json shows, so I can't make heads or tails of where the actual issue lies.
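On the version discrepancy: npm view puppeteer version queries the npm registry for the latest published release, not the copy installed in the project, so 16.1.0 there does not contradict the ^14.4.1 in package.json. npm ls puppeteer reports what is actually installed. The same check can be sketched from Node itself; the helper name installedVersion below is hypothetical, and the sketch assumes it is run from the project root.

```javascript
// Report which version of a package Node would actually load from a directory.
// `installedVersion` is a hypothetical helper, not part of npm or crawlee.
const fs = require('fs');
const path = require('path');

function installedVersion(packageName, fromDir = process.cwd()) {
  let entry;
  try {
    // Resolve the package the same way require() would, starting at fromDir.
    entry = require.resolve(packageName, { paths: [fromDir] });
  } catch (err) {
    // Not resolvable from here (the "Cannot find module" case).
    return null;
  }
  // Walk up from the resolved entry file to the package's own package.json.
  let dir = path.dirname(entry);
  while (true) {
    const manifestPath = path.join(dir, 'package.json');
    if (fs.existsSync(manifestPath)) {
      const manifest = JSON.parse(fs.readFileSync(manifestPath, 'utf8'));
      if (manifest.name === packageName) return manifest.version;
    }
    const parent = path.dirname(dir);
    if (parent === dir) return null; // reached the filesystem root
    dir = parent;
  }
}

console.log('puppeteer resolved as:', installedVersion('puppeteer'));
```

A null result from inside the packaged app would point to the module not being reachable at runtime rather than to a version mismatch.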
Any pointers on how to get crawlee to see the puppeteer package on compiled versions would be greatly appreciated.
I found a fix for this: mark the dependencies as externals in the webpack config, and list the node_modules path (relative to your package.json) under 'extraResources' in package.json.
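A minimal sketch of the webpack half of that fix, assuming a CommonJS webpack config for the Electron main process. The entry path, the example dependency list, and the helper name externalsFromDependencies are all hypothetical; only the externals and target options are webpack's own. The idea is that each external is left as a plain require() call at runtime instead of being bundled, which is why node_modules then has to ship with the app via extraResources.

```javascript
// webpack config sketch: keep runtime dependencies out of the bundle.
// `externalsFromDependencies` is a hypothetical helper for illustration.
function externalsFromDependencies(dependencies) {
  const externals = {};
  for (const name of Object.keys(dependencies)) {
    // "commonjs2 <name>" tells webpack to emit require('<name>') as-is.
    externals[name] = `commonjs2 ${name}`;
  }
  return externals;
}

// Assumed dependency list; in a real project this could come from
// require('./package.json').dependencies instead.
const exampleDependencies = {
  crawlee: '^3.0.2',
  puppeteer: '^14.4.1',
};

const webpackConfig = {
  target: 'electron-main',
  entry: './src/main.js', // hypothetical entry point
  externals: externalsFromDependencies(exampleDependencies),
};

module.exports = webpackConfig;
```

With this in place, crawlee's own require('puppeteer') is resolved from node_modules at runtime, so the packaged app needs that folder available.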