I collect data using puppeteer and chrome-aws-lambda. I plan to deploy it to AWS Lambda, but while testing locally I get an error:
Error: Protocol error (Runtime.callFunctionOn): Target closed.
when I call waitForSelector.
I've read some posts mentioning that the Chrome process may get too little memory inside Docker. The question is: how do I give it more memory? I also read that disable-dev-shm-usage may help, but it doesn't. This is how I do it now (the error happens in the last statement):
const chromium = require('chrome-aws-lambda');

browser = await chromium.puppeteer.launch({
  args: [...chromium.args, `--proxy-server=${proxyUrl}`, '--disable-dev-shm-usage'],
  defaultViewport: chromium.defaultViewport,
  executablePath: await chromium.executablePath,
  headless: chromium.headless,
  ignoreHTTPSErrors: true,
});
const page = await browser.newPage();
await page.authenticate({ username, password });
await page.goto(MY_URL, { waitUntil: 'domcontentloaded' });
await page.click(SUBMIT_SELECTOR);
await page.waitForSelector('#myDiv')
  .then(() => console.log('got it'))
  .catch((e) => console.log('Error happens: ' + e));
UPDATE: more info on local setup:
I run it locally using sam local start-api.
Here is the content of my template.yaml (just a slightly updated hello-world template):
AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31
Description: >
  samnode
  Sample SAM Template for samnode
# More info about Globals: https://github.com/awslabs/serverless-application-model/blob/master/docs/globals.rst
Globals:
  Function:
    Timeout: 60
Resources:
  HelloWorldFunction:
    Type: AWS::Serverless::Function # More info about Function Resource: https://github.com/awslabs/serverless-application-model/blob/master/versions/2016-10-31.md#awsserverlessfunction
    Properties:
      CodeUri: hello-world/
      Handler: app.lambdaHandler
      Runtime: nodejs14.x
      MemorySize: 4096
      Layers:
        - !Sub 'arn:aws:lambda:${AWS::Region}:764866452798:layer:chrome-aws-lambda:22'
      Events:
        HelloWorld:
          Type: Api # More info about API Event Source: https://github.com/awslabs/serverless-application-model/blob/master/versions/2016-10-31.md#api
          Properties:
            Path: /hello
            Method: get
Outputs:
  # ServerlessRestApi is an implicit API created out of Events key under Serverless::Function
  # Find out more about other implicit resources you can reference within SAM
  # https://github.com/awslabs/serverless-application-model/blob/master/docs/internals/generated_resources.rst#api
  HelloWorldApi:
    Description: "API Gateway endpoint URL for Prod stage for Hello World function"
    Value: !Sub "https://${ServerlessRestApi}.execute-api.${AWS::Region}.amazonaws.com/Prod/hello/"
  HelloWorldFunction:
    Description: "Hello World Lambda Function ARN"
    Value: !GetAtt HelloWorldFunction.Arn
  HelloWorldFunctionIamRole:
    Description: "Implicit IAM Role created for Hello World function"
    Value: !GetAtt HelloWorldFunctionRole.Arn
You have already configured 4 GB of memory for the Lambda, which should be more than enough to load a couple of pages. If you still think memory is the issue, you can increase MemorySize up to 10240 (MB). I suspect the error is not related to memory.
To verify, you can do the following to check whether the Lambda is actually getting the specified memory.
Run the Lambda in eager mode (this keeps the Lambda running locally even if there are no active requests):
sam local start-api --warm-containers EAGER
Now run the following command to track the memory consumption:
docker stats
You can send a request to your local API now and track the memory consumption. If you see less than 4 GB of memory allocated to your Lambda function, then update the Docker resources and ensure you allocate appropriate memory to Docker.
Update Docker Resources (Increase memory)
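To cross-check what docker stats reports from inside the function itself, one option is to log the container's memory limit at the start of the handler. This is only a minimal sketch under assumptions: the helper name readContainerMemoryLimitMb is hypothetical, it assumes the usual Linux cgroup v1/v2 file locations, and it returns null when neither file is readable or when cgroup v2 reports no limit. AWS_LAMBDA_FUNCTION_MEMORY_SIZE is the environment variable Lambda sets to the configured MemorySize.

```javascript
const fs = require('fs');

// Usual cgroup locations for the memory limit (assumption: Linux container).
const CGROUP_LIMIT_FILES = [
  '/sys/fs/cgroup/memory.max',                   // cgroup v2
  '/sys/fs/cgroup/memory/memory.limit_in_bytes', // cgroup v1
];

// Hypothetical helper: returns the container memory limit in MB, or null if unknown.
// Note: cgroup v1 reports a very large number when no limit is set.
function readContainerMemoryLimitMb(files = CGROUP_LIMIT_FILES) {
  for (const file of files) {
    let raw;
    try {
      raw = fs.readFileSync(file, 'utf8').trim();
    } catch (err) {
      continue; // file missing or unreadable, try the next location
    }
    const bytes = Number(raw);
    // cgroup v2 writes "max" when no limit is set; Number("max") is NaN.
    if (Number.isFinite(bytes) && bytes > 0) {
      return Math.round(bytes / (1024 * 1024));
    }
  }
  return null;
}

// Example: log both values at the top of the Lambda handler.
console.log('Configured MB:', process.env.AWS_LAMBDA_FUNCTION_MEMORY_SIZE);
console.log('Container limit MB:', readContainerMemoryLimitMb());
```

If the container limit is clearly below the configured value, that would point at the Docker resource settings rather than at the template.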
Try out different versions of chrome-aws-lambda (maybe using a local layer with SAM). I would also run the same block of code locally using Puppeteer with headless mode disabled, and verify that the selector the code is waiting for is actually available.
Install the puppeteer dependency.
Use puppeteer instead of chrome-aws-lambda:
const puppeteer = require('puppeteer');
browser = await puppeteer.launch({headless: false});
Run the file with node <replace-with-your-file-name.js>, e.g. if the file name is somejsfile.js then the command would be
node somejsfile.js
Hope this helps you proceed further.
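For the local check with headless mode disabled, it may help to record what happens to the page while waiting for the selector. The following is a rough sketch, not a definitive fix: waitWithDiagnostics is a hypothetical helper name, and it relies only on the Page events 'close', 'error' and 'framenavigated' plus page.waitForSelector with a timeout option. MY_URL and SUBMIT_SELECTOR in the usage comment are the placeholders from the question.

```javascript
// Hypothetical helper: waits for a selector and collects page events seen meanwhile.
async function waitWithDiagnostics(page, selector, timeoutMs = 30000) {
  const events = [];
  // 'close' fires when the page is closed, 'error' when the page crashes,
  // 'framenavigated' when a frame navigates to a new URL.
  page.on('close', () => events.push('page closed'));
  page.on('error', (err) => events.push(`page crashed: ${err.message}`));
  page.on('framenavigated', (frame) => events.push(`navigated: ${frame.url()}`));
  try {
    await page.waitForSelector(selector, { timeout: timeoutMs });
    return { found: true, events };
  } catch (err) {
    // Return the failure instead of throwing so the caller can log everything at once.
    return { found: false, error: err.message, events };
  }
}

// Usage sketch inside an async function (requires `npm install puppeteer`):
//   const puppeteer = require('puppeteer');
//   const browser = await puppeteer.launch({ headless: false });
//   const page = await browser.newPage();
//   await page.goto(MY_URL, { waitUntil: 'domcontentloaded' });
//   await page.click(SUBMIT_SELECTOR);
//   console.log(await waitWithDiagnostics(page, '#myDiv'));
//   await browser.close();
```

The returned events list shows whether the page closed, crashed, or navigated before the selector appeared, which narrows down where to look next.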