How to scrape protected sites using Puppeteer and JS


I am trying to make a bot that can scrape any site, but on some sites I run into problems. For now I just open the browser in headless: false mode and then navigate myself, but I still run into problems, so I think the site may be detecting my browser fingerprint.

I have tried a couple of different sets of launch options, which is why there are multiple option variables and only one of them is used.

Here is my current code:

const puppeteer = require("puppeteer-extra");
const pluginStealth = require("puppeteer-extra-plugin-stealth");
const Ua = require("puppeteer-extra-plugin-anonymize-ua");

puppeteer.use(pluginStealth());

puppeteer.use(Ua());

let browser, page;

function log(msg){
    console.log(msg);
}

function delay(time) {
    return new Promise((resolve) => {
        setTimeout(resolve, time);
    });
}

async function openBrowser(){
    if (!browser){

        // Option set 1: reuse an existing Chrome profile so the site sees real cookies/history.
        // Note: userDataDir must point at the "User Data" folder itself; the profile
        // inside it is selected with --profile-directory.
        const options1 = {
            headless: false, 
            executablePath: "C:/Program Files/Google/Chrome/Application/chrome.exe",
            args: ['--profile-directory=Person 1'],
            userDataDir: "C:\\Users\\berti\\AppData\\Local\\Google\\Chrome\\User Data"
        };

        // Option set 2: plain launch with the automation banner disabled.
        const options2 = {
            args: ['--start-maximized', '--disable-gpu', '--disable-infobars', '--disable-extensions', '--ignore-certificate-errors'],
            headless: false,
            ignoreDefaultArgs: ['--enable-automation'], // hides the "controlled by automated software" infobar
            executablePath: "C:/Program Files/Google/Chrome/Application/chrome.exe",
            defaultViewport: null,
        };
        browser = await puppeteer.launch(options2);
        await delay(Math.random() * 1000); // random pause of up to 1 s before opening a page
        page = await browser.newPage(); 
        log("New browser has been booted up");
    } else {
        log("Browser alleready in existience");
    };
}

One of the tests I do is to go to Nike and try to add a shoe to the cart, but it won't let me.

1 Answer

To improve the success rate of your web scraping bot and avoid detection, you can try the following techniques:

User Agent Rotation: Use a library or plugin to rotate and randomize the User Agent string to make your bot appear more like a regular browser.
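
For example, a minimal sketch that picks a random User Agent per page (the UA strings below are placeholders you would swap for current, real ones; page.setUserAgent is a standard Puppeteer API):

// Small pool of User Agent strings; replace these placeholders with current, real ones.
const USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"
];

async function newPageWithRandomUa(browser) {
    const page = await browser.newPage();
    // Pick one at random so consecutive sessions don't share the same UA fingerprint.
    const ua = USER_AGENTS[Math.floor(Math.random() * USER_AGENTS.length)];
    await page.setUserAgent(ua);
    return page;
}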

JavaScript Rendering: Ensure that the headless browser executes JavaScript properly, as many modern websites rely on it for functionality and content rendering.
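
In Puppeteer this mostly means waiting until client-side JavaScript has actually run before you interact with the page; a sketch along these lines (the #content selector is a placeholder):

async function gotoAndRender(page, url) {
    // networkidle2 waits until there are at most 2 in-flight network requests,
    // which gives client-side rendering time to finish.
    await page.goto(url, { waitUntil: "networkidle2", timeout: 60000 });
    // Prefer waiting for a concrete element over a fixed sleep.
    await page.waitForSelector("#content", { visible: true });
}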

Rate Limiting and Delay: Introduce random delays between your requests to avoid triggering rate-limiting mechanisms. Mimic human-like behavior in your bot.
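
You already have a delay() helper; a sketch of jittered pauses built on top of it (the selector in the usage comment is a placeholder):

// Random pause between minMs and maxMs, reusing the delay() helper from the question.
async function humanPause(minMs = 500, maxMs = 2000) {
    await delay(minMs + Math.random() * (maxMs - minMs));
}

// Usage between actions:
// await page.click(".add-to-cart"); // placeholder selector
// await humanPause();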

IP Rotation and Proxying: Use a pool of rotating IP addresses or proxies to prevent IP-based blocking.
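
Puppeteer accepts a proxy through Chrome's --proxy-server launch flag; a minimal sketch assuming a hypothetical list of proxy URLs:

const PROXIES = [
    "http://proxy1.example.com:8000", // placeholder proxy URLs
    "http://proxy2.example.com:8000"
];

async function launchWithRandomProxy() {
    const proxy = PROXIES[Math.floor(Math.random() * PROXIES.length)];
    const browser = await puppeteer.launch({
        headless: false,
        args: [`--proxy-server=${proxy}`]
    });
    // If the proxy requires credentials, authenticate per page:
    // const page = await browser.newPage();
    // await page.authenticate({ username: "user", password: "pass" }); // placeholders
    return browser;
}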

CAPTCHA Solving: Implement CAPTCHA-solving services or libraries to handle CAPTCHAs programmatically.
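
For example, puppeteer-extra has a reCAPTCHA plugin that forwards challenges to a paid solver such as 2Captcha; a sketch assuming you have an API token (the token below is a placeholder):

const RecaptchaPlugin = require("puppeteer-extra-plugin-recaptcha");

puppeteer.use(
    RecaptchaPlugin({
        provider: { id: "2captcha", token: "YOUR_2CAPTCHA_TOKEN" } // placeholder token
    })
);

// After navigating to a page that shows a reCAPTCHA:
// const { solved } = await page.solveRecaptchas();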