pyppeteer-install behind proxy

I'm behind a corporate proxy.

I can get pip working by running set https_proxy=http://myproxy:port, so I can install pyppeteer.

But whatever I try, I can't get pyppeteer to download Chromium. I run pyppeteer-install and it just says it is downloading Chromium, but nothing ever gets put in the %appdata% pyppeteer location. Is there any way to fix this, beyond downloading Chromium manually and putting it in the correct spot?

There are 2 best solutions below

0
On BEST ANSWER

This is based on pyppeteer's own download_chromium code, but it uses urllib3.ProxyManager instead of urllib3.PoolManager so that the download goes through the proxy:

from io import BytesIO
import urllib3

from tqdm import tqdm
from pyppeteer import chromium_downloader  

def download_zip(url: str) -> BytesIO:
    """Download data with proxy from url."""
    print('Starting Chromium download. Download may take a few minutes.')

    with urllib3.ProxyManager(proxy_url='http://proxy-ip:port') as http:
        # Get data from url.
        # preload_content=False lets us stream the response body below.
        r = http.request('GET', url, preload_content=False)
        if r.status >= 400:
            raise OSError(f'Chromium downloadable not found at {url}: Received {r.data.decode()}.\n')

        _data = BytesIO()
        try:
            total_length = int(r.headers['content-length'])
        except (KeyError, ValueError, AttributeError):
            total_length = 0

        process_bar = tqdm(total=total_length, unit_scale=True, unit='b')
        # Read the body in chunks of 10 * 1024 bytes.
        for chunk in r.stream(10240):
            _data.write(chunk)
            process_bar.update(len(chunk))
        process_bar.close()

    print('Chromium download done.')
    return _data


def download_chromium() -> None:
    """Download and extract chromium."""
    chromium_downloader.extract_zip(download_zip(chromium_downloader.get_url()), chromium_downloader.DOWNLOADS_FOLDER / chromium_downloader.REVISION)
    print(f'Chrome executable path: {str(chromium_downloader.chromium_executable())}')

Then simply call download_chromium() in your program.

Note: Don't forget to replace http://proxy-ip:port with your corporate proxy.
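If you would rather not hard-code the proxy address, one option is to read it from the same https_proxy variable you already set for pip. The helper below is only a sketch: proxy_url_from_env is a hypothetical name (it is not part of pyppeteer or urllib3), and the lookup order of the variable names is an assumption.

```python
import os
from typing import Mapping, Optional


def proxy_url_from_env(environ: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the first non-empty proxy URL found in the environment, else None.

    Hypothetical helper: the HTTPS variants are checked first on the
    assumption that the Chromium archive is served over HTTPS.
    """
    for name in ('https_proxy', 'HTTPS_PROXY', 'http_proxy', 'HTTP_PROXY'):
        value = environ.get(name, '').strip()
        if value:
            return value
    return None
```

The result could then be passed along as urllib3.ProxyManager(proxy_url=...) in download_zip, falling back to a fixed address when it returns None.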

0
On

Pyppeteer uses urllib3 for its downloads, and urllib3 does not read proxy settings from environment variables, so setting https_proxy has no effect on pyppeteer-install.

You can download Chromium manually and then point pyppeteer at it with executablePath:

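import asyncio
from pyppeteer import launch

async def main():
    # await is only valid inside a coroutine.
    browser = await launch(executablePath='<path>')
    await browser.close()

asyncio.run(main())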
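Before handing a manually downloaded binary to launch, it may be worth checking that the path really points at a file, since a typo there is easy to miss. This is a minimal sketch using only the standard library; first_existing_executable is a hypothetical helper, and the candidate paths you pass in are your own.

```python
from pathlib import Path
from typing import Iterable, Optional


def first_existing_executable(candidates: Iterable[str]) -> Optional[Path]:
    """Return the first candidate path that is an existing file, else None."""
    for candidate in candidates:
        path = Path(candidate).expanduser()
        if path.is_file():
            return path
    return None
```

If it returns a path, str(path) can be passed as executablePath; if it returns None, none of the candidate locations contains the browser.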