Issue with Docker Image for Custom Deployment on Zyte (formerly Scrapinghub) Platform


I am currently facing an issue while trying to create a Docker image for custom deployment on the Zyte platform (formerly known as Scrapinghub). My goal is to set up a Python environment with Scrapy, Playwright, Twisted, and other necessary dependencies.

Here is my Dockerfile:

FROM python:3.11.6-slim

WORKDIR /app
COPY . /app

RUN apt-get update \
    && pip install --no-cache-dir --upgrade pip \
    && pip install --no-cache-dir -r requirements.txt \
    && playwright install --with-deps chromium  \
    && mv /root/.cache/ms-playwright /ms-playwright \
    && mv /ms-playwright/chromium-* /ms-playwright/chromium \
    && chmod -Rf 777 /ms-playwright

ENV SCRAPY_SETTINGS_MODULE=dummy.settings

RUN python setup.py install

And here is my requirements.txt file:

scrapy==2.11.0
playwright==1.39.0
Twisted==22.10.0
scrapinghub-entrypoint-scrapy==0.17.1

However, I am encountering the following error:

playwright._impl._api_types.Error: Executable doesn't exist at /root/.cache/ms-playwright/chromium-1084/chrome-linux/chrome
╔════════════════════════════════════════════════════════════╗
║ Looks like Playwright was just installed or updated.       ║
║ Please run the following command to download new browsers: ║
║     playwright install                                     ║
║                                                            ║
║ <3 Playwright Team                                         ║
╚════════════════════════════════════════════════════════════╝

My Dockerfile already runs the suggested playwright install command, but at run time Playwright still cannot find the Chromium executable. I suspect there might be an issue with the paths or permissions.
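One direction that may be worth trying, as an untested sketch rather than a confirmed fix: the error shows Playwright looking in /root/.cache/ms-playwright/chromium-1084, while the Dockerfile above moves that directory to /ms-playwright and renames chromium-1084 to chromium. The variant below assumes that Playwright honors the PLAYWRIGHT_BROWSERS_PATH environment variable both when installing and when launching browsers, so it sets that variable before the install step and drops the two mv commands. Everything else is carried over from the original Dockerfile.

```dockerfile
FROM python:3.11.6-slim

# Assumption: Playwright reads PLAYWRIGHT_BROWSERS_PATH at install time and at
# run time, so setting it here keeps both steps pointing at the same directory.
ENV PLAYWRIGHT_BROWSERS_PATH=/ms-playwright

WORKDIR /app
COPY . /app

# Install Python dependencies, then download Chromium (plus its system
# libraries) directly into /ms-playwright. The versioned directory name
# (for example chromium-1084) is left untouched, since Playwright appears
# to look the browser up by that name.
RUN apt-get update \
    && pip install --no-cache-dir --upgrade pip \
    && pip install --no-cache-dir -r requirements.txt \
    && playwright install --with-deps chromium \
    && chmod -R 777 /ms-playwright

ENV SCRAPY_SETTINGS_MODULE=dummy.settings

RUN python setup.py install
```

The chmod is kept from the original in case the platform runs the container as a non-root user. Whether that is actually needed on Zyte is an assumption, not something verified here.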

Any guidance on how to resolve this problem would be greatly appreciated. Thank you in advance!

Ultimately, I want to be able to run my spider on Zyte.
