I am currently facing an issue while trying to create a Docker image for custom deployment on the Zyte platform (formerly known as Scrapinghub). My goal is to set up a Python environment with Scrapy, Playwright, Twisted, and other necessary dependencies.
Here is my Dockerfile:
FROM python:3.11.6-slim
WORKDIR /app
COPY . /app
RUN apt-get update \
    && pip install --no-cache-dir --upgrade pip \
    && pip install --no-cache-dir -r requirements.txt \
    && playwright install --with-deps chromium \
    && mv /root/.cache/ms-playwright /ms-playwright \
    && mv /ms-playwright/chromium-* /ms-playwright/chromium \
    && chmod -Rf 777 /ms-playwright
ENV SCRAPY_SETTINGS_MODULE=dummy.settings
RUN python setup.py install
And here is my requirements.txt file:
scrapy==2.11.0
playwright==1.39.0
Twisted==22.10.0
scrapinghub-entrypoint-scrapy==0.17.1
However, I am encountering the following error:
playwright._impl._api_types.Error: Executable doesn't exist at /root/.cache/ms-playwright/chromium-1084/chrome-linux/chrome
╔════════════════════════════════════════════════════════════╗
║ Looks like Playwright was just installed or updated.       ║
║ Please run the following command to download new browsers: ║
║     playwright install                                     ║
║                                                            ║
║ <3 Playwright Team                                         ║
╚════════════════════════════════════════════════════════════╝
The Dockerfile already runs playwright install --with-deps chromium, yet at runtime Playwright still looks for the executable under /root/.cache/ms-playwright, which the build moves to /ms-playwright (and renames chromium-* to chromium). I suspect the problem is with paths or permissions.
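To narrow down whether this is a path problem, a small check like the one below could be run inside the container. It is only a sketch: the helper names find_chromium_executables and default_browsers_dir are hypothetical, it assumes the chromium-<revision>/chrome-linux/chrome layout shown in the error message, and it assumes Playwright honours the PLAYWRIGHT_BROWSERS_PATH environment variable and otherwise falls back to ~/.cache/ms-playwright on Linux.

```python
import os
from pathlib import Path


def default_browsers_dir(environ=os.environ):
    """Guess the directory Playwright searches for browsers on Linux.

    Assumption: PLAYWRIGHT_BROWSERS_PATH, when set to a path, overrides the
    default ~/.cache/ms-playwright location. The special value "0" is not
    handled in this sketch.
    """
    override = environ.get("PLAYWRIGHT_BROWSERS_PATH")
    if override:
        return override
    return str(Path.home() / ".cache" / "ms-playwright")


def find_chromium_executables(browsers_dir):
    """List chrome binaries found at chromium-<revision>/chrome-linux/chrome."""
    base = Path(browsers_dir)
    if not base.is_dir():
        return []
    # Only versioned directories such as chromium-1084 match this pattern;
    # a directory renamed to plain "chromium" does not.
    matches = base.glob("chromium-*/chrome-linux/chrome")
    return sorted(str(p) for p in matches if p.is_file())


if __name__ == "__main__":
    search_dir = default_browsers_dir()
    print("Directory Playwright is assumed to search:", search_dir)
    print("Chromium executables found there:", find_chromium_executables(search_dir))
    # Also inspect the directory the Dockerfile moves the browsers to.
    print("Found under /ms-playwright:", find_chromium_executables("/ms-playwright"))
```

If the first list is empty while /ms-playwright contains the browser files, that would point to a path mismatch rather than a permissions problem.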
Any guidance on how to resolve this problem would be greatly appreciated. Thank you in advance!
My end goal is to be able to run my spider on Zyte.