Selenium parser

I wrote a Selenium parser in Python. It works fine when I run it locally, but on an AWS EC2 server I get the following error. How can I fix it?

ERR HTTPConnectionPool(host='localhost', port=50687): Max retries exceeded with url: /session/56a774d4a576540ea13a5de796af6b8a/execute/sync (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x796147a135e0>: Failed to establish a new connection: [Errno 111] Connection refused'))

Dockerfile

FROM python:3.10

WORKDIR /service
COPY requirements.txt ./

RUN apt-get update && \
    apt-get install -y \
    wget \
    ca-certificates \
    fonts-noto \
    libxss1 \
    libappindicator3-1 \
    fonts-liberation \
    xdg-utils \
    gnupg

RUN wget -q -O - https://dl-ssl.google.com/linux/linux_signing_key.pub | apt-key add - && \
    echo "deb [arch=amd64] http://dl.google.com/linux/chrome/deb/ stable main" >> /etc/apt/sources.list.d/google.list && \
    apt-get update && \
    apt-get install -y google-chrome-stable


ENV CHROME_BIN=/usr/bin/google-chrome-stable

ENV PYTHONDONTWRITEBYTECODE 1
ENV PYTHONUNBUFFERED 1

RUN pip install --no-cache-dir -r requirements.txt

COPY . .

EXPOSE 80

CMD ["python", "manage.py", "runserver", "0.0.0.0:80"]

docker-compose.yml

services:
  worker-1:
    build: .
    volumes:
      - ./service:/service
    command: python manage.py runserver 0.0.0.0:80
    ports:
      - '80:80'
    restart: on-failure

  worker-2:
    build: .
    volumes:
      - ./service:/service
    command: python manage.py via_parser
    restart: on-failure

Beginning of the file via_parser.py

 while True:
            print("circle start")
            try:
                chrome_options = Options()
                ua = UserAgent()
                userAgent = ua.random
                chrome_options.add_argument("--headless")
                print(userAgent)
                chrome_options.add_argument(f"user-agent={userAgent}")
                chrome_options.add_argument("--disable-extensions")
                chrome_options.add_argument("--disable-application-cache")
                chrome_options.add_argument("--disable-gpu")
                chrome_options.add_argument("--no-sandbox")
                chrome_options.add_argument("--disable-setuid-sandbox")
                chrome_options.add_argument("--disable-dev-shm-usage")
                chrome_options.add_argument("--disable-blink-features=AutomationControlled")
                # chrome_options.binary_location = '/usr/bin/google-chrome-stable'
                # driver = uc.Chrome(options=chrome_options, version_main=122)
                chrome_options.add_experimental_option("excludeSwitches", ["enable-automation"])
                chrome_options.add_experimental_option('useAutomationExtension', False)
                driver = webdriver.Chrome(options=chrome_options)

                stealth(driver,
                        languages=["en-US", "en"],
                        vendor="Google Inc.",
                        # platform="Win64",
                        platform="Linux x86_64",
                        webgl_vendor="Intel Inc.",
                        renderer="Intel Iris OpenGL Engine",
                        fix_hairline=True,
                        )

I tried changing the existing add_argument options and adding new ones, but nothing helped.
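For context, Errno 111 (Connection refused) means that nothing was accepting TCP connections on the chromedriver port at the moment Selenium sent the command. A minimal sketch of how that could be checked from inside the container is shown here; the helper name `port_is_open` is hypothetical, and the port number has to be taken from the error message, since it changes on every run. This only confirms whether the driver process is still listening. It does not explain why it stopped.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        # create_connection raises OSError (including ConnectionRefusedError,
        # i.e. Errno 111) when nothing is listening on the port.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Example call, using the port from the error message above (an assumption:
# the real port is different on each run):
# port_is_open("localhost", 50687)
```

If the check returns False right after the error appears, the chromedriver process is no longer listening, which would point towards the driver or the browser having exited rather than a networking problem between containers.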
