Running Google Pubsub emulator with Docker Compose results in random behavior

version: "3.9"
services:

  service_1: #This server emulates Google Pubsub locally
    build:
      dockerfile: <dockerfile_path>
      context: ./
    ports:
      - "8074:8074" # port 8074 is used inside CMD in the Dockerfile
    restart: always

  service_2: #This service creates necessary topics and subscriptions for the other services
    build:
      dockerfile: <dockerfile_path>
      context: ./
    environment:
      PUBSUB_EMULATOR_HOST: service_1:8074
    depends_on:
      - service_1

  service_3: #database
    image: postgres:13.1
    environment:
      - POSTGRES_USER=<USER>
      - POSTGRES_PASSWORD=<PASSWORD>
      - APP_DB_USER=<USER>
      - APP_DB_PASS=<PASSWORD>
      - APP_DB_NAME=test
    volumes:
      - ./db:/docker-entrypoint-initdb.d/
    ports:
      - "5432:5432"


  service_4: #this service orchestrates the three services below by receiving and sending messages from/to pubsub
    build:
      dockerfile: <dockerfile_path>
      context: ./
    ports:
      - "8083:8083"
    environment:
      PUBSUB_EMULATOR_HOST: service_1:8074
    depends_on:
      - service_3

    restart: always

  service_5: 
    build:
      dockerfile: <dockerfile_path>
      context: ./
    ports:
      - "8090:8090"
    environment:
      PUBSUB_EMULATOR_HOST: service_1:8074
    restart: always

  service_6:
    build:
      dockerfile: <dockerfile_path>
      context: ./
    ports:
      - "8096:8096"
    environment:
      PUBSUB_EMULATOR_HOST: service_1:8074
    restart: always


  service_7:
    build:
      dockerfile: <dockerfile_path>
      context: ./
    ports:
      - "8080:8080"
    environment:
      PUBSUB_EMULATOR_HOST: service_1:8074
    restart: always

This is what I currently have in my docker-compose.yml. It seems that there is something crucial I don't understand about how containers are run, but I get random results every time I run docker-compose up.

Even using depends_on doesn't guarantee that one service is started after another one. For some reason, this breaks how services interact with the local pubsub emulator. I noticed that whenever I change ports inside services and restart, all the services might start working appropriately. But then after docker-compose down and docker-compose up, some services report not being able to subscribe and don't even try any further despite setting restart: always.

I guess this might be due to a misunderstanding on my side of how this configuration is supposed to work.

  1. Why is the output so non-deterministic?
  2. Is it just by coincidence that changing ports used by the web apps somehow makes it work?
  3. How do I fix that behavior?

According to the documentation, we specify ports: "HOST_PORT:CONTAINER_PORT" and the latter one is used internally by services. It's not even required to set the host ports, but it doesn't change anything whether I set them or not.

There are 1 best solutions below

I think the non-deterministic behaviour is caused by the order in which your services become ready, which depends_on does not guarantee. The Docker documentation explains this problem well:

You can control the order of service startup and shutdown with the depends_on option. Compose always starts and stops containers in dependency order, where dependencies are determined by depends_on, links, volumes_from, and network_mode: "service:...".

However, for startup Compose does not wait until a container is “ready” (whatever that means for your particular application) - only until it’s running. There’s a good reason for this.
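To make the "running" versus "ready" distinction concrete, here is a minimal Python sketch of a readiness probe that a client service could run before its first subscribe call. It assumes the services can be changed and that an accepted TCP connection is a good enough signal. That is only an approximation, since an open port does not prove the emulator already serves requests. The helper names `parse_host_port`, `is_port_open` and `wait_for_port` are made up for illustration.

```python
import socket
import time


def parse_host_port(value):
    """Split a 'host:port' string such as 'service_1:8074' into (host, port)."""
    host, _, port = value.rpartition(":")
    return host, int(port)


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS failures alike.
        return False


def wait_for_port(host, port, attempts=30, delay=1.0):
    """Probe host:port up to `attempts` times, sleeping `delay` seconds between tries."""
    for _ in range(attempts):
        if is_port_open(host, port):
            return True
        time.sleep(delay)
    return False
```

A service could call `wait_for_port(*parse_host_port(os.environ["PUBSUB_EMULATOR_HOST"]))` at startup and exit with a non-zero status when it returns False, so that restart: always gets a chance to try again.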

The problem of waiting for a database (for example) to be ready is really just a subset of a much larger problem of distributed systems. In production, your database could become unavailable or move hosts at any time. Your application needs to be resilient to these types of failures.

To handle this, design your application to attempt to re-establish a connection to the database after a failure. If the application retries the connection, it can eventually connect to the database.
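The retry idea described above can be sketched in Python as follows. This is an illustration under assumptions, not the documentation's own code: the `retry` helper and its parameters are hypothetical, and `operation` stands in for whatever call your service makes to subscribe to or create topics on the emulator.

```python
import time


def retry(operation, attempts=5, base_delay=0.5, max_delay=8.0, sleep=time.sleep):
    """Call operation() until it succeeds or the attempts run out.

    Waits base_delay after the first failure and doubles the wait after each
    further failure, capped at max_delay. If every attempt fails, the last
    exception is re-raised so the process can exit and be restarted.
    """
    delay = base_delay
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except Exception:
            if attempt == attempts:
                raise
            sleep(delay)
            delay = min(delay * 2, max_delay)
```

Wrapping the subscribe call this way means a service that starts before the emulator is ready keeps trying instead of giving up after the first failure. In real code it is usually better to catch only the connection-related exception types of the client library you use rather than a bare Exception.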

The best solution is to perform this check in your application code, both at startup and whenever a connection is lost for any reason. However, if you don’t need this level of resilience, you can work around the problem with a wrapper script: ...