Adding new containers to existing cluster (swarm)


I am having a problem figuring out the best way to add a new container to an existing cluster while all containers run in Docker.

Assume I have a Docker swarm, and whenever a container stops or fails for some reason, the swarm brings up a new container and expects it to add itself to the cluster.

How can I make any container be able to add itself to a cluster?

For example, if I want to create a RabbitMQ HA cluster, I need to create a master and then create slaves. Assuming every instance of RabbitMQ (master or slave) is a container, let's say one of them fails. There are 2 options:

1) A slave container has failed.

2) The master container has failed.

Usually, a service that has the ability to run as a cluster also has the ability to elect a new leader as master. So, assuming this scenario works seamlessly without any intervention, how would a new container added to the swarm (using Docker swarm) be able to add itself to the cluster?

The problem here is that the new container is not created with new arguments every time; the container is always created exactly as it was first deployed. That means I can't just change its command-line arguments, and since this is a cloud, I can't hard-code an IP to use.

Something here is missing. Maybe declaring a "Service" at the Docker Swarm level would actually give the new container the ability to add itself to the cluster without really knowing anything about the other machines in the cluster...

There are 2 answers below.

Answer 1:

There are quite a few options for scaling out containers with Swarm, ranging from something as simple as passing in the information via a container environment variable to something as extensive as service discovery.

Here are a few options:

  • Pass in the IP as a container environment variable, e.g. docker run -td -e HOST_IP=$(ifconfig wlan0 | awk '/inet addr:/{gsub(/.*:/,"",$2);print $2}') somecontainer:latest
    • This sets the container environment variable HOST_IP to the IP of the machine it was started on.
  • Service discovery. Querying a known point of entry to determine information about any required services, such as IP, port, etc.
    • This is the most common type of scale-out option. You can read more about it in the official Docker docs. The high-level overview is that you set up a service like Consul on the masters, which your services query to find the information of other relevant services. Example: a web server requires a DB. The DB would add itself to Consul, and the web server would start up and query Consul for the database's IP and port.
  • Network Overlay. Creating a network in swarm for your services to communicate with each other.
    • Example:


$ docker network create -d overlay mynet
$ docker service create --name frontend --replicas 5 -p 80:80/tcp --network mynet mywebapp
$ docker service create --name redis --network mynet redis:latest

This allows the web app to communicate with redis by placing them on the same network.
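To make the service-discovery option above concrete, here is a minimal sketch of querying a Consul agent's HTTP catalog API for a registered service (the service name "db" and a local agent on Consul's default port 8500 are assumptions for this example):

```shell
# Ask the local Consul agent (default port 8500) which nodes run the
# hypothetical "db" service; the JSON response includes ServiceAddress
# and ServicePort, so nothing needs to be hard-coded in the container.
curl http://localhost:8500/v1/catalog/service/db
```

A newly started replica can run a query like this at startup to find its peers, instead of relying on fixed command-line arguments.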

Lastly, in your example above it would be best to deploy the master and slaves as 2 separate services which you scale individually, e.g. one MASTER service and one SLAVE service. Then you scale each depending on the number you need; e.g. to scale to 3 slaves you would run docker service scale <SERVICE-ID>=<NUMBER-OF-TASKS>, which would start the additional slaves. In this scenario, if one of the scaled slaves fails, swarm starts a new one to bring the number of tasks back to 3.
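Applied to the RabbitMQ scenario, that could look like the following sketch (the service names rabbit-master and rabbit-slave are hypothetical):

```shell
# Deploy master and slaves as two separate services on one overlay network.
docker service create --name rabbit-master --network mynet rabbitmq:3
docker service create --name rabbit-slave --network mynet --replicas 3 rabbitmq:3

# Swarm keeps the slave task count at 3; if one fails, a replacement
# is started automatically. Scale up or down at any time:
docker service scale rabbit-slave=5
```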

Answer 2:

https://docs.docker.com/engine/reference/builder/#healthcheck

Docker images support a HEALTHCHECK instruction. Add a health check to your images, for example:

HEALTHCHECK runs a command of your choice inside the container (for example ./anyscript.sh, or any other command you want to add) and checks its exit code: 0 means success and 1 means failure. Based on the result, Docker reports the container's health status as healthy, unhealthy, or starting.

Docker swarm automatically restarts the unhealthy containers in the swarm cluster.
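As a sketch, a health check for the RabbitMQ example from the question might look like this (assuming a RabbitMQ image that ships the rabbitmq-diagnostics tool):

```dockerfile
FROM rabbitmq:3

# Probe the broker every 30s; a non-zero exit code marks the
# container unhealthy, and swarm will replace it.
HEALTHCHECK --interval=30s --timeout=10s --retries=3 \
  CMD rabbitmq-diagnostics ping || exit 1
```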