Apache Ignite split brain scenario

128 Views Asked by At

I deploy 3 instances (pods) of a java app in Kubernetes (EKS) with an in-process replicated cache.

using labels I have 2 of the 3 pods assigned to group:A and 1 pod assigned to group:B

I use Chaos mesh to simulate a loss of communication between the 2 groups, so the pod in B is isolated and the 2 pods in A can only talk to each other in their group.

As expected, the cluster is split in 2. one is majority, 1 is minority. I understand out of the box, each cluster does not know if it is majority or minority (there is no flag), so I have implemented TopologyValidator so the pods in the minority partition return false to the validate() method and so that cache/cluster becomes Read only.

it works well, pods in group B becomes read only as expected, while the 2 pods in group A remains R/W. Now when I stop chaos mesh, I would like the node from B to join the 2 other nodes to form 1 cluster as it was initially before the test.

Is there a way to configure something that would automatically do that merge? especially because one was read-only. with some option/policy like the latest update of a key wins just in case but one cluster didn't make any write.

also I noticed that If I scale up my application by adding a fourth pod/node, then it will join one of the 2 clusters (in one of my test it joined the read-only cluster, in another test that was the R/W), and then if I even scale up again, all next pods will always join the same cluster. it's strange, I'm using the K8 IP finder so all nodes (all pods) should be discovered (the K8 API querying the service endpoints returns all pods/endpoints), no idea how a new node knows which sub-cluster to use, and why always the same just like one was flagged with something special, if that is the case how can I read that flag?

to summarize:

  1. is there a way to automate merging back the 2 clusters? (without custom code if possible)

  2. worst case, is there a flag/event I can monitor when all nodes (from the 2 clusters) can communicate with each other so I can e.g. restart the nodes from the minority partition? I know there are 2 clusters, but the K8 IP finder always has a common view on ALL pods because they are all behind the same service, so it returns ALL IPs. maybe ignite tries to communicate with all those IPs periodically...

  3. is there a flag saying that a cluster will not accept new nodes?

0

There are 0 best solutions below