Difference between Availability and Partition Tolerance in CAP Theorem


I know there is a similar question on SO, but I didn't find the answers there helpful.

Availability says that the system will return a non-error response, even if one or more nodes in the system are down.

Partition Tolerance means that the system will keep functioning as a whole, even if some nodes in the system are down or there are communication breaks in the system.

If I'm not mistaken, both of them mean that the client will receive a response even if some nodes in the system are down. What is the difference between the two?


There are 4 best solutions below

0
On

Availability means that every request sent to a node that is NOT down returns a valid response. If it is a write request, the node should return a valid response even if it cannot propagate the update to the other nodes in the system. Similarly, if it is a read request, the node should return the data it has even if it cannot guarantee that this is the most recently written data.

Partition tolerance means that even if there is a network partition, i.e. the connection between two nodes breaks and they cannot communicate back and forth, the system will not go down. Note that a network partition simply means a lack of communication between nodes in the system; it does not mean that any of the nodes is down.

In a distributed system, we cannot guarantee that there will be no network partitions, since network connectivity sits outside the scope of our system. To ensure that the system continues to work during a partition, we have to sacrifice either consistency or availability.
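The trade-off described above can be illustrated with a toy model. This is only a sketch under stated assumptions: the `Replica` class, the `"AP"`/`"CP"` mode labels, and the `make_pair` and `partition` helpers are hypothetical names invented for this example, and a real system would replicate over a network rather than by direct assignment.

```python
class Replica:
    """Toy replica holding a single value. `mode` is "AP" or "CP" (hypothetical labels)."""

    def __init__(self, name, mode):
        self.name = name
        self.mode = mode
        self.value = None
        self.peer = None
        self.link_up = True  # False simulates a network partition; the node itself stays up

    def write(self, value):
        if self.link_up:
            # Normal case: apply locally and replicate to the peer.
            self.value = value
            self.peer.value = value
            return "ok"
        if self.mode == "AP":
            # Stay available: accept the write locally, so the replicas diverge.
            self.value = value
            return "ok"
        # CP: refuse the write rather than let the replicas diverge.
        return "error: cannot reach peer"

    def read(self):
        if self.link_up or self.mode == "AP":
            # AP may return stale data during a partition.
            return self.value
        # CP: refuse, because this node cannot tell whether its data is current.
        return "error: cannot reach peer"


def make_pair(mode):
    """Create two replicas that replicate to each other."""
    a, b = Replica("a", mode), Replica("b", mode)
    a.peer, b.peer = b, a
    return a, b


def partition(a, b):
    """Cut the link between two replicas; both remain running."""
    a.link_up = b.link_up = False


# Usage: the same partition, handled two ways.
a, b = make_pair("AP")
a.write(1)
partition(a, b)
print(a.write(2), b.read())  # ok 1  (available, but b serves stale data)

c, d = make_pair("CP")
c.write(1)
partition(c, d)
print(c.write(2))  # error: cannot reach peer  (consistent, but not available)
```

In both runs every node is up the whole time; only the link between them is cut, which is what distinguishes a partition from a node failure.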

0
On

A partition means that one group of nodes is isolated from another group. A downed node is not a partition by itself (unless it is the only node in the cluster?). Two nodes that can't communicate directly but can communicate indirectly via a third node are not partitioned either. Partition tolerance is the ability to deal with two (or more) groups of one or more nodes that are isolated from one another.

Ref: https://arxiv.org/abs/1509.05393
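Read this way, detecting a partition is a graph connectivity question: the live nodes are partitioned when they fall into more than one group that cannot reach each other. A minimal sketch, assuming the cluster is described by a list of live nodes and a list of working links (the function names `reachable_groups` and `is_partitioned` are made up for this example):

```python
from collections import deque


def reachable_groups(nodes, links):
    """Group live nodes that can reach each other, directly or via other live nodes."""
    neighbours = {n: set() for n in nodes}
    for a, b in links:
        # Ignore links that touch a node which is not in the live set (i.e. a downed node).
        if a in neighbours and b in neighbours:
            neighbours[a].add(b)
            neighbours[b].add(a)

    groups, seen = [], set()
    for start in nodes:
        if start in seen:
            continue
        # Breadth-first search collects everything reachable from `start`.
        group, queue = set(), deque([start])
        seen.add(start)
        while queue:
            node = queue.popleft()
            group.add(node)
            for nxt in neighbours[node] - seen:
                seen.add(nxt)
                queue.append(nxt)
        groups.append(group)
    return groups


def is_partitioned(nodes, links):
    """True when the live nodes split into two or more isolated groups."""
    return len(reachable_groups(nodes, links)) > 1


# a and c have no direct link but can talk via b: not a partition.
print(is_partitioned(["a", "b", "c"], [("a", "b"), ("b", "c")]))  # False
# c is up but cut off from a and b: a partition.
print(is_partitioned(["a", "b", "c"], [("a", "b")]))  # True
# c is down (not in the live set): the remaining nodes are not partitioned.
print(is_partitioned(["a", "b"], [("a", "b"), ("b", "c")]))  # False
```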

1
On

At its base, partitioning means splitting data in order to speed up processing. But the more you split, the more likely a partition problem becomes once there are 2 or more nodes.

Therefore P, for Partition, is NOT Partition Tolerance at all, even though I see it described that way everywhere. It is the opposite of A (which requires 2 or more replicas).

But when people combine A (replication) with P (partitioning), you get PA (tolerant to partitions). In this case you need nodes for the partitions, nodes for replication behind them, and a load balancer (which can be client-side).

0
On

Availability means the ability to access the cluster even if a node in the cluster goes down.

Partition tolerance means that the cluster continues to function even if there is a "partition" (communication break) between two nodes (both nodes are up, but can't communicate).

From: https://stackoverflow.com/a/12347673/6567891