CAP Theorem

What is the CAP Theorem?

CAP theorem states that any distributed system can support only two among :

  • Consistency
  • Availability
  • Partition tolerance

Before we proceed, understand that in a distributed system, each instance of system is referred as node below.

What do you mean by Consistency?

Consistency implies:

  • When data is distributed among nodes, all nodes see the same data at a given time.
  • When queried, each node will return the latest data. To achieve consistence, when we write data into one nodes, all the nodes should be updated (should have updated view of data) before allowing read on that data.
Lisa Simpson GIF by The Simpsons

For example: Consider your ATM machines as part of consistent systems. If you add or remove money from 1 ATM machine, you do not want to see different view when you go to different ATM.

tituss burgess atm GIF

What do you mean by Availability?

Availability implies:

  • At all times, every request being fired at the system generates a valid response, even if one or more nodes went down. Both read and write requests should get allowed, if only read is allowed and write is not allowed, we can’t say the system is available.
  • Another way to state this—all working nodes in the distributed system return a valid response for any request, without exception.

Example: When you are using Instagram, you want your Picture feed to be available all the time, no matter if one or more servers of Instagram goes down. You may be ok with not getting latest of information or picture feed in this case, but what will be the worst is you getting message that one of the Instagram server is down, please come back after few minutes.

Cat Love GIF by Fran Solo

What do you mean by Partition Tolerance?

Partition tolerance implies:

  • A partition is a communications break within a distributed system—a lost or temporarily delayed connection between two nodes. 
  • Partition tolerance means that the cluster must continue to work despite any number of communication breakdowns between nodes in the system.
  • Once the network is re-established, nodes may flow the latest information to each other.
Switch Network GIF by emsTV

Example: Let’s again take an Instagram example. Consider there is a network break between the Instagram server. How do you want Instagram servers you are connecting to behave in this case? With a message that there is a network break and we will serve you pictures in some time, or you would like to see your picture feed, though may not the latest pictures of all people you follow. And when the network is re-established, you will get the latest pictures.

How are Systems classified based on the CAP theorem?

Systems are classified into three types under CAP Theorem:

CA System: 

  • Data is consistent between all nodes, and you can read/write from any node, while you cannot afford to let your network go down. 
  • A CA database delivers consistency and availability across all nodes. It can’t do this if there is a partition between any two nodes in the system, however, and therefore can’t deliver fault tolerance.
  • For example: RDBMS like MSSQL Server, Oracle, MySQL, Postgress

CP System: 

  • A CP database delivers consistency and partition tolerance at the expense of availability. When a partition occurs between any two nodes, the system has to shut down the non-consistent node (i.e., make it unavailable) until the partition is resolved.
  • For example: Redis, Google Big Table, MongoDB, and HBase

AP System: 

  • An AP database delivers availability and partition tolerance at the expense of consistency. When a partition occurs, all nodes remain available but those at the wrong end of a partition might return an older version of data than others. (When the partition is resolved, the AP databases typically resync the nodes to repair all inconsistencies in the system.)
  • For example: CouchDB (document-oriented), Dynamo DB, and Cassandra (columnar)

How to use CAP theorem during system design interview?

So by now we know that different data stores provides different types of guarantees, some provides consistency, some partition tolerance and some availability.

Now during system design interview, understand the requirement of system. Ask yourself:

  • Do I need a consistent view of data for all servers throughout the world at every given moment?
  • Or is it the availability of the system which is more important to me?
  • Or is it ok, if network breaks occur but the system continues to serve stale data?

Can you give me a classification of important data stores being used as CA, CP, or AP?

RDBMS Systems like:
Google Big Table

Dynamo DB

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s