CAP Theorem in Practice

The useful part of CAP starts when parts of the system stop seeing each other and you need to decide whether to wait or keep responding.

Andrews Ribeiro

Founder & Engineer

The problem

CAP theorem becomes a slogan very quickly.

Lots of people cite it. Very few explain it.

The most famous line is:

“You can only have two out of three.”

The problem is that this barely helps.

If you heard that sentence and still did not get it, the problem is not you.

It jumps straight to the slogan before showing the real situation the theorem is trying to explain.

Mental model

CAP is about what happens when a distributed system suffers a network failure.

Think about two parts of the system that are still up, but stop talking to each other properly.

Simple examples:

  • two regions stop seeing each other
  • a primary and a replica lose communication
  • two nodes stay alive, but cannot confirm state with each other

When that happens, you lose the comfort of assuming both sides will answer with the same truth.

That is where CAP starts to matter.

What you actually need to choose

The three terms are these:

  • C for consistency: I only answer when I can be sure the answer reflects the latest confirmed write
  • A for availability: every part that is still up keeps answering, even during the failure
  • P for partition tolerance: the system keeps operating even when messages between parts of it are lost or delayed

The part that confuses people most is this:

In distributed systems, P is not a luxury. The network can fail whether you like it or not.

So the practical choice is usually not “which two do I want on a good day?”

The practical choice is usually:

  • do I hold the response to protect consistency?
  • or do I keep responding while accepting stale data or temporarily divergent behavior?
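That choice can be sketched as a single read path with two behaviors. This is a minimal illustration, not a real client: `Replica`, `Mode`, `Unavailable`, and the `peer_reachable` flag are hypothetical names, and a real system would infer reachability from timeouts or heartbeats rather than a boolean.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    PREFER_CONSISTENCY = "wait"      # refuse rather than risk a stale answer
    PREFER_AVAILABILITY = "respond"  # answer from local state, even if stale


class Unavailable(Exception):
    """Raised when the replica refuses to answer during a partition."""


@dataclass
class Replica:
    local_value: int
    peer_reachable: bool = True  # hypothetical stand-in for a real network check

    def read(self, mode: Mode) -> dict:
        if self.peer_reachable:
            # No partition: there is nothing to choose between.
            return {"value": self.local_value, "maybe_stale": False}
        if mode is Mode.PREFER_CONSISTENCY:
            # Hold the response: a temporary error instead of a possibly wrong value.
            raise Unavailable("cannot confirm state with peer")
        # Keep responding, and be honest that the value may be out of date.
        return {"value": self.local_value, "maybe_stale": True}


replica = Replica(local_value=10)
healthy = replica.read(Mode.PREFER_CONSISTENCY)

replica.peer_reachable = False  # simulate the partition
degraded = replica.read(Mode.PREFER_AVAILABILITY)
```

While the peer is reachable, the mode makes no difference. The two behaviors only diverge once the partition starts.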

Breaking the problem down

When it makes sense to wait

Some flows do not tolerate lies.

Examples:

  • balance
  • credit limit
  • leader coordination
  • distributed locking

If the network broke and you cannot confirm the state properly, it may be better to stop, wait, or return a temporary error.

The pain here is temporary unavailability.

But that trade-off may be worth it because wrong data is too expensive.
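One common way to implement "wait" is a majority quorum: a write only goes through if more than half of the nodes can confirm it. The sketch below assumes a hypothetical `Node` class with a `reachable` flag and checks reachability up front. A real system would send the write over the network and count acknowledgements instead.

```python
class QuorumNotReached(Exception):
    """Raised when too few nodes are reachable to confirm a write."""


class Node:
    def __init__(self, name: str, balance: int = 0):
        self.name = name
        self.balance = balance
        self.reachable = True  # hypothetical flag standing in for a real network check


def write_balance(nodes: list[Node], new_balance: int) -> int:
    """Apply the write only if a strict majority of nodes can confirm it."""
    needed = len(nodes) // 2 + 1
    reachable = [n for n in nodes if n.reachable]
    if len(reachable) < needed:
        # Minority side of a partition: refuse instead of risking a conflicting balance.
        raise QuorumNotReached(
            f"{len(reachable)} of {len(nodes)} reachable, need {needed}"
        )
    for node in reachable:
        node.balance = new_balance
    return len(reachable)


nodes = [Node("a", 100), Node("b", 100), Node("c", 100)]

nodes[2].reachable = False
acks = write_balance(nodes, 80)   # two of three reachable: the write goes through

nodes[1].reachable = False
try:
    write_balance(nodes, 50)      # only one node left on this side
    outcome = "written"
except QuorumNotReached:
    outcome = "temporary error"
```

The second write is rejected and node "a" keeps the last confirmed balance of 80. The caller sees a temporary error, which is exactly the unavailability this kind of flow accepts.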

When it makes sense to keep responding

Other flows tolerate stale data better than silence.

Examples:

  • feed
  • like counters
  • recommendations
  • non-critical ranking

If one part of the system keeps answering with slightly stale data for a few seconds, that may be acceptable.

The pain here is temporary inconsistency.

But that trade-off may be worth it because the product does not disappear in front of the user.
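For something like a like counter, "keep responding" can be made safe by choosing a data structure that merges cleanly once the network heals. The sketch below uses a grow-only counter, one well-known technique for this. It is an illustration under assumptions, not a description of how any particular product does it; `LikeCounter` and the region names are hypothetical.

```python
class LikeCounter:
    """Grow-only counter: each region only increments its own slot."""

    def __init__(self, region: str):
        self.region = region
        self.counts: dict[str, int] = {}

    def like(self) -> None:
        # Accept the write locally, without asking the other region.
        self.counts[self.region] = self.counts.get(self.region, 0) + 1

    def value(self) -> int:
        return sum(self.counts.values())

    def merge(self, other: "LikeCounter") -> None:
        # Per-region maximum: merging twice or in any order gives the same result.
        for region, n in other.counts.items():
            self.counts[region] = max(self.counts.get(region, 0), n)


us, eu = LikeCounter("us"), LikeCounter("eu")

# During the partition, both regions keep accepting likes.
us.like()
us.like()
eu.like()
during = (us.value(), eu.value())   # the two sides disagree for a while

# After the partition heals, the regions exchange state and converge.
us.merge(eu)
eu.merge(us)
after = (us.value(), eu.value())
```

During the partition the two regions report different totals. After the merge they agree again, and no like was lost on either side.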

Only then do CP and AP help

With that in place, the labels CP and AP become much easier to read:

  • CP: under failure, the system leans more toward consistency
  • AP: under failure, the system leans more toward availability

But those are only labels. The trade-off they describe comes first.

Concrete example

Imagine a social network.

A user publishes a post. Some followers in another region still do not see it for a few seconds.

That is usually acceptable. The system keeps responding and the state converges later.

Now imagine a bank balance or coordination for a critical job.

If two sides of the system cannot talk to each other, it may be better to hold the response than to risk conflicting states.

That is the real point:

  • in a feed, slightly stale data is often acceptable
  • in a balance or critical coordination flow, waiting may be less dangerous than a wrong answer

CAP does not choose for you.

The business decides which mistake is acceptable during the failure.
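One way to make that decision explicit is to write it down per flow instead of per system. The table below is a hypothetical sketch: the flow names come from the examples in this article, and the default for unknown flows is an assumption chosen for illustration.

```python
# Hypothetical policy table: what each flow does when it cannot confirm state.
FAILURE_POLICY = {
    "balance": "wait",
    "credit_limit": "wait",
    "distributed_lock": "wait",
    "feed": "respond",
    "like_counter": "respond",
    "recommendations": "respond",
}


def on_partition(flow: str) -> str:
    # Assumption: for a flow nobody has classified yet, an error is
    # treated as safer than a possibly wrong answer.
    return FAILURE_POLICY.get(flow, "wait")
```

The point of the table is that the same system can lean toward consistency for one flow and toward availability for another.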

What CAP does not explain

CAP is not for explaining every stale-data case.

Sometimes the real problem is:

  • badly invalidated cache
  • replica lag
  • reading from the wrong region
  • a badly designed async pipeline

So:

  • not every stale-data problem is CAP
  • not every consistency problem needs this theorem

CAP is a lens for system behavior under network failure in a distributed system.

Common mistakes

  • Repeating “two out of three” without even mentioning partition.
  • Talking as if P were optional.
  • Using CP and AP before explaining the concrete problem.
  • Assuming CAP explains every stale-data situation.
  • Treating CAP as architecture decoration instead of a failure-behavior decision.

How this shows up in interviews

In interviews, a strong answer often sounds like this:

“If the network fails here, would I rather block this flow to protect consistency, or keep responding with a risk of stale data?”

That is much better than dropping “two out of three” and hoping it sounds deep.

The interviewer wants to see whether you:

  • understand partition in human language
  • connect consistency and availability to a real flow
  • do not use the term as decoration

Closing

CAP gets much easier when you change the question.

Instead of asking:

“Which two out of three do I choose?”

Ask:

“When the network breaks, would I rather wait for the right answer or keep responding with a risk of stale data?”
