Cache and Consistency in Real Systems

How to think about cache as a copy with a consistency cost, not as a magic patch for slow reads.

Andrews Ribeiro

Founder & Engineer

The problem

Cache often enters the conversation by reflex.

The read is slow?

  • add cache

The endpoint is struggling?

  • add cache

The database is expensive?

  • add cache

That sounds practical.

But a lot of the time it is just a neat way to move the problem somewhere else.

Because the real question is not:

  • “Can we add cache?”

The real question is:

  • “Are we speeding up the right read or hiding a weak design?”

If the query is bad, the index is missing, the ORM is causing N+1 queries, or the screen asks for too much data, cache may improve the benchmark while making the system harder to understand.

And worse:

it may improve the benchmark while serving stale data to the user.

Mental model

Think about it like this:

Cache is a temporary copy of the truth, created so the system does not have to fetch the original every time.

That sentence already clears up half the confusion.

If cache is a copy, the conversation changes.

You have to answer:

  • how stale can this copy be?
  • who updates it?
  • when does it stop being trustworthy?
  • what happens if the user sees that delay?

So cache is not only about performance.

It is performance bought with consistency risk.

Breaking it down

Good cache speeds up repeated and expensive reads

Cache usually makes a lot of sense when you have a pattern like this:

  • the same read happens all the time
  • fetching the source is expensive
  • a small delay is acceptable
  • the update policy is understandable

Common examples:

  • product page content
  • public configuration
  • rankings that can lag a little
  • heavily read lists with rare updates

In those cases, the temporary copy often pays for itself.

Bad cache becomes makeup for a cause you never understood

This is the most common mistake.

The read is slow, but the cause may be:

  • bad query shape
  • missing index
  • N+1 queries
  • excessive join cost
  • an unnecessary SELECT *
  • a screen asking for too much data

If you throw cache on top without understanding that, you may:

  • preserve the original cause
  • increase complexity
  • make debugging harder
  • create stale reads

So maturity here starts with one simple question:

  • “Why is this read expensive right now?”
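
To make the most common of those causes concrete, here is a minimal sketch of the same listing read as N+1 and as one joined query. The tables and data are made up for illustration; the point is that this fix needs no cache at all.

```python
import sqlite3

# Made-up schema and data, just to contrast the two read shapes.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT);
    CREATE TABLE order_items (id INTEGER PRIMARY KEY, order_id INTEGER, sku TEXT);
    INSERT INTO orders VALUES (1, 'ana'), (2, 'bruno');
    INSERT INTO order_items VALUES (1, 1, 'A-1'), (2, 1, 'A-2'), (3, 2, 'B-1');
""")

# N+1: one query for the list, then one more query per row.
orders = conn.execute("SELECT id, customer FROM orders").fetchall()
for order_id, _customer in orders:
    conn.execute("SELECT sku FROM order_items WHERE order_id = ?", (order_id,)).fetchall()

# Same data in one pass: a single joined query the database can plan and index.
rows = conn.execute("""
    SELECT o.id, o.customer, i.sku
    FROM orders o
    JOIN order_items i ON i.order_id = o.id
""").fetchall()
```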

TTL is not a full strategy

Many people think the cache design is done when they define:

  • TTL = 5 minutes

But TTL is only one part.

It answers:

  • how long the copy can live before forced expiration

It does not answer everything else.

You still need to think about:

  • what if the data changes before that?
  • what if many copies expire at the same time?
  • what if this read cannot wait that long?

TTL helps.

It does not replace invalidation thinking.
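
As a rough sketch (the store, the loader, and the five-minute TTL are all assumptions), this is everything a pure TTL strategy gives you:

```python
import time

TTL_SECONDS = 300  # assumption: five minutes, like the example above

_cache: dict[str, tuple[float, dict]] = {}  # key -> (stored_at, value)

def get_product_from_db(product_id: str) -> dict:
    return {"id": product_id, "name": "example"}  # stand-in for the real query

def get_product(product_id: str) -> dict:
    entry = _cache.get(product_id)
    if entry and time.monotonic() - entry[0] < TTL_SECONDS:
        return entry[1]                          # hit: serve the copy, maybe stale
    value = get_product_from_db(product_id)      # miss: go back to the source
    _cache[product_id] = (time.monotonic(), value)
    return value
```

Nothing here reacts to a write that happens one second after the copy is stored, and nothing stops many keys from expiring in the same instant. Those gaps are the rest of the strategy.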

Different data deserves different tolerance

This gets ignored too early in the design.

A small delay may be acceptable for:

  • product description
  • avatar
  • ranking
  • like counts

The same delay may be terrible for:

  • balance
  • tight inventory
  • permissions
  • payment state

If you use the same cache strategy for everything, the cache stops being an optimization and starts becoming a nicely packaged lie.
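
One cheap way to keep that honest is to write the tolerance down per data class instead of reusing one global TTL. A small sketch, with illustrative numbers only:

```python
# Illustrative numbers; the real budget comes from product and risk, not from this file.
STALENESS_BUDGET_SECONDS = {
    "product_description": 600,  # minutes of delay are invisible to the user
    "like_count": 60,            # approximate is acceptable
    "inventory": 2,              # near-real-time, especially during promotions
    "account_balance": 0,        # 0 means: do not cache, always read the source
}

def ttl_for(data_class: str) -> int | None:
    ttl = STALENESS_BUDGET_SECONDS.get(data_class, 0)
    return ttl if ttl > 0 else None  # None = bypass the cache entirely
```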

Invalidation is a product decision too

Another common mistake is treating invalidation as only an infrastructure detail.

It is not.

The question “when should this cache die?” often depends on:

  • user impact
  • risk of stale data
  • update flow
  • read frequency

That is why good invalidation does not come only from the tool.

It comes from understanding what truth the user expects on that screen or in that flow.
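
In practice that usually means the invalidation lives inside the write flow itself, next to the update it reacts to. A minimal sketch, with hypothetical helpers and a made-up key scheme:

```python
# Hypothetical helpers and key scheme, only to show where the decision lives.
_cache: dict[str, object] = {}

def update_price_in_db(product_id: str, new_price: float) -> None:
    pass  # stand-in for the real write

def update_price(product_id: str, new_price: float) -> None:
    update_price_in_db(product_id, new_price)   # write the truth first
    _cache.pop(f"product:{product_id}", None)   # then drop the copy that just became wrong
```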

Using cache at the wrong layer also hurts

Sometimes the issue is not whether to cache.

It is where to cache.

You can cache:

  • the database query
  • the API response
  • a fragment of the page
  • an object in the application
  • an edge or CDN response

Each layer has different trade-offs.

Caching the whole page when only the price changes quickly may be worse than caching only the stable part.

Caching per user when most of the content is shared may waste memory.

So besides deciding whether to use cache, you also need to decide where.
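
The cache key encodes that decision. A tiny sketch of the shared versus per-user trade-off, with a made-up key scheme:

```python
# Made-up key scheme. The shape of the key is the layer decision in miniature.
def shared_page_key(product_id: str) -> str:
    return f"product_page:{product_id}"            # one copy serves every visitor

def per_user_page_key(product_id: str, user_id: str) -> str:
    return f"product_page:{product_id}:{user_id}"  # one copy per user, mostly duplicated content
```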

Misses and stampedes are part of the design too

This is another place where mature answers stand out:

cache is not only about the nice hit path.

There is also:

  • expensive miss
  • synchronized expiration
  • bursts of recomputation

If many requests lose the copy at the same time, you can push the whole load back to the origin at the worst possible moment.

So cache design also needs to consider what happens when cache stops helping.
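
Two common defenses are a jittered TTL, so copies do not expire in lockstep, and a per-key lock, so only one caller recomputes on a miss. A minimal in-process sketch (the numbers and the recompute function are assumptions; a distributed setup would need a shared lock instead):

```python
import random
import threading
import time

BASE_TTL = 300  # assumption: five minutes

_cache: dict[str, tuple[float, object]] = {}  # key -> (expires_at, value)
_locks: dict[str, threading.Lock] = {}

def _ttl_with_jitter() -> float:
    return BASE_TTL * random.uniform(0.9, 1.1)  # spread expirations out

def recompute_value(key: str) -> object:
    return {"key": key}                          # stand-in for the expensive origin read

def get(key: str) -> object:
    entry = _cache.get(key)
    if entry and time.monotonic() < entry[0]:
        return entry[1]                          # fresh hit
    lock = _locks.setdefault(key, threading.Lock())
    with lock:                                   # only one caller pays for the miss
        entry = _cache.get(key)                  # another caller may have refilled it
        if entry and time.monotonic() < entry[0]:
            return entry[1]
        value = recompute_value(key)
        _cache[key] = (time.monotonic() + _ttl_with_jitter(), value)
        return value
```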

Simple example

Imagine a product page with:

  • product description
  • images
  • current stock
  • current price

Caching the full page may reduce load a lot.

But if stock changes quickly during a promotion, a user may see “in stock,” click buy, and then fail at checkout because the cached page was old.

So the real discussion is not “cache or no cache.”

It is deciding which parts can tolerate delay and which parts must stay fresh.

For example:

  • description and images may tolerate more delay
  • stock and price may need a much fresher path

That is a much more useful cache conversation.
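
A sketch of what that split can look like, with hypothetical fetch helpers and illustrative TTLs:

```python
import time

_cache: dict[str, tuple[float, object]] = {}  # key -> (expires_at, value)

def cached(key: str, loader, ttl: float):
    entry = _cache.get(key)
    if entry and time.monotonic() < entry[0]:
        return entry[1]
    value = loader()
    _cache[key] = (time.monotonic() + ttl, value)
    return value

# Stand-ins for the real reads.
def fetch_description(product_id: str) -> str: return "A very good product"
def fetch_images(product_id: str) -> list: return ["img-1.jpg"]
def fetch_price(product_id: str) -> float: return 99.90
def fetch_stock(product_id: str) -> int: return 3

def render_product_page(product_id: str) -> dict:
    return {
        # stable parts: minutes of delay are invisible to the user
        "description": cached(f"desc:{product_id}", lambda: fetch_description(product_id), ttl=600),
        "images": cached(f"imgs:{product_id}", lambda: fetch_images(product_id), ttl=600),
        # parts the user acts on: keep them on a fresh path
        "price": fetch_price(product_id),
        "stock": fetch_stock(product_id),
    }
```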

Common mistakes

  • Adding cache before proving where the real bottleneck is.
  • Acting as if invalidation is a small detail to solve later.
  • Treating all data as if it had the same tolerance for staleness.
  • Forgetting that user trust and perceived consistency are part of the product.

How a senior thinks

A strong senior engineer does not just ask, “Where do we put the cache?”

They ask:

Which read really needs to be cheaper, and how much staleness can the business accept before we start lying to the user?

That question changes the maturity of the decision immediately.

What the interviewer wants to see

In interviews, cache separates shallow performance talk from actual system thinking.

They want to see whether you can:

  • describe cache as a trade-off, not a free bonus
  • bring up invalidation and freshness early
  • connect consistency directly to user experience

Cache speeds up reads, but it also creates distance from the truth.

If you still do not know when the copy stops being valid, the cache design is not finished.
