
Request Cache vs Shared Cache Without Mixing Layers

When the team calls any kind of data reuse a cache, it starts mixing local deduplication, request memoization, and shared cache as if they were the same thing.

Andrews Ribeiro

Founder & Engineer

The problem

“Cache” becomes an elastic word too quickly.

The team sees the same data being fetched more than once and concludes:

  • “let’s add cache”

But it almost never stops to ask:

  • cache where?
  • for how long?
  • visible to whom?
  • invalidated by whom?

Without that, two very different things get mixed together:

  • reusing data inside the same request
  • reusing data across different requests

Mental model

Request cache usually means:

  • local memoization
  • deduplication inside one execution
  • avoiding the same fetch two or three times in the same flow

Shared cache usually means:

  • Redis
  • memory shared across instances
  • value reused by multiple requests and multiple processes

Both can be useful.

But they do not solve the same problem.

When request cache makes more sense

Request cache helps when:

  • the same request touches the same data more than once
  • multiple parts of the flow ask for the same entity
  • you want to reduce repeated work without creating external state

Simple example:

a request loads the current user in middleware, then the use case asks for the same user, then a permission check asks again.

That does not require Redis.

It only requires one local reuse layer during execution.
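
A minimal sketch of that layer in TypeScript. Everything here is illustrative: the `User` shape and `findUserById` stand in for your real types and data access.

```ts
type User = { id: string; name: string };

// Hypothetical data access call standing in for a real database query.
async function findUserById(id: string): Promise<User> {
  return { id, name: `user-${id}` };
}

// One instance per request, created in middleware and discarded with it.
class RequestCache {
  // Caching the promise (not the resolved value) also deduplicates
  // concurrent calls that start before the first fetch resolves.
  private users = new Map<string, Promise<User>>();

  getUser(id: string): Promise<User> {
    let hit = this.users.get(id);
    if (!hit) {
      hit = findUserById(id);
      this.users.set(id, hit);
    }
    return hit;
  }
}

// Middleware, use case, and permission check all share the same Map:
async function handleRequest(): Promise<void> {
  const cache = new RequestCache();
  await cache.getUser("42"); // fetches
  await cache.getUser("42"); // reuses, no second query
}
```

Because the cache dies with the request, there is nothing to invalidate.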

The gain here is usually:

  • simplicity
  • zero distributed invalidation
  • less internal repetition

When shared cache enters the picture

Shared cache starts making sense when:

  • multiple requests repeat one expensive read
  • the source cost is too high
  • the data tolerates controlled staleness
  • several instances need to reuse the same value

Examples:

  • configuration read all the time
  • low-churn catalog
  • expensive dashboard projection
  • aggregated response that does not need perfect real-time accuracy

Here, a different set of questions appears:

  • how old can this data get?
  • what triggers invalidation?
  • what happens if the cache disappears?
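
One common shape that answers those questions explicitly is cache-aside with a TTL. A sketch in TypeScript, assuming the ioredis client; the key, the 30-second TTL, and the `loadDashboard` loader are illustrative assumptions, not a prescription:

```ts
import Redis from "ioredis";

const redis = new Redis(); // shared by every instance of the service

// Hypothetical expensive read standing in for a real projection query.
async function loadDashboard(): Promise<object> {
  return { orders: 120, revenue: 9800 };
}

// Cache-aside: try the shared cache, fall back to the source, repopulate.
async function getDashboard(): Promise<object> {
  const key = "dashboard:v1";
  try {
    const hit = await redis.get(key);
    if (hit !== null) return JSON.parse(hit);
  } catch {
    // Cache unavailable: fall through to the source. Slower, still correct.
  }
  const fresh = await loadDashboard();
  // The TTL is the explicit answer to "how old can this data get?"
  await redis.set(key, JSON.stringify(fresh), "EX", 30).catch(() => {});
  return fresh;
}
```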

The common mistake

The common mistake is using shared cache to solve a local problem.

Example:

  • the same request makes three identical queries
  • instead of fixing the flow or using request cache, the team inserts Redis in the middle

That creates:

  • serialization
  • network hops
  • keys
  • invalidation
  • extra observability

to solve repetition that lived inside the same process.

The opposite mistake also exists.

Some systems try to reuse global data through a local process variable, as if that were enough in an environment with several instances.
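
A sketch of that second mistake, with hypothetical names. Each process keeps its own copy, so instances drift apart and nothing invalidates them together:

```ts
// Anti-pattern: a module-level variable pretending to be a shared cache.
let cachedFlags: Record<string, boolean> | null = null;

// Hypothetical loader standing in for a real settings read.
async function loadFeatureFlags(): Promise<Record<string, boolean>> {
  return { newCheckout: true };
}

async function getFeatureFlags(): Promise<Record<string, boolean>> {
  if (cachedFlags === null) {
    cachedFlags = await loadFeatureFlags();
  }
  // With N instances there are N independent copies: an update
  // invalidates none of them, and each instance serves its own
  // stale value until it restarts.
  return cachedFlags;
}
```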

Simple example

Imagine an orders listing API.

Inside the same request:

  • you need to translate a customerId into a customer name in several places

Maybe request cache is already enough to avoid fetching the same customer five times.

Now imagine:

  • the home page asks for the same operational indicators every few seconds

Then a shared cache with a short TTL may make sense.
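
Reusing the cache-aside shape from the earlier sketch (same assumed `redis` client), with a TTL matched to how often the page polls:

```ts
// Hypothetical aggregation behind the home page indicators.
async function loadIndicators(): Promise<object> {
  return { openOrders: 7, failedJobs: 0 };
}

// A few seconds of staleness is acceptable here, so the TTL is short.
async function getIndicators(): Promise<object> {
  const hit = await redis.get("indicators:v1");
  if (hit !== null) return JSON.parse(hit);
  const fresh = await loadIndicators();
  await redis.set("indicators:v1", JSON.stringify(fresh), "EX", 5);
  return fresh;
}
```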

Those are different problems.

Treating both as just “cache” without context only hides the real decision.

How a senior thinks

Engineers who make better choices usually ask:

  • does the reuse need to live only during this request?
  • can the value be shared across requests?
  • can the data get slightly old?
  • who invalidates it and with what semantics?
  • am I improving architecture or only hiding a bad query?

Those questions prevent automatic Redis-for-everything decisions.

Interview angle

This topic appears in performance, backend, and system design interviews.

The interviewer usually wants to see whether you distinguish:

  • local deduplication
  • distributed cache
  • consistency
  • operational cost

A strong answer often sounds like this:

“If the problem is repetition inside the same request, I would start with local cache for that execution. I would leave shared cache for repeated reads across requests, when the origin is expensive and the invalidation semantics are clear.”

Direct takeaway

Good cache is not the one that appears earlier.

It is the one that solves the right repetition in the right layer.
