July 29 2025
Request Cache vs Shared Cache Without Mixing Layers
When the team calls any kind of data reuse a cache, it starts mixing local deduplication, request memoization, and shared cache as if they were the same thing.
Andrews Ribeiro
Founder & Engineer
3 min · Intermediate · Systems
The problem
“Cache” becomes an elastic word too quickly.
The team sees the same data being fetched more than once and concludes:
- “let’s add cache”
But it almost never stops to ask:
- cache where?
- for how long?
- visible to whom?
- invalidated by whom?
Without that, two very different things get mixed together:
- reusing data inside the same request
- reusing data across different requests
Mental model
Request cache usually means:
- local memoization
- deduplication inside one execution
- avoiding the same fetch two or three times in the same flow
Shared cache usually means:
- Redis
- memory shared across instances
- value reused by multiple requests and multiple processes
Both can be useful.
But they do not solve the same problem.
When request cache makes more sense
Request cache helps when:
- the same request touches the same data more than once
- multiple parts of the flow ask for the same entity
- you want to reduce repeated work without creating external state
Simple example:
a request loads the current user in middleware, then the use case asks for the same user, and then a permission check asks again.
That does not require Redis.
It only requires a local reuse layer that lives for the duration of the request.
The gain here is usually:
- simplicity
- zero distributed invalidation
- less internal repetition
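The flow above can be sketched with a request-scoped memoization map. This is a minimal illustration, not any specific framework's API; names like `RequestContext` and `findUserById` are made up for the example. The map is created per request, passed along the flow, and discarded when the request ends, so there is nothing to invalidate.

```typescript
type User = { id: string; name: string };

// Stand-in for the real data source; counts calls so the dedup is visible.
let dbCalls = 0;
async function findUserById(id: string): Promise<User> {
  dbCalls++;
  return { id, name: `user-${id}` };
}

class RequestContext {
  private memo = new Map<string, Promise<unknown>>();

  // Memoize by key for the lifetime of this request only.
  dedupe<T>(key: string, load: () => Promise<T>): Promise<T> {
    if (!this.memo.has(key)) {
      this.memo.set(key, load());
    }
    return this.memo.get(key) as Promise<T>;
  }
}

async function handleRequest(): Promise<void> {
  const ctx = new RequestContext();
  // Middleware, use case, and permission check all ask for the same user,
  // but only the first call reaches the data source.
  await ctx.dedupe("user:42", () => findUserById("42")); // middleware
  await ctx.dedupe("user:42", () => findUserById("42")); // use case
  await ctx.dedupe("user:42", () => findUserById("42")); // permission check
}
```

Because the map stores promises, concurrent callers inside the same request also share a single in-flight fetch instead of racing to the database.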
When shared cache enters the picture
Shared cache starts making sense when:
- multiple requests repeat one expensive read
- hitting the source is expensive
- the data tolerates controlled staleness
- several instances need to reuse the same value
Examples:
- configuration that is read constantly
- low-churn catalog
- expensive dashboard projection
- aggregated response that does not need perfect real-time accuracy
Here another type of question appears:
- how old can this data get?
- what triggers invalidation?
- what happens if the cache disappears?
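Those three questions map directly onto a TTL-based read-through sketch. To keep the example self-contained, a module-level Map stands in for Redis; in a real multi-instance deployment this layer would live in Redis (`SET key value EX ttl`), and `expensiveDashboardProjection` is an invented stand-in for the costly origin read.

```typescript
type Entry = { value: unknown; expiresAt: number };

// Stand-in for Redis: shared across requests within this process.
const shared = new Map<string, Entry>();

let originReads = 0;
async function expensiveDashboardProjection(): Promise<number> {
  originReads++;
  return 1234; // pretend this aggregates many rows
}

async function getWithTtl<T>(
  key: string,
  ttlMs: number,
  load: () => Promise<T>,
): Promise<T> {
  const hit = shared.get(key);
  if (hit && hit.expiresAt > Date.now()) {
    // "How old can this data get?" — at most ttlMs.
    return hit.value as T;
  }
  // "What happens if the cache disappears?" — the origin absorbs the read.
  const value = await load();
  shared.set(key, { value, expiresAt: Date.now() + ttlMs });
  return value;
}
```

The TTL answers the staleness question, expiry answers the invalidation question, and the miss path answers the disappearance question: every miss falls back to the origin, so the origin must survive a cold cache.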
The common mistake
The common mistake is using shared cache to solve a local problem.
Example:
- the same request makes three identical queries
- instead of fixing the flow or using request cache, the team inserts Redis in the middle
That creates:
- serialization
- network hops
- keys
- invalidation
- extra observability
to solve repetition that lived inside the same process.
The opposite mistake also exists.
Some systems try to reuse globally shared data through a local in-process variable, as if that were enough in an environment with several instances.
Simple example
Imagine an orders listing API.
Inside the same request:
- you need to translate customerId into a customer name in several places
Maybe request cache is already enough to avoid fetching the same customer five times.
Now imagine:
- the home page asks for the same operational indicators every few seconds
Then a shared cache with a short TTL may make sense.
Those are different problems.
Treating both as just “cache” without context only hides the real decision.
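The orders listing case can be sketched in a few lines: the same customerId appears in several rows, so a per-request lookup map resolves each distinct id exactly once. `fetchCustomerName` and the order shape are illustrative, not a real API.

```typescript
let lookups = 0;
// Stand-in for the customer service; counts calls to show the reuse.
async function fetchCustomerName(customerId: string): Promise<string> {
  lookups++;
  return `Customer ${customerId}`;
}

type Order = { id: string; customerId: string };

async function listOrdersWithNames(orders: Order[]) {
  // Request-scoped cache: lives only for this one listing call.
  const names = new Map<string, string>();
  const result = [];
  for (const order of orders) {
    if (!names.has(order.customerId)) {
      names.set(order.customerId, await fetchCustomerName(order.customerId));
    }
    result.push({ ...order, customerName: names.get(order.customerId)! });
  }
  return result;
}
```

Five rows from two customers cost two lookups, with no Redis, no keys, and no invalidation: the map dies with the request. The indicators case is different precisely because the reuse has to outlive any single request.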
How a senior thinks
Engineers who make better choices usually ask:
- does the reuse need to live only during this request?
- can the value be shared across requests?
- can the data get slightly old?
- who invalidates it and with what semantics?
- am I improving architecture or only hiding a bad query?
Those questions prevent automatic Redis-for-everything decisions.
Interview angle
This topic appears in performance, backend, and system design.
The interviewer usually wants to see whether you distinguish:
- local deduplication
- distributed cache
- consistency
- operational cost
A strong answer often sounds like this:
“If the problem is repetition inside the same request, I would start with local cache for that execution. I would leave shared cache for repeated reads across requests, when the origin is expensive and the invalidation semantics are clear.”
Direct takeaway
A good cache is not the one that shows up first.
It is the one that solves the right repetition in the right layer.
Quick summary
What to keep in your head
- Request cache solves repetition inside one execution. Shared cache solves reuse across executions.
- These two layers carry different cost, invalidation, and risk.
- Using global cache for a local problem usually adds complexity too early.
- The first useful question is not “should we cache?” but “who needs to reuse this data and for how long?”.
Practice checklist
Use this when you answer
- Do I know whether my problem is repetition inside one request or between different requests?
- Can I say who invalidates this cache and with what semantics?
- Am I using shared cache to hide a bad query or avoidable N+1?
- Can I explain the cache layer without calling everything generic optimization?