September 6 2025
Workload Affinity Without Turning Scaling Into a Lottery
Not every workload should land on any worker at any time. Without some degree of affinity, the system wastes locality, heats up hotspots, and scales by luck.
Andrews Ribeiro
Founder & Engineer
3 min · Intermediate · Systems
The problem
Not all work is the same.
Some workloads benefit a lot when they stay close to:
- one key
- one partition
- one warm cache
- one already-established connection
- one already-loaded context
If the system distributes everything randomly all the time, every scale event becomes a bet:
- where will this work land?
- will it reuse anything or reheat everything?
- will it create a hotspot nobody predicted?
Scaling turns into a lottery.
Mental model
Workload affinity means keeping some stable proximity between the work and one useful resource.
That affinity can be by:
- tenant
- shard
- entity
- time partition
- work class
The goal is not to “pin it forever.”
The goal is to capture locality when that locality reduces enough cost to justify the extra control.
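The simplest form of that stable proximity is a deterministic mapping from affinity key to worker. A minimal sketch, assuming hypothetical worker names and an illustrative `affinity_worker` helper (none of these come from a real scheduler):

```python
import hashlib

def affinity_worker(affinity_key: str, workers: list[str]) -> str:
    """Deterministically map an affinity key (tenant, shard, entity,
    time partition, work class) to one worker, so repeated work for
    the same key keeps landing where its context already lives."""
    digest = hashlib.sha256(affinity_key.encode()).digest()
    return workers[int.from_bytes(digest[:8], "big") % len(workers)]

workers = ["worker-a", "worker-b", "worker-c"]

# The same key always routes to the same worker, so its warm cache
# and open connections get reused. Nothing is "pinned forever":
# this is a stable preference a scheduler can still override.
assert affinity_worker("tenant-42", workers) == affinity_worker("tenant-42", workers)
```

One caveat worth noticing: plain modulo remaps most keys whenever the worker list changes size, which is exactly why the techniques later in this article reach for consistent hashing instead.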
Simple example
Imagine recomputation jobs by tenant.
Each job needs to read:
- tenant configuration
- already-warmed local cache
- related connections and partitions
If each execution lands on a random worker:
- it warms context from scratch every time
- it competes for cache space with other tenants
- its behavior becomes more variable
If there is affinity by tenant or partition, the system reuses more context and becomes more predictable.
But if that tenant grows too much, the system also needs to rebalance.
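The warm-up cost above can be simulated with a toy model. In this sketch (tenant counts, worker names, and the warm-up rule are all made up for illustration), a worker pays a context warm-up the first time it sees a tenant; we compare random placement against tenant affinity:

```python
import random
from collections import defaultdict

def run_jobs(assign, jobs, rng):
    """Count context warm-ups: a worker warms a tenant's context
    only on the first job it receives for that tenant."""
    warmed = defaultdict(set)   # worker -> tenants with warm context
    warmups = 0
    for tenant in jobs:
        worker = assign(tenant, rng)
        if tenant not in warmed[worker]:
            warmed[worker].add(tenant)
            warmups += 1
    return warmups

workers = ["w0", "w1", "w2", "w3"]
jobs = [f"tenant-{i % 5}" for i in range(200)]   # 5 tenants, 200 jobs
rng = random.Random(0)

# Random placement: each tenant eventually warms context on many workers.
random_warmups = run_jobs(lambda t, r: r.choice(workers), jobs, rng)

# Tenant affinity: each tenant warms context exactly once, on one worker.
# (hash() is salted per process but stable within a run.)
pinned_warmups = run_jobs(lambda t, r: workers[hash(t) % len(workers)], jobs, rng)
```

With affinity there are exactly 5 warm-ups, one per tenant; random placement trends toward one warm-up per (tenant, worker) pair, i.e. up to 20, and its timing is far less predictable.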
The common mistake
The common mistake is going to one of the extremes:
- everything random
- everything rigidly pinned
In the first case, you lose useful locality.
In the second case, you create:
- a fixed hotspot
- difficulty rebalancing
- a loud failure when one node dies
Another common mistake is adopting affinity without knowing what is being preserved.
If there is no clear gain in cache, context, or partition locality, maybe you are only making scaling harder.
What usually helps
It usually helps to answer four questions:
- what is the natural affinity key?
- what real gain does it bring?
- when is the system allowed to break that affinity?
- how does redistribution happen without chaos?
In practice, that often appears as:
- partitioning by key
- consistent hashing
- preferred workers with fallback
- controlled rebalance
- skew observability
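Consistent hashing and "preferred workers with fallback" combine naturally: walk the ring from the key's position, and the first worker you meet is the affinity target while the rest form the fallback order. A minimal sketch, assuming an illustrative `Ring` class and an arbitrary virtual-node count:

```python
import bisect
import hashlib

class Ring:
    """Consistent-hash ring with virtual nodes. The preference order
    for a key gives a preferred worker plus fallbacks for when the
    preferred one is hot or down."""

    def __init__(self, workers, vnodes=64):
        # Each worker owns many points on the ring so load spreads evenly.
        self._points = sorted(
            (self._hash(f"{w}#{i}"), w)
            for w in workers
            for i in range(vnodes)
        )

    @staticmethod
    def _hash(s: str) -> int:
        return int.from_bytes(hashlib.sha256(s.encode()).digest()[:8], "big")

    def preference(self, key: str) -> list[str]:
        """Distinct workers in ring order starting at the key's position:
        index 0 is the preferred worker, the rest are fallbacks."""
        start = bisect.bisect(self._points, (self._hash(key),))
        seen, order = set(), []
        for _, w in self._points[start:] + self._points[:start]:
            if w not in seen:
                seen.add(w)
                order.append(w)
        return order

ring = Ring(["w0", "w1", "w2"])
preferred, *fallbacks = ring.preference("tenant-42")
```

The design point: adding or removing one worker only remaps the keys whose ring segment that worker owned, so a controlled rebalance moves a slice of the keyspace instead of reshuffling everything.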
Good affinity improves predictability.
Bad affinity only ties the system to an arbitrary distribution.
How a senior thinks
Engineers who have already suffered through unpredictable scaling often ask:
- which context is actually worth keeping together?
- how much skew do I accept before I rebalance?
- if one worker disappears, does the work fit somewhere else cleanly?
- am I using affinity to gain locality or to hide some other inefficiency?
That conversation avoids a backend that scales in replica count but not in behavior.
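The skew question ("how much skew do I accept before I rebalance?") can be made concrete with a simple metric and threshold. A sketch, where the 2.0 threshold is an arbitrary placeholder you would tune against real load data:

```python
def skew(load_by_worker: dict[str, float]) -> float:
    """Max worker load divided by mean load; 1.0 means perfectly even."""
    loads = list(load_by_worker.values())
    mean = sum(loads) / len(loads)
    return max(loads) / mean if mean else 1.0

def should_rebalance(load_by_worker: dict[str, float], threshold: float = 2.0) -> bool:
    """Break affinity for the hottest keys once skew passes the threshold."""
    return skew(load_by_worker) > threshold

# A hot tenant pushes one worker far above the mean:
# skew = 900 / (1100 / 3) ≈ 2.45, so this triggers a rebalance.
hot = {"w0": 900, "w1": 100, "w2": 100}
even = {"w0": 100, "w1": 100, "w2": 100}
```

Observing this number over time is what turns "affinity" from a frozen assignment into a policy: keep locality while it is cheap, and deliberately give some of it up when one worker gets hot.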
Interview angle
This topic appears in queues, jobs, multi-tenant systems, caching, and partitioning.
The interviewer wants to see whether you understand:
- that not every distribution needs to be fully random
- that locality is also an architectural decision
- that useful affinity must coexist with rebalance and fault tolerance
A strong answer often sounds like this:
“I would use affinity when it preserves context or partition locality in a measurable way, but I would make clear how the system rebalances and breaks affinity when one node gets hot or disappears. Otherwise we are just trading randomness for brittle rigidity.”
Direct takeaway
Scaling should not depend on luck.
If locality matters, affinity needs to be designed.
Quick summary
What to keep in your head
- Distributing everything completely at random may be simple, but sometimes it wastes locality and increases variability.
- Affinity makes sense when keeping context, cache, or partition nearby reduces real cost.
- Affinity that is too rigid creates hotspots and fragility. Good affinity still accepts rebalance.
- Healthy scaling should not depend on luck for work to land in the right place.
Practice checklist
Use this when you answer
- Can I say what real gain exists in keeping this workload near one key, partition, or worker?
- Can I rebalance when one node gets hot or disappears?
- Is my system using affinity for useful locality or only to hide a structural problem?
- If I shut one worker down right now, does the workload keep moving without unpredictable behavior?