May 20 2025
Messaging and Queues
When to use a queue instead of a synchronous call, what the main patterns mean, and why that changes the architecture in production.
Andrews Ribeiro
Founder & Engineer
6 min · Intermediate · Systems
Track
System Design Interviews - From Basics to Advanced
Step 8 / 19
The problem
When a system gets slow under load, a lot of people try to solve it by adding more instances.
Sometimes that helps. A lot of the time it does not.
If the heavy work still sits on the synchronous request path, the user still waits, the timeout is still there, and the spike can still take down the main flow. Scaling the application without changing the shape of the work is like hiring more people for a line where each person still has to do everything in front of the customer.
A queue changes that shape.
Mental model
A queue is a buffer between the part that produces work and the part that executes work.
The producer says, “this needs to happen.” The consumer does it when it can. They do not need to be alive at the same time, and they do not need to move at the same speed.
That solves three common problems:
- heavy work on the request path
- short load spikes
- a slow or unstable dependency
Think about it like this:
- direct call = “do this now and answer me”
- queue = “write this down and process it at the right pace”
Not everything should go through a queue. But anything that can leave the critical path without breaking the user experience deserves to be considered.
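The buffer idea above can be sketched in a few lines. This is a minimal in-process illustration, not a real broker: the producer "writes it down" and moves on, while a consumer drains the queue at its own pace. The names (`jobs`, `producer`, `consumer`) are made up for this sketch.

```python
import queue
import threading
import time

# Buffer between the part that produces work and the part that executes it.
jobs = queue.Queue()

def producer():
    for user in ["ana", "bia", "caio"]:
        jobs.put({"task": "send_email", "to": user})  # "write this down"
        # the producer never waits for the email; it is free immediately

def consumer():
    while True:
        job = jobs.get()     # blocks until work exists
        time.sleep(0.01)     # stand-in for the slow, heavy work
        print("processed", job["to"])
        jobs.task_done()

threading.Thread(target=consumer, daemon=True).start()
producer()
jobs.join()  # wait for the backlog to drain
```

Notice that the producer finishes almost instantly even though each job is "slow". That is the whole point: producer and consumer no longer move at the same speed.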
Breaking it down
When a queue makes sense
A queue usually fits when:
- the user does not need the final result immediately
- the work is too expensive to stay inside the request
- you want to absorb spikes without taking everything down
- more than one consumer needs to react to the same event
Classic examples:
- sending email
- generating reports
- image processing
- search indexing
- notifications
If the user needs the final answer right now, a queue may not be the main solution. Or maybe you need to split the flow in two: fast confirmation now, full processing later.
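That "split the flow in two" idea looks roughly like this. The handler name, queue, and response shape are all hypothetical; the point is only the shape: acknowledge fast, enqueue the heavy part.

```python
import queue

# Illustrative split: fast confirmation now, full processing later.
report_jobs = queue.Queue()

def handle_report_request(user_id: int) -> dict:
    # synchronous part: validate and acknowledge, nothing heavy
    job_id = f"report-{user_id}"
    report_jobs.put({"job_id": job_id, "user_id": user_id})
    # the user gets an immediate answer; a worker builds the report later
    return {"status": "accepted", "job_id": job_id}

response = handle_report_request(42)
```

In an HTTP API this usually maps to returning 202 Accepted with an ID the client can poll.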
At-least-once vs exactly-once delivery
These names sound scarier than they need to.
At-least-once means the message will be delivered at least once. In practice, it may show up again.
Exactly-once means the system promises the message is processed exactly once from end to end. In theory that sounds beautiful. In practice, real end-to-end exactly-once is expensive, rare, and full of conditions.
That is why most real systems assume repetition is possible and make the consumer survive that repetition safely.
Idempotency in the consumer
If the same message may arrive twice, the consumer needs to be able to repeat the work without causing duplicate damage.
Examples:
- do not send the same confirmation email ten times
- do not charge the same payment twice
- do not create the same order again
In plain language, idempotency means this: process again and still end up in the same final state.
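One common way to get that property is to remember which message IDs were already handled. A minimal sketch, assuming each message carries a unique ID; in a real system the set below would be a database table with a unique constraint, checked in the same transaction as the side effect.

```python
# Idempotent consumer sketch: a redelivered message changes nothing.
processed_ids = set()
charges = []

def handle_payment(message: dict) -> None:
    if message["id"] in processed_ids:
        return  # duplicate delivery: same final state, no double charge
    charges.append(message["amount"])  # the side effect happens once
    processed_ids.add(message["id"])

msg = {"id": "pay-1", "amount": 100}
handle_payment(msg)
handle_payment(msg)  # the broker redelivers the same message
```

Processing the message twice leaves exactly one charge, which is the definition of "process again and still end up in the same final state".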
Dead letter queue
A dead letter queue, or DLQ, is where a message goes after failing too many times.
Without that, you risk:
- retrying forever
- clogging the main queue
- hiding a serious error inside a lot of noise
A message that lands in the DLQ is not noise to be ignored. It is a signal that this specific case needs inspection.
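The mechanics are simple to sketch: count attempts, and after a limit, park the message instead of retrying it. The names and the limit here are illustrative; managed brokers give you this behavior as configuration rather than code.

```python
import queue

# DLQ sketch: bounded retries, then park the message for inspection.
MAX_ATTEMPTS = 3
main_queue = queue.Queue()
dead_letters = []

def process(message: dict) -> None:
    raise RuntimeError("downstream unavailable")  # this one always fails

main_queue.put({"body": "resize-image-17", "attempts": 0})

while not main_queue.empty():
    message = main_queue.get()
    try:
        process(message)
    except RuntimeError:
        message["attempts"] += 1
        if message["attempts"] >= MAX_ATTEMPTS:
            dead_letters.append(message)  # signal: needs a human look
        else:
            main_queue.put(message)       # retry, but not forever
```

Without the `MAX_ATTEMPTS` branch, this loop would spin on the broken message forever, which is exactly the failure mode a DLQ exists to prevent.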
Kafka vs SQS vs RabbitMQ: when each one fits
You do not need to memorize a catalog. You just need to understand the shape.
- SQS is a strong fit for simple async work, with a managed queue and low operational friction.
- RabbitMQ often fits well when you need more routing control and more traditional messaging behavior.
- Kafka fits better when the problem looks like an event stream at high scale, with message retention and many consumers reading the same history.
One question helps:
does this look like a task that must be done, or an event that many consumers should observe?
If it looks like a task, a traditional work queue is often enough. If it looks like an event stream with shared history, Kafka may fit better.
Another good question is:
do I need to guarantee that someone executes this, or do I need many consumers to observe this?
That usually separates work queues from event streams better than vendor comparisons do.
Simple example
Imagine an orders system.
When payment is approved, several things may happen:
- confirm the order
- issue an invoice
- send email
- update inventory
- notify analytics
A naive implementation would do all of that inside the request.
A better implementation could:
- confirm the payment and save the order
- publish an order_paid event or message
- let separate consumers handle the rest
Now you gained:
- a faster request
- clearer separation of responsibility
- controlled reprocessing
- less coupling across flows
But you also gained responsibility:
- handling duplicates
- observing failure
- deciding order and guarantees
A queue does not remove complexity. It moves complexity to a place that is easier to manage.
And that is only worth it when you actually need that move. If the flow is simple, synchronous, and cheap, a queue may just trade clarity for unnecessary operations.
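The order_paid fan-out can be sketched with an in-process publish/subscribe stand-in. Everything here (`subscribe`, `publish`, the lambda consumers) is illustrative, not a real broker API; the shape is what matters: the request path publishes one event, and independent consumers react to it.

```python
# In-process stand-in for a broker, to show the fan-out shape.
subscribers = []

def subscribe(handler):
    subscribers.append(handler)

def publish(event: dict) -> None:
    for handler in subscribers:
        handler(event)

emails, invoices = [], []
subscribe(lambda e: emails.append(e["order_id"]))    # email consumer
subscribe(lambda e: invoices.append(e["order_id"]))  # invoice consumer

def handle_payment_approved(order_id: str) -> None:
    # fast path: persist the order, then hand everything else off
    publish({"type": "order_paid", "order_id": order_id})

handle_payment_approved("order-7")
```

Adding a new reaction (inventory, analytics) means adding a subscriber, not editing the payment flow, which is the "less coupling across flows" gain from the list above.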
Common mistakes
- Putting a queue in everything by reflex.
- Assuming a message arrives only once.
- Having no strategy for repeated failure.
- Treating an event like disguised RPC.
- Thinking a queue fixes business rules that were already vague.
It is also worth watching the opposite extreme: leaving everything synchronous just because it “feels simpler.” At some point, the system pays for that with timeouts, unstable spikes, and too much coupling.
How a senior thinks
People with more experience usually ask two questions early:
does the user need this answer right now?
and
if this message arrives again, what happens?
Those two questions clean up a big part of the conversation.
If the answer can come later, a queue becomes a strong candidate. If repetition creates duplicate damage, idempotency becomes mandatory.
There is another question that usually appears quickly in the mind of someone who has already been burned by this:
if the consumer falls behind, does the system degrade acceptably or silently accumulate a bigger problem?
That pulls in backlog, backpressure, and observability, which are part of the real cost of the choice.
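A rough sketch of that check, assuming you can sample queue depth over time (real brokers expose this as a metric, e.g. consumer lag or approximate queue size). The function name and threshold logic are made up for illustration.

```python
# If depth keeps growing, producers are outpacing consumers:
# the backlog is accumulating silently and needs action.
def backlog_trend(depth_samples: list[int]) -> str:
    if len(depth_samples) < 2:
        return "unknown"
    if depth_samples[-1] > depth_samples[0]:
        return "growing"  # act before the queue becomes the incident
    return "stable"

trend = backlog_trend([120, 340, 910])
```

Alerting on a growing trend, not just an absolute depth, catches the slow consumer before the queue itself becomes the outage.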
What the interviewer wants to see
In interviews, messaging is not just about drawing producer and consumer.
The interviewer wants to see whether you:
- know when to take work off the critical path
- understand repeated delivery as normal behavior
- think about reprocessing and dead letter queues
- choose the tool for the problem shape, not for the famous name
A good queue is not the one that makes the diagram look more modern. It is the one that protects the main flow without hiding the cost of the complexity.
Quick summary
What to keep in your head
- A queue decouples producer and consumer and helps the system absorb load spikes.
- At-least-once means the same message may arrive again, so the consumer needs to be safe under repetition.
- A dead letter queue holds messages that failed too many times and avoids infinite loops.
- Kafka, SQS, and RabbitMQ can look similar from far away, but they fit different shapes of problems.
Practice checklist
Use this when you answer
- Can I explain when to use a queue instead of a direct synchronous call?
- Do I know what changes once the same message may be delivered more than once?
- Can I sketch producer, queue, consumer, and failure handling clearly?
- Can I tell when a flow looks like an event and when it looks like pending work?