May 20 2025
Messaging and Queues
When to use a queue instead of a synchronous call, what the main patterns mean, and why that changes the architecture in production.
Andrews Ribeiro
Founder & Engineer
6 min · Intermediate · Systems
Track
System Design Interviews - From Basics to Advanced
Step 8 / 19
The problem
When a system gets slow under load, a lot of people try to solve it by adding more instances.
Sometimes that helps. A lot of the time it does not.
If the heavy work still sits on the synchronous request path, the user still waits, the timeout is still there, and the spike can still take down the main flow. Scaling the application without changing the shape of the work is like hiring more people for a line where each person still has to do everything in front of the customer.
A queue changes that shape.
Mental model
A queue is a buffer between the part that produces work and the part that executes work.
The producer says, “this needs to happen.” The consumer does it when it can. They do not need to be alive at the same time, and they do not need to move at the same speed.
That solves three common problems:
- heavy work on the request path
- short load spikes
- a slow or unstable dependency
Think about it like this:
- direct call = “do this now and answer me”
- queue = “write this down and process it at the right pace”
Not everything should go through a queue. But anything that can leave the critical path without breaking the user experience deserves to be considered.
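The buffer idea above can be sketched in a few lines. This is a minimal in-process illustration, not a real broker: the producer "writes it down" and moves on, while a consumer drains the queue at its own pace. The names (`jobs`, `producer`, `consumer`) are made up for this sketch.

```python
import queue
import threading
import time

# Buffer between the part that produces work and the part that executes it.
jobs = queue.Queue()

def producer():
    for user in ["ana", "bia", "caio"]:
        jobs.put({"task": "send_email", "to": user})  # "write this down"
        # the producer never waits for the email; it is free immediately

def consumer():
    while True:
        job = jobs.get()     # blocks until work exists
        time.sleep(0.01)     # stand-in for the slow, heavy work
        print("processed", job["to"])
        jobs.task_done()

threading.Thread(target=consumer, daemon=True).start()
producer()
jobs.join()  # wait for the backlog to drain
```

Notice that the producer finishes almost instantly even though each job is "slow". That is the whole point: producer and consumer no longer move at the same speed.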
Breaking it down
When a queue makes sense
A queue usually fits when:
- the user does not need the final result immediately
- the work is too expensive to stay inside the request
- you want to absorb spikes without taking everything down
- more than one consumer needs to react to the same event
Classic examples:
- sending email
- generating reports
- image processing
- search indexing
- notifications
If the user needs the final answer right now, a queue may not be the main solution. Or maybe you need to split the flow in two: fast confirmation now, full processing later.
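That "split the flow in two" idea looks roughly like this. The handler name, queue, and response shape are all hypothetical; the point is only the shape: acknowledge fast, enqueue the heavy part.

```python
import queue

# Illustrative split: fast confirmation now, full processing later.
report_jobs = queue.Queue()

def handle_report_request(user_id: int) -> dict:
    # synchronous part: validate and acknowledge, nothing heavy
    job_id = f"report-{user_id}"
    report_jobs.put({"job_id": job_id, "user_id": user_id})
    # the user gets an immediate answer; a worker builds the report later
    return {"status": "accepted", "job_id": job_id}

response = handle_report_request(42)
```

In an HTTP API this usually maps to returning 202 Accepted with an ID the client can poll.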
At-least-once vs exactly-once delivery
These names sound scarier than they need to.
At-least-once means the message will be delivered at least once. In practice, it may show up again.
Exactly-once means the system promises the message is processed exactly once from end to end. In theory that sounds beautiful. In practice, real end-to-end exactly-once is expensive, rare, and full of conditions.
That is why most real systems assume repetition is possible and make the consumer survive that repetition safely.
Idempotency in the consumer
If the same message may arrive twice, the consumer needs to be able to repeat the work without causing duplicate damage.
Examples:
- do not send the same confirmation email ten times
- do not charge the same payment twice
- do not create the same order again
In plain language, idempotency means this: process again and still end up in the same final state.
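One common way to get that property is to remember which message IDs were already handled. A minimal sketch, assuming each message carries a unique ID; in a real system the set below would be a database table with a unique constraint, checked in the same transaction as the side effect.

```python
# Idempotent consumer sketch: a redelivered message changes nothing.
processed_ids = set()
charges = []

def handle_payment(message: dict) -> None:
    if message["id"] in processed_ids:
        return  # duplicate delivery: same final state, no double charge
    charges.append(message["amount"])  # the side effect happens once
    processed_ids.add(message["id"])

msg = {"id": "pay-1", "amount": 100}
handle_payment(msg)
handle_payment(msg)  # the broker redelivers the same message
```

Processing the message twice leaves exactly one charge, which is the definition of "process again and still end up in the same final state".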
Dead letter queue
A dead letter queue, or DLQ, is where a message goes after failing too many times.
Without that, you risk:
- retrying forever
- clogging the main queue
- hiding a serious error inside a lot of noise
A message that lands in the DLQ is not noise to be ignored. It is a signal that this specific case needs inspection.
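The mechanics are simple to sketch: count attempts, and after a limit, park the message instead of retrying it. The names and the limit here are illustrative; managed brokers give you this behavior as configuration rather than code.

```python
import queue

# DLQ sketch: bounded retries, then park the message for inspection.
MAX_ATTEMPTS = 3
main_queue = queue.Queue()
dead_letters = []

def process(message: dict) -> None:
    raise RuntimeError("downstream unavailable")  # this one always fails

main_queue.put({"body": "resize-image-17", "attempts": 0})

while not main_queue.empty():
    message = main_queue.get()
    try:
        process(message)
    except RuntimeError:
        message["attempts"] += 1
        if message["attempts"] >= MAX_ATTEMPTS:
            dead_letters.append(message)  # signal: needs a human look
        else:
            main_queue.put(message)       # retry, but not forever
```

Without the `MAX_ATTEMPTS` branch, this loop would spin on the broken message forever, which is exactly the failure mode a DLQ exists to prevent.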
Kafka vs SQS vs RabbitMQ: when each one fits
You do not need to memorize a catalog. You just need to understand the shape.
- SQS is a strong fit for simple async work, with a managed queue and low operational friction.
- RabbitMQ often fits well when you need more routing control and more traditional messaging behavior.
- Kafka fits better when the problem looks like an event stream at high scale, with message retention and many consumers reading the same history.
One question helps:
does this look like a task that must be done, or an event that many consumers should observe?
If it looks like a task, a traditional work queue is often enough. If it looks like an event stream with shared history, Kafka may fit better.
Another good question is:
do I need to guarantee that someone executes this, or do I need many consumers to observe this?
That usually separates work queues from event streams better than vendor comparisons do.
Simple example
Imagine an orders system.
When payment is approved, several things may happen:
- confirm the order
- issue an invoice
- send email
- update inventory
- notify analytics
A naive implementation would do all of that inside the request.
A better implementation could:
- confirm the payment and save the order
- publish an order_paid event or message
- let separate consumers handle the rest
Now you gained:
- a faster request
- clearer separation of responsibility
- controlled reprocessing
- less coupling across flows
But you also gained responsibility:
- handling duplicates
- observing failure
- deciding order and guarantees
A queue does not remove complexity. It moves complexity to a place that is easier to manage.
And that is only worth it when you actually need that move. If the flow is simple, synchronous, and cheap, a queue may just trade clarity for unnecessary operations.
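The order_paid fan-out can be sketched with an in-process publish/subscribe stand-in. Everything here (`subscribe`, `publish`, the lambda consumers) is illustrative, not a real broker API; the shape is what matters: the request path publishes one event, and independent consumers react to it.

```python
# In-process stand-in for a broker, to show the fan-out shape.
subscribers = []

def subscribe(handler):
    subscribers.append(handler)

def publish(event: dict) -> None:
    for handler in subscribers:
        handler(event)

emails, invoices = [], []
subscribe(lambda e: emails.append(e["order_id"]))    # email consumer
subscribe(lambda e: invoices.append(e["order_id"]))  # invoice consumer

def handle_payment_approved(order_id: str) -> None:
    # fast path: persist the order, then hand everything else off
    publish({"type": "order_paid", "order_id": order_id})

handle_payment_approved("order-7")
```

Adding a new reaction (inventory, analytics) means adding a subscriber, not editing the payment flow, which is the "less coupling across flows" gain from the list above.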
Common mistakes
- Putting a queue in everything by reflex.
- Assuming a message arrives only once.
- Having no strategy for repeated failure.
- Treating an event like disguised RPC.
- Thinking a queue fixes business rules that were already vague.
It is also worth watching the opposite extreme: leaving everything synchronous just because it “feels simpler.” At some point, the system pays for that with timeouts, unstable spikes, and too much coupling.
How a senior thinks
People with more experience usually ask two questions early:
does the user need this answer right now?
and
if this message arrives again, what happens?
Those two questions clean up a big part of the conversation.
If the answer can come later, a queue becomes a strong candidate. If repetition creates duplicate damage, idempotency becomes mandatory.
There is another question that usually appears quickly in the mind of someone who has already been burned by this:
if the consumer falls behind, does the system degrade acceptably or silently accumulate a bigger problem?
That pulls in backlog, backpressure, and observability, which are part of the real cost of the choice.
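A rough sketch of that check, assuming you can sample queue depth over time (real brokers expose this as a metric, e.g. consumer lag or approximate queue size). The function name and threshold logic are made up for illustration.

```python
# If depth keeps growing, producers are outpacing consumers:
# the backlog is accumulating silently and needs action.
def backlog_trend(depth_samples: list[int]) -> str:
    if len(depth_samples) < 2:
        return "unknown"
    if depth_samples[-1] > depth_samples[0]:
        return "growing"  # act before the queue becomes the incident
    return "stable"

trend = backlog_trend([120, 340, 910])
```

Alerting on a growing trend, not just an absolute depth, catches the slow consumer before the queue itself becomes the outage.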
What the interviewer wants to see
In interviews, messaging is not just about drawing producer and consumer.
The interviewer wants to see whether you:
- know when to take work off the critical path
- understand repeated delivery as normal behavior
- think about reprocessing and dead letter queues
- choose the tool for the problem shape, not for the famous name
A good queue is not the one that makes the diagram look more modern. It is the one that protects the main flow without hiding the cost of the complexity.
Quick summary
What to keep in your head
- A queue decouples producer and consumer and helps the system absorb load spikes.
- At-least-once means the same message may arrive again, so the consumer needs to be safe under repetition.
- A dead letter queue holds messages that failed too many times and avoids infinite loops.
- Kafka, SQS, and RabbitMQ can look similar from far away, but they fit different shapes of problems.
Practice checklist
Use this when you answer
- Can I explain when to use a queue instead of a direct synchronous call?
- Do I know what changes once the same message may be delivered more than once?
- Can I sketch producer, queue, consumer, and failure handling clearly?
- Can I tell when a flow looks like an event and when it looks like pending work?