April 3 2025
Async Bugs and Race Conditions
How to make timing failures easier to understand by making ordering, concurrency, and shared state visible.
Andrews Ribeiro
Founder & Engineer
3 min Intermediate Systems
The problem
Race conditions feel scary because they rarely fail the exact same way twice.
It works locally, fails in production, disappears when you add a console.log, and only shows up when two responses land in one specific order.
That makes a lot of people treat async bugs like bad luck, when the real problem is usually much simpler: the system has no solid rule for which result is still allowed to update shared state.
Mental model
When you are hunting an async bug, looking only at “what the code does” is not enough.
You also need to look at:
- the timeline of events
- which operation finished before the other
- whether the state was still valid when the result arrived
Once the investigation shifts from “reading lines of code” to “drawing the sequence of events,” the bug usually stops feeling like a ghost.
It also helps to replace a bad sentence with a better one:
- bad: “the app went weird”
- better: “two operations finished in an order the UI was not prepared to handle”
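One way to draw the sequence of events is to record it while the code runs. The sketch below is a minimal in-memory timeline. The names createTimeline, record, finishOrder, and render are made up for illustration, and the four recorded events are a hand-written example, not output from a real app.

```typescript
// Hypothetical helper: an append-only log of when each operation starts and finishes.
type TimelineEntry = { seq: number; operation: string; phase: "start" | "finish" };

function createTimeline() {
  const entries: TimelineEntry[] = [];
  return {
    // Append one event; seq is its position in the overall order.
    record(operation: string, phase: "start" | "finish"): void {
      entries.push({ seq: entries.length + 1, operation, phase });
    },
    // Operation names in the order they finished.
    finishOrder(): string[] {
      return entries.filter((e) => e.phase === "finish").map((e) => e.operation);
    },
    // One line per event, ready to paste into a bug report.
    render(): string {
      return entries.map((e) => `${e.seq}. ${e.operation} ${e.phase}`).join("\n");
    },
  };
}

// Hand-written example: A starts first but finishes last.
const timeline = createTimeline();
timeline.record("A", "start");
timeline.record("B", "start");
timeline.record("B", "finish");
timeline.record("A", "finish");

console.log(timeline.render());
console.log(timeline.finishOrder()); // ["B", "A"]
```

Once the finish order is written down, "the app went weird" becomes "A finished after B", which is something you can reason about.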
Breaking it down
A practical way to investigate this kind of bug looks like this:
- list the concurrent events involved
- draw the order in which they can finish
- find the point where two operations compete over the same state
- identify the missing guarantee: cancellation, locking, request versioning, or final validation
That turns a “random bug” into a predictable collision.
This matters because concurrency does not mean total chaos. It means there are multiple valid timelines, and your code has to stay correct in every one of them.
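The steps above can be sketched in a few lines. This is a toy model, not a debugging tool: the Operation shape, the finishOrders and lastWriterWins helpers, and the invariant ("the final state must come from the most recently started operation") are all assumptions chosen for illustration.

```typescript
// Hypothetical model: each operation records when it started and what it wants to write.
type Operation = { label: string; startedAt: number; value: string };

// Every order in which the operations could finish.
function finishOrders<T>(items: T[]): T[][] {
  if (items.length <= 1) return [items];
  const orders: T[][] = [];
  items.forEach((first, i) => {
    const rest = [...items.slice(0, i), ...items.slice(i + 1)];
    for (const tail of finishOrders(rest)) orders.push([first, ...tail]);
  });
  return orders;
}

// Naive shared state: whichever operation finishes last wins.
function lastWriterWins(order: Operation[]): string {
  let state = "";
  for (const op of order) state = op.value;
  return state;
}

const ops: Operation[] = [
  { label: "A", startedAt: 1, value: "from A" },
  { label: "B", startedAt: 2, value: "from B" },
];

// Assumed invariant: the final state must come from the most recently started operation.
const newest = ops.reduce((x, y) => (y.startedAt > x.startedAt ? y : x));

// Timelines that leave the shared state invalid.
const broken = finishOrders(ops)
  .filter((order) => lastWriterWins(order) !== newest.value)
  .map((order) => order.map((op) => op.label).join(" -> "));

console.log(broken); // ["B -> A"]
```

Of the two possible timelines, only "B -> A" breaks the invariant. That is the collision, and it points directly at the missing guarantee: something has to stop A from writing after B.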
Simple example
Imagine an autocomplete input:
- the user types “re”
- request A is sent
- the user keeps typing and reaches “react”
- request B is sent
- request B returns first and shows the correct results
- request A returns later and overwrites the UI with stale data
The problem is not fetch.
The problem is that the frontend accepted an old response as if it were still the current truth.
Good fixes here are straightforward:
- cancel the earlier request with AbortController
- ignore responses with an outdated request ID
- only update the UI if the response still matches the current input
None of these fixes exist to make the request “faster.” They exist to stop old state from winning after the world has already changed.
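Here is a minimal sketch of those three fixes, replayed against the timeline from the example. The names Ticket, createAutocompleteGuard, begin, complete, and render are hypothetical, and the result lists are invented sample data. The sketch stacks all three checks so each one is visible; in a real component one of them is often enough.

```typescript
// Hypothetical names throughout: Ticket, createAutocompleteGuard, render.
type Ticket = { id: number; query: string; signal: AbortSignal };

function createAutocompleteGuard(render: (results: string[]) => void) {
  let latestId = 0;
  let inFlight: AbortController | null = null;

  return {
    // Call on every input change, before sending the request.
    begin(query: string): Ticket {
      if (inFlight) inFlight.abort(); // fix 1: cancel the earlier request
      inFlight = new AbortController();
      latestId += 1;
      return { id: latestId, query, signal: inFlight.signal };
    },

    // Call when a response arrives. Returns true only if the UI was updated.
    complete(ticket: Ticket, results: string[], currentInput: string): boolean {
      if (ticket.id !== latestId) return false; // fix 2: outdated request ID
      if (ticket.query !== currentInput) return false; // fix 3: the input has moved on
      render(results);
      return true;
    },
  };
}

// Replay the example: A is sent, B is sent, B returns first, A returns later.
let shown: string[] = [];
const guard = createAutocompleteGuard((results) => {
  shown = results;
});

const a = guard.begin("re");
const b = guard.begin("react");
const appliedB = guard.complete(b, ["react", "react-dom"], "react");
const appliedA = guard.complete(a, ["redis", "regex", "react"], "react");

console.log({ appliedA, appliedB, shown, aCancelled: a.signal.aborted });
```

In a real component, ticket.signal would be passed to fetch as the signal option, so that the cancelled request rejects with an AbortError instead of delivering a response at all. The ID and input checks still matter for anything that slips through before the abort takes effect.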
Common mistakes
- trying to reproduce the bug by random clicking without mapping the timeline first
- putting a setTimeout on top of the problem and hoping it disappears
- assuming “async” means random and impossible to fix
- forgetting that two perfectly valid responses can still break the UI if they arrive in the wrong order
How a senior thinks
More experienced engineers do not call an async bug flaky by reflex.
They draw the timeline and ask:
What sequence of events puts this system into an invalid state?
That question pulls the discussion out of superstition and back into causality.
Another useful question usually follows:
What guarantee is missing that should stop old state from becoming valid again?
Sometimes the answer is cancellation. Sometimes it is idempotency. Sometimes it is a lock. Sometimes it is just checking whether the state is still current before applying the result.
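Idempotency is the least obvious item on that list, so here is a rough sketch of it. The scenario is an assumption: a client retries an "add item" action after a timeout, and both copies eventually arrive. The names AddItem, operationId, and createCart are hypothetical.

```typescript
// Hypothetical operation shape: the client attaches a unique id to each logical action.
type AddItem = { operationId: string; quantity: number };

function createCart() {
  const applied = new Set<string>();
  let total = 0;
  return {
    // Applies the operation at most once, no matter how many times it is delivered.
    apply(op: AddItem): boolean {
      if (applied.has(op.operationId)) return false; // duplicate delivery: no effect
      applied.add(op.operationId);
      total += op.quantity;
      return true;
    },
    total: (): number => total,
  };
}

const cart = createCart();
const firstTry: AddItem = { operationId: "op-1", quantity: 2 };
const retry: AddItem = { ...firstTry }; // same logical action, sent again after a timeout

const outcomes = [cart.apply(firstTry), cart.apply(retry)];
console.log(outcomes, cart.total()); // [true, false] 2
```

The late duplicate still arrives, but it can no longer change anything. That is the same idea as ignoring a stale response, applied to writes instead of reads.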
What the interviewer wants to see
In frontend or systems interviews, concurrency reveals depth very quickly.
- You understand that concurrency makes execution order less predictable.
- You look for collision points over shared mutable state.
- You talk about architectural guarantees, not just adding more await.
A strong answer often sounds like this:
I would draw the timeline and figure out which response or operation arrived too late but still managed to write into shared state. From there I would choose the right guarantee: cancellation, locking, versioning, request IDs, or a final validation check.
A race condition is not bad luck. It is a collision the architecture still does not know how to survive.
Quick summary
What to keep in your head
- Async bugs usually get clearer when you draw the event order instead of staring at code in isolation.
- A race condition happens when two valid operations compete over the same state without enough protection.
- Cancellation, request IDs, final validation, and locks solve different versions of the same timing problem.
- The faster you can name the collision, the less time you waste calling the bug random.
Practice checklist
Use this when you answer
- Can I draw the timeline of two concurrent requests or events?
- Can I tell whether the fix needs cancellation, a lock, a request ID, or a final validation check?
- Can I explain why an old response should not overwrite newer state?
- Can I talk about async bugs in interviews as a causality problem, not a luck problem?