May 8 2025
SLO, SLA, and SLI: What They Are and How to Answer About Them in Interviews
How to distinguish these three concepts without buzzwords and explain the role of each one clearly in a real context.
Andrews Ribeiro
Founder & Engineer
6 min Intermediate Systems
The problem
These three acronyms show up a lot in platform, backend, reliability, and more mature product-team interviews.
And many people answer badly for one simple reason:
they do not separate the roles clearly enough.
So the answer becomes something like:
- “SLA is availability”
- “SLO is the goal”
- “SLI is the metric”
Technically that points in the right direction.
But it is still too shallow.
Because it does not show:
- why these things exist
- how they relate to each other
- and how they influence engineering decisions
The real problem is this:
many answers stop at the short definition and never reach operational judgment.
Mental model
Think about it like this:
SLI measures, SLO guides, and SLA commits.
That sentence organizes almost everything.
In simple terms:
- SLI is the observed indicator
- SLO is the internal target the team is trying to hit
- SLA is the formal commitment, usually with commercial or contractual consequences
Once you understand that order, you stop treating the three as synonyms with different clothes.
Breaking it down
SLI is what you observe
SLI means Service Level Indicator.
In practice, think of it as a measurement of relevant system behavior.
Examples:
- percentage of successful requests
- p95 latency of a critical endpoint
- rate of messages processed within a certain time
- successful logins without error
The main point is:
an SLI is not just “any metric.”
It is a metric that represents something important about the experience or the service.
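To make the first two examples concrete, here is a minimal sketch of how a success-rate SLI and a p95-latency SLI could be computed from raw request records. The `Request` type, the sample data, and the choice of the nearest-rank percentile method are assumptions made for illustration; real monitoring systems usually compute these from aggregated counters or histograms.

```python
from dataclasses import dataclass
import math


@dataclass(frozen=True)
class Request:
    ok: bool           # True if the request succeeded from the user's point of view
    latency_ms: float  # end-to-end latency in milliseconds


def success_rate(requests: list[Request]) -> float:
    """Fraction of requests that succeeded (0.0 to 1.0)."""
    if not requests:
        raise ValueError("no requests in the window")
    return sum(r.ok for r in requests) / len(requests)


def p95_latency_ms(requests: list[Request]) -> float:
    """95th percentile latency using the nearest-rank method."""
    if not requests:
        raise ValueError("no requests in the window")
    ordered = sorted(r.latency_ms for r in requests)
    rank = math.ceil(len(ordered) * 95 / 100)  # 1-based nearest rank
    return ordered[rank - 1]


# Hypothetical sample: 20 requests, one failure, one slow outlier.
sample = [Request(ok=True, latency_ms=100.0 + i) for i in range(18)]
sample.append(Request(ok=False, latency_ms=150.0))
sample.append(Request(ok=True, latency_ms=900.0))

print(success_rate(sample))    # 0.95
print(p95_latency_ms(sample))  # 150.0
```

Note that the definition of `ok` is where most of the judgment lives: deciding what counts as success for the user is what turns a raw metric into an SLI.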
SLO is the internal target that guides decisions
SLO means Service Level Objective.
It is the target the team chooses to pursue on top of an SLI.
Simple example:
- SLI: percentage of successful checkout requests
- SLO: 99.9% monthly checkout success
Here is the important part:
an SLO is not just a pretty number.
It exists to influence decisions.
Things like:
- is this release worth shipping now?
- has this incident already consumed too much of our margin?
- are we trading away too much reliability for speed?
Without that layer, the SLO becomes only a dashboard.
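One common way to turn an SLO into a decision tool is an error budget: the number of failures the target still tolerates in the window. The sketch below assumes the 99.9% monthly checkout SLO from the example; the request counts and the 75% freeze threshold in `can_ship` are hypothetical values, not a standard policy. `Fraction` is used only to keep the arithmetic exact.

```python
from fractions import Fraction


def error_budget(slo: Fraction, total_requests: int, failed_requests: int) -> tuple[Fraction, Fraction]:
    """Return (allowed_failures, share_of_budget_consumed) for one window."""
    if not 0 < slo < 1:
        raise ValueError("SLO must be strictly between 0 and 1")
    if total_requests <= 0:
        raise ValueError("window must contain at least one request")
    allowed_failures = (1 - slo) * total_requests
    consumed = Fraction(failed_requests) / allowed_failures
    return allowed_failures, consumed


def can_ship(consumed: Fraction, freeze_threshold: Fraction = Fraction(3, 4)) -> bool:
    """Hypothetical release policy: pause risky releases once 75% of the budget is gone."""
    return consumed < freeze_threshold


SLO = Fraction("0.999")  # 99.9% monthly checkout success
allowed, consumed = error_budget(SLO, total_requests=1_000_000, failed_requests=400)

print(int(allowed))        # 1000 failed requests allowed this month
print(float(consumed))     # 0.4 of the budget already spent
print(can_ship(consumed))  # True under the assumed 75% threshold
```

With this in place, "has this incident already consumed too much of our margin?" becomes a number the team can discuss instead of a feeling.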
SLA is an external commitment
SLA means Service Level Agreement.
Usually this is the more formal part:
- a contract
- a commercial promise
- a customer commitment
- a consequence if the promise is not met
Examples:
- financial credits
- contractual penalties
- response obligations
That is why an SLA is usually more connected to the external relationship than to the team’s daily operation.
Mature teams do pay attention to the SLA, but they should not operate only by it.
Because waiting until the contractual limit is near is already too late.
SLO is usually more useful to engineering than SLA
This is a very good interview point.
SLA matters.
But in day-to-day work, engineering usually operates much more around the SLO.
Why?
Because the SLO gives room to maneuver before the problem becomes a commercial crisis.
It lets the team see:
- degradation before a serious break
- reliability budget being consumed over time
- the need to reduce risk before an external commitment explodes
In other words:
the SLO helps the team steer.
The SLA only warns when the team is already close to the wall.
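The room to maneuver can be sketched as the gap between two thresholds. The example below assumes an internal SLO of 99.9% and an external SLA of 99.5%; both numbers and the `Status` names are hypothetical, chosen only to show that a missed SLO is an early signal while the contract is still intact.

```python
from enum import Enum


class Status(Enum):
    HEALTHY = "within SLO"
    SLO_MISSED = "SLO missed, SLA still intact"
    SLA_BREACHED = "SLA breached"


def classify(sli: float, slo: float, sla: float) -> Status:
    """Compare a measured SLI against the internal SLO and the external SLA."""
    if sla > slo:
        # Assumption of this sketch: the internal target is at least as strict as the contract.
        raise ValueError("expected the SLO to be at least as strict as the SLA")
    if sli >= slo:
        return Status.HEALTHY
    if sli >= sla:
        return Status.SLO_MISSED
    return Status.SLA_BREACHED


SLO, SLA = 0.999, 0.995  # hypothetical targets

for measured in (0.9995, 0.997, 0.990):
    print(measured, classify(measured, SLO, SLA).value)
```

The middle state is the useful one: the team has a clear signal to slow down and reduce risk, and the customer has not yet been owed anything.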
Without a good SLI, the rest gets weak
This mistake is also common.
The team defines an SLO without having a reliable indicator.
The result:
- poorly measured target
- false sense of health
- vague discussion about reliability
If the indicator does not represent the experience or the service well, the target built on top of it loses value too.
So the first useful question is often:
- does this SLI actually capture the behavior that matters?
Not every availability number summarizes reliability
Another bad shortcut is reducing everything to uptime.
A system can be “up” and still be:
- too slow
- failing in one critical flow
- failing for an important subset of users
- accumulating operational delay that stays invisible in a simple dashboard
That is why a mature answer does not treat reliability as only a raw availability percentage.
It thinks about relevant user experience.
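A small sketch of why one aggregate number can hide a failing critical flow. The flow names and event counts are invented for illustration: most traffic is browsing and succeeds, while a much smaller checkout flow fails 30% of the time.

```python
from collections import defaultdict


def success_by_flow(events: list[tuple[str, bool]]) -> dict[str, float]:
    """Success rate per flow, from (flow_name, succeeded) events."""
    totals: dict[str, int] = defaultdict(int)
    successes: dict[str, int] = defaultdict(int)
    for flow, ok in events:
        totals[flow] += 1
        successes[flow] += ok  # True counts as 1, False as 0
    return {flow: successes[flow] / totals[flow] for flow in totals}


# Hypothetical traffic: 9,900 browse requests, 100 checkout requests.
events = (
    [("browse", True)] * 9900
    + [("checkout", True)] * 70
    + [("checkout", False)] * 30
)

overall = sum(ok for _, ok in events) / len(events)
per_flow = success_by_flow(events)

print(overall)               # 0.997, looks healthy in aggregate
print(per_flow["browse"])    # 1.0
print(per_flow["checkout"])  # 0.7, the critical flow is badly degraded
```

The aggregate sits at 99.7% while nearly a third of checkout attempts fail, which is why the SLI should be defined on the flow that matters rather than on the service as a whole.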
In interviews, the best answer connects the concept to use
It is not enough to say:
- SLI is a metric
- SLO is an objective
- SLA is an agreement
It is better to show:
- one real example
- why the SLO would be defined that way
- how it helps a decision
- and why the SLA is not the only ruler for the team
That is what moves the answer out of memorization.
Simple example
Imagine a product with a critical checkout flow.
A weak answer would be:
“SLI is the metric, SLO is the goal, and SLA is the contract.”
That is correct, but still superficial.
A better answer:
“I would first think about the SLI that actually represents the critical experience, for example the checkout success rate or the payment-confirmation latency. On top of that, the team defines an internal SLO, like 99.9% monthly success, to guide release decisions and reliability prioritization. The SLA would be the external commitment to the customer, usually set somewhat looser than the internal SLO and carrying contractual consequences. In practice, engineering operates by the SLO so it does not discover the problem only after the SLA has already been broken.”
That answer shows:
- the difference between the three pieces
- one concrete example
- operational use
- judgment
Common mistakes
- treating SLI, SLO, and SLA as synonyms
- assuming any metric is already a good SLI
- operating the team only by the SLA
- reducing reliability to raw uptime
- memorizing the definition without explaining why it changes decisions
How a senior thinks
More mature engineers often think like this:
“Good reliability needs a clear way to measure, one internal target that guides trade-offs, and one external commitment that the team does not remember only when things are already bad.”
That view is useful because it connects observability with product and operations.
Seniority here is not knowing the acronym expansion.
It is understanding how those pieces influence:
- prioritization
- release decisions
- risk
- investment in reliability
What the interviewer wants to see
When this topic comes up, the evaluator usually wants to understand whether you:
- clearly separate indicator, objective, and agreement
- can give a concrete example of an SLI and an SLO
- understand why teams operate more around SLO than SLA
- connect reliability to real engineering decisions
- avoid answers that sound too bureaucratic
A strong answer usually has this shape:
- explain the difference
- give one real flow example
- show how the SLO guides decisions
- explain the more external role of the SLA
If that appears, the answer is already above average.
SLI, SLO, and SLA do not exist to make a team look process-heavy. They exist to make reliability measurable and negotiable.
When the team only remembers the SLA, it is usually already reacting too late.
Quick summary
What to keep in your head
- SLI is the observed metric, SLO is the internal target, and SLA is the formal commitment with external consequence.
- Without a reliable SLI, an SLO becomes opinion. Without a clear SLO, reliability turns into vague conversation.
- An SLA should not be the team's main instrument for daily operation.
- In interviews, a strong answer shows how these pieces help decision-making, prioritization, and trade-offs.
Practice checklist
Use this when you answer
- Can I explain the difference between SLI, SLO, and SLA without blurring the internal target and the external contract?
- Can I give one concrete example of an SLI and an SLO for a real flow?
- Can I explain why engineering teams operate more around SLO than SLA?
- Can I answer without turning reliability into corporate jargon?