February 18, 2025
Data Modeling Without Overcomplicating
How to turn business rules into entities, relationships, and constraints that stay useful in real systems.
Andrews Ribeiro
Founder & Engineer
Track
System Design Interviews - From Basics to Advanced
Step 7 / 19
The problem
When people start modeling data, they usually fall into one of two holes.
Hole one:
- create a table for every word that came up in the meeting
Hole two:
- put everything into one table because “we can split it later”
Both feel practical at the start.
Both tend to get expensive later.
In the first case, the schema becomes a maze of relationships nobody really needed.
In the second case, everything gets mixed together:
- business rules
- derived data
- history
- current state
- information that changes at different speeds
So the problem is not only “how do I draw tables?”
It is knowing what really deserves its own identity in the system.
Mental model
Think about it like this:
A good entity is something the system needs to recognize, relate to other things, and protect over time.
That helps because not everything in the business should become its own table.
Some things are clearly entities:
- user
- order
- payment
- subscription
Some things may just be part of another entity:
- shipping address
- money value
- notification preference
Some things look more like events or history than core entities:
- status change
- payment attempt
- retry
So a better question than “does this exist in the business?” is:
- does it need to exist on its own?
- does it have its own rules?
- does it change separately from the rest?
- will it be queried or updated independently?
- does it need its own identity?
If most of the answers are “no”, it probably does not need a new table.
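One way to picture the three categories above is a small Python sketch. The class names and fields here (`Money`, `StatusChange`, `Order`) are illustrative assumptions, not a prescribed design: a value is compared by its content, an entity by its identity, and a history record is immutable once written.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass(frozen=True)
class Money:
    """Value: no identity of its own, two equal amounts are interchangeable."""
    amount_cents: int
    currency: str


@dataclass(frozen=True)
class StatusChange:
    """Event/history: an immutable record of something that happened."""
    old_status: str
    new_status: str
    at: datetime


@dataclass(eq=False)
class Order:
    """Entity: recognized by its id, even when its other fields change."""
    id: int
    total: Money
    status: str = "pending"
    history: list = field(default_factory=list)

    def __eq__(self, other):
        return isinstance(other, Order) and self.id == other.id

    def __hash__(self):
        return hash(self.id)

    def change_status(self, new_status: str, at: datetime) -> None:
        # Append to history instead of silently overwriting the past.
        self.history.append(StatusChange(self.status, new_status, at))
        self.status = new_status
```

In this sketch, two `Money` objects with the same amount are equal, while two `Order` objects are equal only when they share an id, which is roughly the difference between "part of another entity" and "needs its own identity".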
Breaking it down
Start with invariants, not with pretty nouns
Good modeling protects important truths in the system.
Examples:
- a payment cannot exist without an order
- an order has items
- a user cannot have two active identities for the same legal document
- a coupon expires
Those are invariants.
When you see them early, modeling stops being diagram work and starts being rule protection.
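As a minimal sketch of "rule protection", some of the invariants above can be written as database constraints. This assumes SQLite through Python's standard `sqlite3` module; every table and column name is hypothetical. Note that "an order has items" is not expressed here, because "at least one child row" is usually enforced in application code or at checkout time rather than by a simple column constraint.

```python
import sqlite3

# Hypothetical schema: names are illustrative assumptions.
SCHEMA = """
CREATE TABLE orders (
    id INTEGER PRIMARY KEY
);
CREATE TABLE payments (
    id INTEGER PRIMARY KEY,
    -- a payment cannot exist without an order
    order_id INTEGER NOT NULL REFERENCES orders(id),
    amount_cents INTEGER NOT NULL CHECK (amount_cents > 0)
);
CREATE TABLE user_identities (
    id INTEGER PRIMARY KEY,
    legal_document TEXT NOT NULL,
    active INTEGER NOT NULL DEFAULT 1 CHECK (active IN (0, 1))
);
-- no two active identities for the same legal document
CREATE UNIQUE INDEX one_active_identity_per_document
    ON user_identities (legal_document) WHERE active = 1;
CREATE TABLE coupons (
    code TEXT PRIMARY KEY,
    -- a coupon expires: the expiry date is mandatory
    expires_at TEXT NOT NULL
);
"""


def open_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def violates(conn: sqlite3.Connection, sql: str, params=()) -> bool:
    """Return True if the statement is rejected by a constraint."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql, params)
        return False
    except sqlite3.IntegrityError:
        return True
```

With this schema, inserting a payment for a missing order, or a second active identity for the same document, fails at the database level instead of relying on every caller to remember the rule.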
Not every important field deserves a separate table
This mistake is common.
In ecommerce, someone sees shipping_address and thinks:
- “address matters, so it must be its own table”
Maybe.
Maybe not.
It depends on how the system uses it.
If the address on an order is just a snapshot of the purchase moment, it may fit better as an embedded value on the order.
If you need to reuse, edit, validate, and relate addresses independently, the chance that it deserves its own entity goes up.
In other words:
business importance does not automatically mean a separate table.
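The two uses of an address can coexist. The sketch below, with assumed table and function names, keeps a reusable `customer_addresses` entity and copies its values into the order at purchase time, so later edits to the address do not rewrite what the order recorded.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    CREATE TABLE customer_addresses (   -- reusable, editable entity
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL,
        street TEXT NOT NULL,
        city TEXT NOT NULL
    );
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL,
        shipping_street TEXT NOT NULL,  -- snapshot of the purchase moment
        shipping_city TEXT NOT NULL
    );
    """)
    return conn


def place_order(conn: sqlite3.Connection, customer_id: int, address_id: int) -> int:
    """Copy the address values into the order instead of referencing the row."""
    row = conn.execute(
        "SELECT street, city FROM customer_addresses WHERE id = ? AND customer_id = ?",
        (address_id, customer_id),
    ).fetchone()
    if row is None:
        raise LookupError("address not found for this customer")
    with conn:
        cur = conn.execute(
            "INSERT INTO orders (customer_id, shipping_street, shipping_city) "
            "VALUES (?, ?, ?)",
            (customer_id, row[0], row[1]),
        )
    return cur.lastrowid
```

If the system never needs to reuse or edit addresses independently, the `customer_addresses` table can be dropped and only the embedded columns kept.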
Access patterns matter as much as rules
Some teams model only by looking at meaning.
Some teams model only by looking at CRUD.
Both miss half the picture.
Useful questions are:
- what is usually read together?
- what gets updated often?
- do we need history?
- is there concurrency pressure?
- do we need aggregate queries?
For example:
- if orders always come with items, that relationship needs to be clear
- if price history matters, overwriting a field is not enough
- if stock changes under concurrency, writes deserve extra care
Good modeling appears when business rules and real usage of the data move together.
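Two of those examples can be sketched briefly, again assuming SQLite and hypothetical names. Price history becomes append-only rows instead of an overwritten field, and a stock reservation is a single conditional `UPDATE`, so the check and the decrement cannot be separated by another writer.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    CREATE TABLE products (
        id INTEGER PRIMARY KEY,
        stock INTEGER NOT NULL CHECK (stock >= 0)
    );
    CREATE TABLE price_history (        -- append-only: one row per price change
        product_id INTEGER NOT NULL REFERENCES products(id),
        price_cents INTEGER NOT NULL,
        valid_from TEXT NOT NULL,       -- ISO-8601 date, sorts as text
        PRIMARY KEY (product_id, valid_from)
    );
    """)
    return conn


def reserve_stock(conn: sqlite3.Connection, product_id: int, qty: int) -> bool:
    """Decrement stock only if enough is left, in one statement."""
    with conn:
        cur = conn.execute(
            "UPDATE products SET stock = stock - ? WHERE id = ? AND stock >= ?",
            (qty, product_id, qty),
        )
    return cur.rowcount == 1


def price_at(conn: sqlite3.Connection, product_id: int, when: str):
    """Return the price that was valid at the given ISO date, or None."""
    row = conn.execute(
        "SELECT price_cents FROM price_history "
        "WHERE product_id = ? AND valid_from <= ? "
        "ORDER BY valid_from DESC LIMIT 1",
        (product_id, when),
    ).fetchone()
    return row[0] if row else None
```

The conditional update is only one option. Row locks or optimistic version columns are common alternatives, and which one fits depends on the database and the write volume.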
Generic future-proof abstractions usually hurt the present
Another common mistake is creating abstraction too early:
- entities
- metadata
- attributes
- a generic events table trying to serve everything
That usually comes from the fantasy that:
- “we might need something more generic later”
What you often buy instead is:
- bad queries
- scattered rules
- confusing maintenance
- a system that is hard to explain
A good model is not one that predicts every possible future.
It is one that can handle the likely future without destroying clarity in the present.
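To make the cost concrete, here is a hedged comparison using coupons as the example. The generic `attributes` table and the explicit `coupons` table are both hypothetical, and both queries answer the same question: which coupon codes give at least a certain discount and have not expired?

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    CREATE TABLE attributes (           -- generic: any entity, any field
        entity_id INTEGER NOT NULL,
        name TEXT NOT NULL,
        value TEXT                      -- everything is text, no rules attached
    );
    CREATE TABLE coupons (              -- explicit: the rules live in the schema
        id INTEGER PRIMARY KEY,
        code TEXT NOT NULL UNIQUE,
        percent_off INTEGER NOT NULL CHECK (percent_off BETWEEN 1 AND 100),
        expires_at TEXT NOT NULL
    );
    """)
    return conn


# One self-join per field, plus a cast, just to read a single coupon back.
EAV_QUERY = """
SELECT c.value
FROM attributes AS c
JOIN attributes AS p ON p.entity_id = c.entity_id AND p.name = 'percent_off'
JOIN attributes AS e ON e.entity_id = c.entity_id AND e.name = 'expires_at'
WHERE c.name = 'code'
  AND CAST(p.value AS INTEGER) >= ?
  AND e.value > ?
"""

EXPLICIT_QUERY = "SELECT code FROM coupons WHERE percent_off >= ? AND expires_at > ?"


def valid_codes(conn: sqlite3.Connection, query: str, min_percent: int, today: str):
    return [row[0] for row in conn.execute(query, (min_percent, today))]
```

The generic table will also accept a `percent_off` of `'lots'` without complaint, while the explicit table rejects an out-of-range value through its `CHECK` constraint. That is the "scattered rules" problem in miniature: the validation has to live somewhere else.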
Neither normalization nor pragmatism is a religion
Sometimes splitting helps a lot:
- less duplication
- lower inconsistency risk
- clearer rules
Sometimes keeping things closer helps:
- simpler reads
- fewer expensive joins
- useful historical snapshots
The point is not to memorize normal forms.
The point is to know why you separated or grouped something.
Simple example
Imagine an order system.
A confused model would put everything into orders:
- customer data
- serialized items
- address
- total
- status
- payment data
- coupon data
At first, that feels fast.
Problems show up soon:
- updating one item becomes awkward
- querying best-selling products gets harder
- separating current customer data from purchase-time snapshot becomes messy
- payment rules get mixed with order rules
- history and current state blur together
A better model might split:
- customers
- orders
- order_items
- payments
And still keep some purchase-time snapshot fields inside orders, such as shipping name and delivery address.
That is not over-modeling.
It is giving each important part a clear responsibility.
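A rough sketch of that split, assuming SQLite and illustrative column names, could look like the following. The `best_sellers` helper is a hypothetical example of a read that becomes straightforward once items are rows rather than a serialized blob.

```python
import sqlite3

# Hypothetical schema: column names and types are assumptions for illustration.
SCHEMA = """
CREATE TABLE customers (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers(id),
    status TEXT NOT NULL DEFAULT 'pending',
    shipping_name TEXT NOT NULL,        -- purchase-time snapshot
    delivery_address TEXT NOT NULL      -- purchase-time snapshot
);
CREATE TABLE order_items (
    order_id INTEGER NOT NULL REFERENCES orders(id),
    product_id INTEGER NOT NULL,
    quantity INTEGER NOT NULL CHECK (quantity > 0),
    unit_price_cents INTEGER NOT NULL,  -- price paid, not the current price
    PRIMARY KEY (order_id, product_id)
);
CREATE TABLE payments (
    id INTEGER PRIMARY KEY,
    order_id INTEGER NOT NULL REFERENCES orders(id),
    amount_cents INTEGER NOT NULL,
    status TEXT NOT NULL
);
"""


def make_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def best_sellers(conn: sqlite3.Connection, limit: int = 3):
    """Return (product_id, units_sold) pairs, most sold first."""
    return conn.execute(
        "SELECT product_id, SUM(quantity) AS units "
        "FROM order_items GROUP BY product_id "
        "ORDER BY units DESC, product_id LIMIT ?",
        (limit,),
    ).fetchall()
```

Updating one item is now a single-row change in `order_items`, and payment rules can evolve in `payments` without touching the order row.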
Common mistakes
- Modeling for imaginary edge cases before validating real use.
- Stuffing too many responsibilities into one central table.
- Ignoring how the application actually needs to query the data.
- Treating normalization or denormalization like a moral choice instead of a practical tool.
How a senior thinks
A strong senior engineer models data by looking at hard business rules and access patterns, not at technical ornament.
The reasoning usually sounds like this:
“Before I finalize the schema, I want to know which truths this system must protect and which reads absolutely need to be easy.”
That question prevents a lot of complexity that feels smart on paper and painful in production.
What the interviewer wants to see
In interviews, your modeling approach reveals your level quickly.
They want to see whether you can:
- distinguish between an entity, a relationship, and a business rule
- think about reads and writes, not just storage
- justify the structure with realistic use cases
Data modeling is not just drawing boxes and lines. It is deciding what truths the system has to represent without lying to itself.
If the structure only makes sense in the diagram, it is probably not ready for the product.
Quick summary
What to keep in your head
- Not every business noun deserves its own table. Good entities have identity, rules, and a meaningful lifecycle.
- Good modeling starts from invariants and access patterns, not from pretty diagrams or generic abstractions for the future.
- Too many tables spread logic around. Too few tables mix responsibilities and make evolution painful.
- In interviews, strong answers show entities, relationships, constraints, and main reads with clear judgment.
Practice checklist
Use this when you answer
- Can I explain why something should be a table, an embedded value, or just a supporting field?
- Can I talk about identity, cardinality, and constraints without drifting into theory for its own sake?
- Can I connect modeling choices to the reads and writes the system actually performs?
- Can I justify why a simple structure today can still evolve tomorrow?