February 18 2025

Data Modeling Without Overcomplicating

How to turn business rules into entities, relationships, and constraints that stay useful in real systems.

Andrews Ribeiro

Founder & Engineer

5 min Intermediate Systems

#data-storage #data #modeling #sql #backend

Track

System Design Interviews - From Basics to Advanced

Step 7 / 19


The problem

When people start modeling data, they usually fall into one of two holes.

Hole one:

create a table for every word that came up in the meeting

Hole two:

put everything into one table because “we can split it later”

Both feel practical at the start.

Both tend to get expensive later.

In the first case, the schema becomes a maze of relationships nobody really needed.

In the second case, everything gets mixed together:

business rules
derived data
history
current state
information that changes at different speeds

So the problem is not only “how do I draw tables?”

It is knowing what really deserves its own identity in the system.

Mental model

Think about it like this:

A good entity is something the system needs to recognize, relate, and protect over time.

That helps because not everything in the business should become its own table.

Some things are clearly entities:

user
order
payment
subscription

Some things may just be part of another entity:

shipping address
money value
notification preference

Some things look more like events or history than core entities:

status change
payment attempt
retry

So a better question than “does this exist in the business?” is:

does it need to exist on its own?
does it have its own rules?
does it change separately from the rest?
will it be queried or updated independently?
does it need its own identity?

If the answer is “no” most of the time, it may not need a new table.
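One way to make the three categories concrete is a small sketch in application code. The class names and fields below (`Money`, `StatusChange`, `Order`, `order_id`) are illustrative assumptions, not a prescribed design: a value has no identity, an event is an immutable record, and an entity is compared by its identity rather than by its current contents.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from decimal import Decimal


@dataclass(frozen=True)
class Money:
    """Value: no identity of its own; two equal amounts are interchangeable."""
    amount: Decimal
    currency: str


@dataclass(frozen=True)
class StatusChange:
    """Event/history: an immutable record of something that happened."""
    old_status: str
    new_status: str
    changed_at: datetime


@dataclass(eq=False)
class Order:
    """Entity: has its own identity and a lifecycle the system must protect."""
    order_id: int
    total: Money
    status: str = "pending"
    history: list = field(default_factory=list)

    def __eq__(self, other):
        # Two Order objects are the same order if their ids match,
        # even when their current contents differ.
        return isinstance(other, Order) and other.order_id == self.order_id

    def __hash__(self):
        return hash(self.order_id)

    def change_status(self, new_status: str) -> None:
        # Record the transition instead of silently overwriting the state.
        self.history.append(
            StatusChange(self.status, new_status, datetime.now(timezone.utc))
        )
        self.status = new_status
```

In a relational schema, this would likely mean `Order` gets a table with a primary key, `Money` becomes columns on whatever row owns it, and `StatusChange` becomes rows in a history table if the history is worth keeping.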

Breaking it down

Start with invariants, not with pretty nouns

Good modeling protects important truths in the system.

Examples:

a payment cannot exist without an order
an order has items
a user cannot have two active accounts for the same legal document
a coupon cannot be applied after it expires

Those are invariants.

When you see them early, modeling stops being diagram work and starts being rule protection.
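Some of these invariants can be pushed into the schema itself. The following is a minimal sketch using Python's built-in `sqlite3` module; the table and column names (`users`, `legal_document`, `is_active`, `amount_cents`) and the `violates` helper are assumptions made for illustration. Rules such as "an order has items" or coupon expiry usually need application logic or a transaction boundary as well, so they are left out here.

```python
import sqlite3

SCHEMA = """
CREATE TABLE users (
    id             INTEGER PRIMARY KEY,
    legal_document TEXT NOT NULL,
    is_active      INTEGER NOT NULL DEFAULT 1 CHECK (is_active IN (0, 1))
);
-- At most one ACTIVE user per legal document (partial unique index).
CREATE UNIQUE INDEX one_active_user_per_document
    ON users (legal_document) WHERE is_active = 1;

CREATE TABLE orders (
    id      INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL REFERENCES users (id)
);

-- A payment cannot exist without an order.
CREATE TABLE payments (
    id           INTEGER PRIMARY KEY,
    order_id     INTEGER NOT NULL REFERENCES orders (id),
    amount_cents INTEGER NOT NULL CHECK (amount_cents > 0)
);
"""


def open_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def violates(conn: sqlite3.Connection, sql: str, params=()) -> bool:
    """Return True if the database rejects the statement via a constraint."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql, params)
        return False
    except sqlite3.IntegrityError:
        return True


conn = open_db()
with conn:
    conn.execute("INSERT INTO users (id, legal_document) VALUES (1, 'DOC-1')")

orphan_payment_rejected = violates(
    conn, "INSERT INTO payments (order_id, amount_cents) VALUES (999, 500)"
)
duplicate_active_rejected = violates(
    conn, "INSERT INTO users (legal_document) VALUES ('DOC-1')"
)
```

The point of the sketch is that the rule lives next to the data: any code path that writes to these tables hits the same check, not only the ones that remembered to validate.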

Not every important field deserves a separate table

This mistake is common.

In ecommerce, someone sees shipping_address and thinks:

“address matters, so it must be its own table”

Maybe.

Maybe not.

It depends on how the system uses it.

If the address on an order is just a snapshot of the purchase moment, it may fit better as an embedded value on the order.

If you need to reuse, edit, validate, and relate addresses independently, the chance that it deserves its own entity goes up.

In other words:

business importance does not automatically mean a separate table.
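The two uses of an address can coexist. In the sketch below (all table and column names are hypothetical), saved addresses are their own entity because customers reuse and edit them, while the order copies the values at purchase time so later edits cannot rewrite where a parcel was sent.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer_addresses (      -- reusable and editable: its own entity
    id          INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL,
    street      TEXT NOT NULL,
    city        TEXT NOT NULL
);
CREATE TABLE orders (                  -- snapshot: copied at purchase time
    id              INTEGER PRIMARY KEY,
    customer_id     INTEGER NOT NULL,
    shipping_street TEXT NOT NULL,
    shipping_city   TEXT NOT NULL
);
""")


def place_order(conn, customer_id, address_id):
    """Copy the address as it is right now into the new order row."""
    street, city = conn.execute(
        "SELECT street, city FROM customer_addresses"
        " WHERE id = ? AND customer_id = ?",
        (address_id, customer_id),
    ).fetchone()
    cur = conn.execute(
        "INSERT INTO orders (customer_id, shipping_street, shipping_city)"
        " VALUES (?, ?, ?)",
        (customer_id, street, city),
    )
    return cur.lastrowid


conn.execute(
    "INSERT INTO customer_addresses VALUES (1, 42, 'Old Street 1', 'Lisbon')"
)
order_id = place_order(conn, customer_id=42, address_id=1)

# The customer later edits the saved address...
conn.execute("UPDATE customer_addresses SET street = 'New Avenue 9' WHERE id = 1")

# ...but the order keeps the address that was valid at purchase time.
shipped_to = conn.execute(
    "SELECT shipping_street FROM orders WHERE id = ?", (order_id,)
).fetchone()[0]
```

If the system never lets customers save or reuse addresses, the `customer_addresses` table would probably be unnecessary and the snapshot columns alone would do.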

Access patterns matter as much as rules

Some teams model only by looking at meaning.

Some teams model only by looking at CRUD.

Both miss half the picture.

Useful questions are:

what is usually read together?
what gets updated often?
do we need history?
is there concurrency pressure?
do we need aggregate queries?

For example:

if orders always come with items, that relationship needs to be clear
if price history matters, overwriting a field is not enough
if stock changes under concurrency, writes deserve extra care

Good modeling appears when business rules and real usage of the data move together.
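Two of those examples translate fairly directly into structure. This is a sketch under assumed names (`product_prices`, `stock`, `price_at`, `reserve_stock`): price history is kept as append-only rows instead of one overwritten field, and the stock decrement puts the availability check and the write into a single statement so two buyers cannot both take the last unit.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product_prices (          -- append-only: one row per price change
    product_id  INTEGER NOT NULL,
    price_cents INTEGER NOT NULL,
    valid_from  TEXT NOT NULL,         -- ISO dates sort correctly as text
    PRIMARY KEY (product_id, valid_from)
);
CREATE TABLE stock (
    product_id INTEGER PRIMARY KEY,
    quantity   INTEGER NOT NULL CHECK (quantity >= 0)
);
""")


def price_at(conn, product_id, moment):
    """Return the price that was in effect at `moment`, or None."""
    row = conn.execute(
        "SELECT price_cents FROM product_prices"
        " WHERE product_id = ? AND valid_from <= ?"
        " ORDER BY valid_from DESC LIMIT 1",
        (product_id, moment),
    ).fetchone()
    return row[0] if row else None


def reserve_stock(conn, product_id, qty):
    """Decrement only if enough stock remains; check and write are one statement."""
    with conn:
        cur = conn.execute(
            "UPDATE stock SET quantity = quantity - ?"
            " WHERE product_id = ? AND quantity >= ?",
            (qty, product_id, qty),
        )
    return cur.rowcount == 1


conn.executemany(
    "INSERT INTO product_prices VALUES (?, ?, ?)",
    [(1, 1000, "2025-01-01"), (1, 1200, "2025-02-01")],
)
conn.execute("INSERT INTO stock VALUES (1, 5)")
conn.commit()

january_price = price_at(conn, 1, "2025-01-15")
first_reservation = reserve_stock(conn, 1, 3)
second_reservation = reserve_stock(conn, 1, 3)  # only 2 units left by now
```

A read-then-write in application code (select the quantity, compare, then update) would leave a gap between the check and the write. The conditional `UPDATE` closes that gap without needing an explicit lock in this simple case.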

Generic future-proof abstractions usually hurt the present

Another common mistake is creating abstraction too early:

entities
metadata
attributes
a generic events table trying to serve everything

That usually comes from the fantasy that:

“we might need something more generic later”

What you often buy instead is:

bad queries
scattered rules
confusing maintenance
a system that is hard to explain

Good modeling is not the one that predicts every future.

It is the one that can handle the likely future without destroying clarity in the present.
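To see what "bad queries" and "scattered rules" can look like, here is a small comparison. The generic `attributes` table and the explicit `orders` table are both hypothetical, and the same three orders are stored in each. The question asked is simple: which paid orders are above 5000 cents?

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Generic version: everything is an (entity, attribute, value) row.
CREATE TABLE attributes (
    entity_id INTEGER NOT NULL,
    name      TEXT NOT NULL,
    value     TEXT                -- every value is text; rules live elsewhere
);
-- Explicit version: columns carry types and constraints.
CREATE TABLE orders (
    id          INTEGER PRIMARY KEY,
    status      TEXT NOT NULL
                CHECK (status IN ('pending', 'paid', 'cancelled')),
    total_cents INTEGER NOT NULL CHECK (total_cents >= 0)
);
""")

sample = [(1, "paid", 7000), (2, "paid", 3000), (3, "pending", 9000)]
for order_id, status, total in sample:
    conn.execute("INSERT INTO orders VALUES (?, ?, ?)", (order_id, status, total))
    conn.execute(
        "INSERT INTO attributes VALUES (?, 'status', ?)", (order_id, status)
    )
    conn.execute(
        "INSERT INTO attributes VALUES (?, 'total_cents', ?)",
        (order_id, str(total)),
    )

# One self-join per attribute, plus a cast because everything is text.
GENERIC_QUERY = """
SELECT s.entity_id
FROM attributes AS s
JOIN attributes AS t ON t.entity_id = s.entity_id
WHERE s.name = 'status' AND s.value = 'paid'
  AND t.name = 'total_cents' AND CAST(t.value AS INTEGER) > 5000
"""
EXPLICIT_QUERY = (
    "SELECT id FROM orders WHERE status = 'paid' AND total_cents > 5000"
)

generic_ids = [row[0] for row in conn.execute(GENERIC_QUERY)]
explicit_ids = [row[0] for row in conn.execute(EXPLICIT_QUERY)]
```

Both queries find the same order, but the generic one grows by a join for every extra attribute, and nothing in that table stops a row like `(4, 'status', 'banana')` from being stored. In the explicit table, the `CHECK` constraint rejects it.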

Neither normalization nor pragmatism is a religion

Sometimes splitting helps a lot:

less duplication
lower inconsistency risk
clearer rules

Sometimes keeping things closer helps:

simpler reads
fewer expensive joins
useful historical snapshots

The point is not to memorize normal forms.

The point is to know why you separated or grouped something.

Simple example

Imagine an order system.

A confused model would put everything into orders:

customer data
serialized items
address
total
status
payment data
coupon data

At first, that feels fast.

Problems show up soon:

updating one item becomes awkward
querying best-selling products gets harder
separating current customer data from the purchase-time snapshot becomes messy
payment rules get mixed with order rules
history and current state blur together

A better model might split:

customers
orders
order_items
payments

And still keep some purchase-time snapshot fields inside orders, such as shipping name and delivery address.

That is not over-modeling.

It is giving each important part a clear responsibility.
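A rough version of that split, again with `sqlite3`, might look like the sketch below. The column names, the cents-based amounts, and the bare `product_id` (no products table is defined here) are assumptions for illustration. With items in their own rows, the "best-selling products" question from the list above becomes a plain aggregate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite needs this to enforce FKs
conn.executescript("""
CREATE TABLE customers (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL                -- current customer data
);
CREATE TABLE orders (
    id               INTEGER PRIMARY KEY,
    customer_id      INTEGER NOT NULL REFERENCES customers (id),
    status           TEXT NOT NULL DEFAULT 'pending',
    shipping_name    TEXT NOT NULL,   -- purchase-time snapshot
    delivery_address TEXT NOT NULL    -- purchase-time snapshot
);
CREATE TABLE order_items (
    order_id         INTEGER NOT NULL REFERENCES orders (id),
    product_id       INTEGER NOT NULL,
    quantity         INTEGER NOT NULL CHECK (quantity > 0),
    unit_price_cents INTEGER NOT NULL,
    PRIMARY KEY (order_id, product_id)
);
CREATE TABLE payments (
    id           INTEGER PRIMARY KEY,
    order_id     INTEGER NOT NULL REFERENCES orders (id),
    amount_cents INTEGER NOT NULL,
    status       TEXT NOT NULL
);
""")


def best_selling_products(conn, limit=3):
    """Units sold per product, highest first."""
    return conn.execute(
        "SELECT product_id, SUM(quantity) AS units FROM order_items"
        " GROUP BY product_id ORDER BY units DESC, product_id LIMIT ?",
        (limit,),
    ).fetchall()


def order_total_cents(conn, order_id):
    """Derive the total from the items instead of storing a second copy."""
    return conn.execute(
        "SELECT COALESCE(SUM(quantity * unit_price_cents), 0)"
        " FROM order_items WHERE order_id = ?",
        (order_id,),
    ).fetchone()[0]


conn.execute("INSERT INTO customers VALUES (1, 'Ana')")
conn.executemany(
    "INSERT INTO orders (id, customer_id, shipping_name, delivery_address)"
    " VALUES (?, 1, 'Ana', 'Old Street 1')",
    [(1,), (2,)],
)
conn.executemany(
    "INSERT INTO order_items VALUES (?, ?, ?, ?)",
    [(1, 10, 2, 1500), (1, 11, 1, 4000), (2, 10, 3, 1500)],
)
conn.commit()

top_products = best_selling_products(conn)
first_order_total = order_total_cents(conn, 1)
```

Whether the total should be derived like this or stored on the order is itself a modeling choice. Storing it makes reads cheaper and freezes the amount the customer agreed to, while deriving it avoids two numbers that can disagree.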

Common mistakes

Modeling for imaginary edge cases before validating real use.
Stuffing too many responsibilities into one central table.
Ignoring how the application actually needs to query the data.
Treating normalization or denormalization like a moral choice instead of a practical tool.

How a senior thinks

A strong senior engineer models data by looking at hard business rules and access patterns, not technical ornament.

The reasoning usually sounds like this:

Before I finalize the schema, I want to know which truths this system must protect and which reads absolutely need to be easy.

That question prevents a lot of complexity that feels smart on paper and painful in production.

What the interviewer wants to see

In interviews, your modeling approach reveals your level quickly.

They want to see whether you can:

distinguish between an entity, a relationship, and a business rule
think about reads and writes, not just storage
justify the structure with realistic use cases

Data modeling is not just drawing boxes and lines. It is deciding what truths the system has to represent without lying to itself.

If the structure only makes sense in the diagram, it is probably not ready for the product.

Quick summary

What to keep in your head

Not every business noun deserves its own table. Good entities have identity, rules, and a meaningful lifecycle.
Good modeling starts from invariants and access patterns, not from pretty diagrams or generic abstractions for the future.
Too many tables spread logic around. Too few tables mix responsibilities and make evolution painful.
In interviews, strong answers show entities, relationships, constraints, and main reads with clear judgment.

Practice checklist

Use this when you answer

Can I explain why something should be a table, an embedded value, or just a supporting field?
Can I talk about identity, cardinality, and constraints without drifting into theory for its own sake?
Can I connect modeling choices to the reads and writes the system actually performs?
Can I justify why a simple structure today can still evolve tomorrow?
