Why Your Data Model Isn’t Wrong — It’s Just Solving the Wrong Problem
Most data models fail not because they are wrong, but because they are asked to solve the wrong problem. Learn how separating raw, relational, and curated layers can restore trust in KPIs and make decisions faster.
Alexander Pau
1/18/2026 · 5 min read


I used to think I was bad at naming things.
Whenever someone asked what data architecture I used, I hesitated. Dimensional? Layered? Analytics engineering? Something else?
The truth was simpler. I did not know the names. I just built what the business needed to survive.
Only later did I realize I had been using versions of patterns people now call Medallion architecture, staging layers, semantic models, and curated marts. Not because I followed a framework, but because everything else broke once the organization grew.
That is the part most blog posts skip.
Data models do not fail because they are wrong.
They fail because teams ask them to solve problems they were never designed for.
The Real Mistake Most Teams Make
Most data arguments sound technical.
Should we normalize or denormalize?
Should metrics live in SQL or the BI tool?
Should we use star schemas or wide tables?
Those questions show up late.
The real mistake happens when teams collapse too many responsibilities into one model.
They want a single dataset that can:
Act as an audit trail
Support exploration
Power executive dashboards
Answer ad hoc questions on the fly
That is not ambition. That is overload.
A data model can do one of those things extremely well. Maybe two. Never all four.
A Real Example: When No One Could Find the Truth
I ran into this while working with a client whose teams all believed they had the right number.
Sales pulled metrics from the CRM.
Marketing trusted their analytics tool.
Finance relied on spreadsheets built from exports.
Product leaned on event data.
Same KPI. Different systems. Different answers.
Nothing looked obviously broken.
Pipelines were running. Dashboards refreshed on time. The data was recent.
But every leadership conversation stalled in the same place.
When someone asked, “Why did this KPI change?” the discussion immediately shifted from interpretation to definition.
Before anyone could talk about performance, we had to agree on which number we were even discussing.
As we tried to trace the metric backward, a pattern started to show up.
There was no shared starting point.
No dataset everyone trusted as the baseline.
No clear path from source to dashboard.
Logic lived in too many places.
Some filters existed only inside dashboards.
Some joins lived in one-off SQL.
Business rules were mostly tribal knowledge.
So when the KPI moved, the response was rarely an explanation.
It was usually a question.
“Which version are we looking at?”
At the time, I didn’t have a name for what was missing.
I just knew the data was being asked to do too many things at once.
Raw data, transformation logic, and reporting were all blended together. Everything lived too close to everything else.
Nothing was clearly built for traceability.
Nothing was clearly built for decision-making either.
Only later, when I started reading more about data architecture patterns, did it click.
What I had been reacting to was the absence of separation.
Not bad tools.
Not bad SQL.
But no clear boundary between evidence, logic, and output.
That’s when I realized the way I’d been thinking about the problem lined up closely with what people now call Medallion architecture.
Not because I was following a framework back then.
Because the business confusion forced the same conclusion.
Raw Models: When You Need Evidence, Not Answers
Raw or staging models exist for one job only. Preserve evidence.
They are intentionally messy.
They mirror source systems closely.
They keep edge cases and history intact.
Raw models answer questions like:
What did the system actually record?
Did this change upstream?
Can we reconstruct this metric if challenged?
If you clean too early, you erase proof. That is how data teams lose credibility during audits or postmortems.
Raw models are not for dashboards.
They are for trust.
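As a sketch of what “preserve evidence” means in practice, here is a minimal raw layer using SQLite as a stand-in for a warehouse. The table name `raw_crm_orders` and the sample rows are invented for illustration; the point is that duplicates and odd values land as-is, with only load metadata added.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Raw layer: mirror the source exactly, plus load metadata.
# No cleaning, no filters -- keep the evidence intact.
conn.execute("""
    CREATE TABLE raw_crm_orders (
        order_id    TEXT,        -- as received, even if malformed
        amount      TEXT,        -- kept as text: the source sends strings
        status      TEXT,
        _loaded_at  TEXT DEFAULT CURRENT_TIMESTAMP
    )
""")

# Insert rows exactly as the source delivered them,
# including the duplicate and the odd status value.
conn.executemany(
    "INSERT INTO raw_crm_orders (order_id, amount, status) VALUES (?, ?, ?)",
    [("A1", "100.00", "complete"),
     ("A1", "100.00", "complete"),   # duplicate preserved
     ("A2", "-5",     "REFUND??")],  # edge case preserved
)

# Raw models answer "what did the system record?", not "what is true?"
rows = conn.execute("SELECT COUNT(*) FROM raw_crm_orders").fetchone()[0]
print(rows)  # 3 -- nothing deduplicated, nothing dropped
```

Deduplication and type casting belong downstream, where they can be reviewed and versioned. Here they are deliberately absent.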
Relational Models: Where Logic Belongs
Relational models exist to encode shared understanding.
This is where:
Joins become consistent
Definitions are explicit
Business rules are written once
This layer answers questions like:
What does a customer mean across systems?
How do we define an active user?
Which filters are always applied?
Most teams skip this step and jump straight from raw data to charts. That is why metrics drift.
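A minimal sketch of “write the rule once,” again with SQLite standing in for a warehouse. The `dim_active_users` view and its rules (which event types count, which date window) are illustrative assumptions, not any real client’s definitions; the design point is that every dashboard reads from this one view instead of re-implementing the filter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE raw_events (user_id TEXT, event_type TEXT, event_date TEXT)")
conn.executemany(
    "INSERT INTO raw_events VALUES (?, ?, ?)",
    [("u1", "login",     "2026-01-10"),
     ("u1", "page_view", "2026-01-12"),
     ("u2", "login",     "2025-11-01"),   # outside the agreed window
     ("u3", "error",     "2026-01-11")],  # excluded event type
)

# The business rule lives in exactly one place: this view.
# Dashboards and ad hoc queries join against it instead of
# copying the WHERE clause around.
conn.execute("""
    CREATE VIEW dim_active_users AS
    SELECT DISTINCT user_id
    FROM raw_events
    WHERE event_type IN ('login', 'page_view')   -- agreed event types
      AND event_date >= '2026-01-01'             -- agreed window
""")

active = [r[0] for r in conn.execute(
    "SELECT user_id FROM dim_active_users ORDER BY user_id")]
print(active)  # ['u1']
```

When the definition changes, it changes in one view, and every consumer moves together.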
The same thinking shows up in how I approach execution systems more broadly, which I broke down in “Your SQL Isn’t Messy. It’s Lying.” Grain mistakes in SQL and grain mistakes in architecture share the same root cause: rushing past definitions.
Curated Models: Built for Decisions, Not Debate
Curated models are opinionated by design.
They exist to answer:
Is the business up or down?
What needs attention?
Where should leadership focus?
This is where KPIs belong. Not because they are perfect, but because they are agreed upon.
Alignment beats precision at this level. A slightly imperfect metric that everyone trusts is more useful than a perfect one nobody believes.
This mirrors how I think about tracking work outside of data as well, which I outlined in The Sharp Starts Tracking Playbook.
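Here is what “opinionated by design” can look like in miniature. The `kpi_daily_revenue` view and the exclude-test-orders rule are hypothetical, with SQLite as a stand-in; what matters is that the agreed filter is applied once, upstream of every report.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fct_orders (order_date TEXT, amount REAL, is_test INTEGER)")
conn.executemany(
    "INSERT INTO fct_orders VALUES (?, ?, ?)",
    [("2026-01-01", 120.0, 0),
     ("2026-01-01",  80.0, 0),
     ("2026-01-02",  50.0, 0),
     ("2026-01-02", 999.0, 1)],  # test order: excluded by agreement
)

# Curated mart: one row per day, one agreed definition of revenue.
# Opinionated on purpose -- the debate happened once, upstream.
conn.execute("""
    CREATE VIEW kpi_daily_revenue AS
    SELECT order_date, SUM(amount) AS revenue
    FROM fct_orders
    WHERE is_test = 0          -- the agreed filter, applied nowhere else
    GROUP BY order_date
""")

rows = conn.execute(
    "SELECT * FROM kpi_daily_revenue ORDER BY order_date").fetchall()
print(rows)  # [('2026-01-01', 200.0), ('2026-01-02', 50.0)]
```

A leadership dashboard built on this view cannot disagree with another dashboard built on it. That is the whole point.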
Medallion Architecture Without the Buzzwords
People often ask whether they should use Medallion architecture.
Bronze. Silver. Gold.
The names do not matter. The separation does.
Medallion works because it enforces three ideas:
Raw data is preserved
Transformation is centralized
Outputs are curated
Databricks explains this cleanly in their overview of Medallion architecture, but the value is not the labels. It is the discipline of separating trust, logic, and decision making.
https://www.databricks.com/glossary/medallion-architecture
I did not adopt Medallion intentionally. I backed into it after being burned by blended layers that tried to do everything.
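The three ideas above can be sketched end to end in a few lines. Bronze, silver, and gold are just tables and views here, with SQLite standing in for a lakehouse and the signup data invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Bronze: preserved as-is, messy casing and duplicates included.
conn.execute("CREATE TABLE bronze_signups (email TEXT, signed_up TEXT)")
conn.executemany("INSERT INTO bronze_signups VALUES (?, ?)",
                 [(" A@X.COM ", "2026-01-05"),
                  ("a@x.com",   "2026-01-05"),   # duplicate once cleaned
                  ("b@x.com",   "2026-01-06")])

# Silver: transformation centralized -- clean and deduplicate once.
conn.execute("""
    CREATE VIEW silver_signups AS
    SELECT DISTINCT LOWER(TRIM(email)) AS email, signed_up
    FROM bronze_signups
""")

# Gold: curated output, shaped for a decision, not for exploration.
conn.execute("""
    CREATE VIEW gold_signups_per_day AS
    SELECT signed_up AS day, COUNT(*) AS signups
    FROM silver_signups
    GROUP BY signed_up
""")

evidence_rows = conn.execute("SELECT COUNT(*) FROM bronze_signups").fetchone()[0]
gold = dict(conn.execute("SELECT * FROM gold_signups_per_day"))
print(evidence_rows)  # 3 -- the evidence survives in bronze
print(gold)           # {'2026-01-05': 1, '2026-01-06': 1}
```

If the gold number is ever challenged, you can walk backward through silver to bronze and show exactly where the duplicate disappeared.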
Other Data Architectures Worth Knowing
Medallion is not the only pattern worth understanding.
Dimensional modeling still works extremely well when reporting needs are stable and definitions change slowly. Ralph Kimball’s work on star schemas remains foundational for many BI teams.
https://www.kimballgroup.com/data-warehouse-business-intelligence-resources/dimensional-modeling-techniques/
Data Vault is powerful when auditability and historical tracking matter more than usability. It is heavy, but effective under regulatory pressure.
Semantic layers matter more every year as metrics logic moves out of dashboards and into governed definitions.
If you want a neutral comparison of these approaches, IBM’s overview of common data warehouse architectures is a solid reference.
https://www.ibm.com/topics/data-warehouse-architecture
The question is never which architecture is best.
It is which problem you are solving.
Why This Matters Beyond the Data Team
This is not just a data problem. It is an operating problem.
When metrics are unclear:
Decisions slow down
Meetings become arguments
Trust erodes quietly
I have seen leaders lose confidence not because results were bad, but because numbers could not be explained.
The same dynamic shows up in careers and teams. When roles are unclear, progress stalls. I wrote about that pattern in The Multi-Hat Survival Guide.
Clarity scales. Ambiguity compounds.
How to Tell If Your Model Is Solving the Wrong Problem
Ask yourself:
Can we trace this KPI back to raw data?
Is transformation logic centralized or duplicated?
Do different teams answer the same question differently?
If tracing fails, logic is duplicated, or answers differ, your model is not wrong. It is overloaded.
Separate concerns.
Name the layers.
Force decisions to flow from one place.
You do not need a new tool.
You need intent.
Final Thought
Most people do not fail at data because they lack technical skill. They fail because they never decide what the data is for.
I did not set out to build a Medallion architecture.
I set out to stop meetings from collapsing into arguments.
The architecture followed naturally.
That is usually how the best systems are built.
📚Further Reading
Data Warehouse Design Patterns and Best Practices – A practical overview of common warehouse design strategies like star, snowflake, and Data Vault modeling.
Data Warehouse Models Comparison – A side-by-side look at dimensional, normalized, galaxy, and other data models and when to use them.
Dimensional Modeling Explained – A concise background on star schema design, the most common analytics modeling approach.
Data Vault Concepts and Structure – A clear definition of Data Vault modeling components like hubs, links, and satellites.
Data Warehouse Architecture Overview – A vendor-neutral explainer on different architecture types, including lakehouse, mesh, and serverless patterns.
Data Warehousing Modeling on Databricks – An industry example of how modeling techniques like Data Vault and dimensional models fit into a modern lakehouse pipeline.
Medallion vs Data Vault Architecture Breakdown – A comparison of Medallion and Data Vault approaches and when each shines.
Cloud and Analytics Architecture Trends – A broader look at modern architectural trends like lakehouses, mesh, and serverless analytics.
TL;DR
Most data models fail because they are asked to do too many jobs at once
Architecture matters less than being clear about what each layer is for
I accidentally used Medallion-style thinking long before I knew the name
Separating evidence, logic, and decisions fixes more than new tools ever will
If your KPIs are debated instead of discussed, your model is overloaded