Is Data Modelling Dead?
If, like me, you've been embedded in the data world for a little while, you might have noticed an interesting trend: a complete lack of data modelling. No Kimball. No dimensions or facts. No 3NF. The view that modelling isn't required up front. Just get the data in there and let the reporting tools handle the rest.
With the rise of the "modern data stack", many organisations question the need for the structured, often slow-going modelling work of old. The methods developed by the likes of Ralph Kimball (star schemas and conformed dimensions) are sometimes branded as relics of a bygone era.
But is this right? Are these traditional data modelling techniques really obsolete or are we throwing away useful discipline and structure because modern tooling feels so fast and sexy? The short answer: yeah nah. The longer answer: it depends.
Why data modelling exists: structure, context and business meaning
At its core, data modelling is not just about technical design. It’s about creating shared meaning, predictable structure, and a stable environment on which both analysts and business users can rely.
It gives you:
- Clarity for the business audience. Dimensional modelling (facts, dimensions, star schemas) gives business users something intuitive to grasp: “Here are the measures (facts), here’s the context (dimensions)”. There’s a small sketch of this below.
- Consistency of definitions. Conformed dimensions (a Kimball concept) allow different business processes to share the same “customer” and “product” context.
- Predictability and performance. In earlier years, data warehouses were expensive, queries were slow, and modelling for structure had clear performance benefits.
- Governance and data quality safety net. When you model deliberately, you force upfront thought about grain, relationships, slowly changing dimensions, history, and so on. Things that are easy to ignore when you throw everything into a big flat dataset and deal with it later.
This is foundational stuff. If you don't put the thought in at the start, you'll surely pay for it later.
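To make that facts-and-dimensions idea concrete, here's a tiny sketch in Python with pandas. The table and column names (dim_customer, fact_sales and so on) are made up for illustration, not a recommended design.

```python
import pandas as pd

# Dimensions: descriptive context, one row per customer / product.
dim_customer = pd.DataFrame({
    "customer_key": [1, 2],
    "customer_name": ["Acme Ltd", "Beta Pty"],
    "region": ["NZ", "AU"],
})
dim_product = pd.DataFrame({
    "product_key": [10, 20],
    "product_name": ["Widget", "Gadget"],
    "category": ["Hardware", "Hardware"],
})

# Fact: measures captured at a declared grain (one row per order line).
fact_sales = pd.DataFrame({
    "customer_key": [1, 1, 2],
    "product_key": [10, 20, 10],
    "quantity": [3, 1, 5],
    "revenue": [30.0, 25.0, 50.0],
})

# The business "slices and dices": revenue by region and category.
report = (
    fact_sales
    .merge(dim_customer, on="customer_key")
    .merge(dim_product, on="product_key")
    .groupby(["region", "category"], as_index=False)["revenue"].sum()
)
print(report)
```

The point isn't the code; it's the shape. Measures sit in the middle, context sits around the outside, and anyone can answer “revenue by region” without guessing which table means what.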
The modern data stack: Have cloud tools changed the landscape?
The world has changed. Tools, platforms, and organisational expectations have shifted, which has put pressure on traditional modelling.
Some of the changes:
- Low-cost storage + compute and columnar engines. Platforms such as Snowflake, Databricks, and Synapse or Fabric make it cheap and fast to store large volumes of data and query broadly without worrying as much about every join and every normalisation decision.
- ELT over ETL. The shift from heavy upstream transformation (ETL) to landing raw or lightly-transformed data and doing transformations closer to consumption changes modelling needs.
- Semi-structured/unstructured data. Logs, JSON, XML, event streams; all of which don’t cleanly map to classical dimensional schemas. That means more flexibility is needed, and some organisations favour schema-on-read rather than schema-on-write.
- Faster time-to-value pressures. Business stakeholders expect quicker insights. Waiting months for a perfectly modelled data warehouse can feel slow and out of sync with agile business demands.
- Self-service BI. More non-technical users want to explore data quickly themselves, so teams feel under pressure to deliver data faster and with less friction.
These shifts have caused many organisations to rethink or bypass traditional modelling in favour of more agile, fluid pipelines. But this doesn’t automatically mean modelling is pointless. It just means we need to think differently.
Why some orgs are skipping modelling
If you’re hearing “we’re skipping modelling” in your organisation, you’re not alone. There are several reasons (some valid, some less so) why teams choose this route.
1. Misconception: “modern tools mean we don’t need a model”
Because platforms can handle large, wide, denormalised tables and because query performance is stronger, some believe that modelling is redundant. It's becoming more common to see the “one big table” approach rather than a star schema because of how modern architectures perform. But as my mum used to say, just because you “can” doesn't mean you “should”.
2. Skill shortages
Good data modelling takes expertise. Analysts, engineers, or architects who deeply understand grain, slowly changing dimension types, conformed dimensions, and so on aren’t always abundant. If your team lacks that expertise, you might skip modelling because “we can’t build it right anyway”.
3. Cost pressures and time-to-value
Modelling takes time and effort. If leadership is pushing for quick wins, you might land data fast with minimal modelling and iterate later (or probably never). The cost of upfront modelling (design sessions, workshops, documentation) can seem high compared to just landing data and letting analysts figure it out.
4. Unclear governance / blame culture
If the culture is “just deliver dashboards”, not “build data as a product”, modelling becomes low priority. When ownership is weak, it’s easier to skip refining models and treat the data warehouse as a raw dump with “some cleaning later”.
5. Agile-overload and context shifts
In fast-moving organisations, by the time you’ve modelled something carefully, the business question has changed. So teams say: “Why waste time modelling? Let’s just load and query.” That’s tempting but also risky, and is often a result of unclear business requirements rather than the question having truly changed.
So yes, skipping modelling might look attractive, but it comes at a price.
What they're missing out on
So what's the risk? What could go wrong by not investing, or investing less, in some proper modelling?
1. Loss of usability and trust
When you haven’t established clear definitions, business users can’t reliably understand or join data. One team’s “customer” might differ from another’s. Without conformed dimensions and a clear semantic layer, you risk multiple versions of the truth and fractured insights.
2. Technical debt and “data spaghetti”
Without modelling you end up with a messy pipeline where ad hoc joins, wide tables, duplicated data, and repetitive logic proliferate. Over time this becomes hard to maintain, understand, and modify. AKA tech debt.
3. Performance surprises
Yes, modern platforms are fast, but even so, when you have massive datasets, incomplete modelling can mean inefficient queries, large joins, redundant scans, and big compute bills. The structure of your data still matters.
4. Governance, compliance and audit issues
If you haven’t modelled for history (slowly changing dimensions), time (date dimensions), or consistent keys (surrogate keys vs natural keys), you may find reporting harder, lineage unclear, and compliance heavier.
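To show what “modelling for history” actually buys you, here's a rough sketch of a Type 2 slowly changing dimension update in Python with pandas. The column names and the apply_scd2 helper are hypothetical, not any platform's API; in practice this logic usually lives as set-based SQL in your warehouse.

```python
import pandas as pd

# Customer dimension with a surrogate key, natural key, and validity window.
dim_customer = pd.DataFrame({
    "customer_key": [1],          # surrogate key, generated in the warehouse
    "customer_id": ["C001"],      # natural key from the source system
    "region": ["NZ"],
    "valid_from": ["2023-01-01"],
    "valid_to": [None],
    "is_current": [True],
})

def apply_scd2(dim, customer_id, new_region, change_date):
    """Type 2 SCD: close the current row, add a new versioned row."""
    dim = dim.copy()
    current = (dim["customer_id"] == customer_id) & dim["is_current"]
    dim.loc[current, ["valid_to", "is_current"]] = [change_date, False]
    new_row = {
        "customer_key": dim["customer_key"].max() + 1,  # next surrogate key
        "customer_id": customer_id,
        "region": new_region,
        "valid_from": change_date,
        "valid_to": None,
        "is_current": True,
    }
    return pd.concat([dim, pd.DataFrame([new_row])], ignore_index=True)

# The customer moves region; history is preserved rather than overwritten,
# so last year's reports still attribute revenue to the old region.
dim_customer = apply_scd2(dim_customer, "C001", "AU", "2024-06-01")
print(dim_customer)
```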
5. Limited extensibility and reuse
Proper modelling encourages reusable dimensions and fact tables aligned with business processes. If you skip that, you may find every new use-case becomes a new pipeline rather than re-using existing structures. Think more “reinvent wheel” than “leverage asset”.
6. Analytic usability degraded
Imagine giving the business a blob of semi-structured tables and saying “go explore”. They may be able to, but with much more friction, less self-service, more handholding. Dimensional models help analysts and non-analysts alike more easily slice, dice, filter, and aggregate the data in meaningful ways.
7. Strategy misalignment
The reason the business invests in analytics is to drive decisions. If the data environment is unstructured, inconsistent or ad-hoc, the value chain from data to decision breaks. You may be doing analytics, but not enabling analytics at scale.
Skipping modelling is tempting, but you trade speed for structure, and you might pick up hidden costs.
Is there a pragmatic middle ground?
So what’s a senior data leader to do? The binary “traditional model vs no model” framing doesn’t help. The better path is a pragmatic middle ground: spend modelling effort where it matters, and lean on modern stack capabilities where they serve you.
Here are some thoughts:
1. Start with high-value, high-repetition areas
Pick business processes that are central and repeated (e.g., sales transactions, customer relationships, product catalogues) and apply dimensional modelling here. Don’t model everything upfront. Use an iterative approach. This aligns with modern agile practices but still ensures core consistency.
2. Use your modern ELT/ETL tooling to automate repeatable modelling patterns
Platforms like Databricks, Snowflake, and dbt allow you to automate and version transformations. You can embed dimensional model patterns into your pipelines, making the “up-front cost” lower. Dimensional modelling can still be a helpful design activity regardless of how you end up implementing the technical solution.
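As a rough illustration of what a repeatable pattern might look like, here's a hypothetical build_dimension helper sketched in Python with pandas. The names and the hashed surrogate key are assumptions for the sake of the example; in a real stack this would more likely be a dbt macro or a shared SQL model.

```python
import hashlib
import pandas as pd

def build_dimension(source: pd.DataFrame, natural_key: str, attributes: list[str]) -> pd.DataFrame:
    """Reusable pattern: deduplicate on the natural key and derive a
    deterministic surrogate key by hashing it."""
    dim = (
        source[[natural_key] + attributes]
        .drop_duplicates(subset=[natural_key])
        .reset_index(drop=True)
    )
    dim["surrogate_key"] = dim[natural_key].map(
        lambda k: hashlib.md5(str(k).encode()).hexdigest()[:16]
    )
    return dim

# The same pattern applied to any source extract, so each new dimension
# is one call rather than a bespoke pipeline.
raw_products = pd.DataFrame({
    "product_id": ["P1", "P2", "P2"],
    "product_name": ["Widget", "Gadget", "Gadget"],
    "category": ["Hardware", "Hardware", "Hardware"],
})
dim_product = build_dimension(raw_products, "product_id", ["product_name", "category"])
print(dim_product)
```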
3. Maintain key modelling discipline but adapt for flexibility
You might not model everything to 3NF or enforce strict star schema everywhere, but you can still adopt key disciplines:
- Define grain for fact tables (a simple grain check is sketched after this list).
- Define conformed dimensions when the business context is shared.
- Use surrogate keys for stability across changes.
- Capture business rules explicitly rather than letting them hide in code.
- Design for analytics (ease of querying) rather than purely for data capture.
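On that first point, “define the grain” becomes far more useful when you can test it. A minimal sketch, with a made-up fact table and grain:

```python
import pandas as pd

def assert_grain(fact: pd.DataFrame, grain: list[str]) -> None:
    """Fail loudly if the fact table has more than one row per declared grain."""
    duplicates = fact.duplicated(subset=grain).sum()
    if duplicates:
        raise ValueError(f"{duplicates} rows violate the declared grain {grain}")

# Declared grain: one row per order line (order_id + line_number).
fact_order_lines = pd.DataFrame({
    "order_id": [101, 101, 102],
    "line_number": [1, 2, 1],
    "quantity": [3, 1, 5],
})
assert_grain(fact_order_lines, ["order_id", "line_number"])  # passes silently
```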
4. Embrace a layered architecture
Many modern data stacks support a layered approach: raw layer (landing zone), curated layer, presentation layer. Or the medallion architecture if you prefer that. In the presentation layer you apply modelling and star schemas for consumption. The raw layer can be schema-light, schema-on-read. This gives you speed + structure.
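A toy illustration of the layering in Python with pandas (the function and column names are mine, not any particular platform's):

```python
import json
import pandas as pd

def raw_layer(events: list[str]) -> pd.DataFrame:
    """Landing zone: keep the payload as-is, schema-on-read."""
    return pd.DataFrame({"payload": events})

def curated_layer(raw: pd.DataFrame) -> pd.DataFrame:
    """Parse, type, and clean; still close to the source shape."""
    return pd.json_normalize(raw["payload"].map(json.loads).tolist())

def presentation_layer(curated: pd.DataFrame) -> pd.DataFrame:
    """Model for consumption: aggregate to the grain the business asks about."""
    return curated.groupby("customer_id", as_index=False)["amount"].sum()

events = ['{"customer_id": "C001", "amount": 30.0}',
          '{"customer_id": "C001", "amount": 25.0}']
print(presentation_layer(curated_layer(raw_layer(events))))
```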
5. Balance trade-offs consciously
Some parts of your data may never need full dimensional models (ad-hoc feeds, exploratory data). Use modelling selectively. But do so consciously: document the rationale and track the risk.
6. Align modelling with business value and analytics usage
If a dataset is getting heavy use, exposing it to many business users, supporting self-service, you’ll benefit more from structured modelling. If it’s a narrow internal feed used only by one team, you might accept lighter modelling.
7. Communicate the value
Part of why modelling sometimes gets skipped is that its value is opaque to stakeholders. Frame modelling as an enabler of scalable analytics (not as “boring design”). Think “we’re investing a little time now so we don’t spend a lot more time later”. And tie it to risks (consistency, duplication, cost).
8. Monitor and iterate
Use monitoring to detect when your “light modelling” approach is starting to hit friction (many joins, slow queries, duplicated data, support overhead). When you see those signals, it’s time to double down on modelling. Modern data stacks give you a vast amount of compute to throw at problems, but there’s a cost/benefit trade-off between continuing to throw money at more compute and investing the time in modelling.
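What that monitoring looks like depends on your platform, but as a hand-wavy sketch, imagine scanning a query history extract for the tell-tale signs. The columns and thresholds below are invented purely for illustration:

```python
import pandas as pd

# Hypothetical extract from a warehouse query-history view.
query_history = pd.DataFrame({
    "query_id": [1, 2, 3],
    "joins": [7, 2, 9],
    "runtime_seconds": [340, 12, 610],
    "bytes_scanned_gb": [900, 4, 1500],
})

# Crude friction signals: lots of joins, long runtimes, big scans.
friction = query_history[
    (query_history["joins"] > 5)
    | (query_history["runtime_seconds"] > 300)
    | (query_history["bytes_scanned_gb"] > 500)
]

# A growing share of friction queries suggests it's time to invest in
# modelling rather than simply buying more compute.
print(f"{len(friction)} of {len(query_history)} queries show friction signals")
```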
In short: there’s no need to cling, unmodified, to the modelling approach of the early data warehouse days, but nor should you discard modelling completely. Modern stacks give you more flexibility. Your aim is to blend flexibility with discipline.
The future of data modelling: evolution, not extinction
In my view, classic data modelling techniques like those advocated by Kimball are not dead. Their value in enabling clarity, consistency, and usability remains. But they are evolving in the face of modern data stacks, cloud platforms, faster time-to-value expectations, semi-structured data, and self-service analytics.
If you assume “we don’t need modelling anymore”, you risk building a data environment that looks like a kitchen with every ingredient dumped on the bench. Technically everything’s there, but it’s chaos when you’re trying to make something useful. But on the other hand, if you insist on modelling everything the same way you did 20 years ago (shhh, I'm not that old), you’ll slow down innovation and frustrate the business.
So it's about balance. Use modelling where it delivers value, especially in your core, high-usage areas. Leverage modern tooling to accelerate and automate modelling patterns. Accept lighter modelling when appropriate, but do so knowingly, and monitor for when deeper structure is needed.
In a medium or large organisation modernising its data platform, the wise move is to evolve your modelling practices. Think of it less as “modelling or no modelling” and more as “which level of modelling makes sense for this use case, given our platform, team skills, business priorities and data maturity”.
So yes, keep the star schema in your toolkit. Teach your team grain, fact tables, conformed dimensions, surrogate keys. But also keep your eyes open for when simplicity, flexibility, and speed should prevail. Data modelling isn’t about nostalgia or tradition. It’s about enabling reliable, scalable, and trustworthy data. The tools may change, but the need for shared understanding never will.