Your AI isn't broken. Your data model is.

last updated on Jun 08, 2026
This is a guest post from Dustin Dorsey, Senior Director of Data Engineering at phData. Dustin is a co-author of Unlocking dbt. He works with enterprise teams to build the data foundations that make AI reliable at scale.
There is a pattern showing up in organizations everywhere right now, and it is so consistent it almost feels scripted.
A team runs a proof of concept. The AI performs brilliantly. Questions that used to take days get answered in seconds. Executives are impressed. Someone uses the word "transformative." Budget gets approved, the rollout begins, and within a few months the same executives are quietly wondering why the thing keeps giving different answers to the same questions. The data team gets called into a meeting. Someone suggests they need a better model. Someone else suggests better prompts. A third person wonders aloud if they should evaluate a different vendor.
Almost nobody asks the question that actually matters: Is the problem in the AI, or is it in the data the AI is trying to reason over?
In most cases, it is the data. And more specifically, it is not a data quality problem. It is a data design problem.
Why the POC always works
The proof of concept works because it was designed to work.
This is not cynicism. It is just how POCs operate. When you stand up an AI pilot, you pick a domain you understand well. You choose datasets that have been curated and used repeatedly. You scope the questions narrowly enough that the answers are unambiguous. You have subject-matter experts in the room who course-correct when something looks off. And critically, you are working within a slice of your data where meaning has already been implicitly enforced through months or years of human use.
The AI is not discovering meaning in those scenarios. It is operating inside a perimeter where meaning was already established, and it is doing a very good job of navigating that perimeter quickly.
The problem is that this creates the impression that your organization is ready for AI at scale. It is not. It is ready for AI in the narrow, well-maintained corner of your data estate that you chose for the demo. The rest of your data is a different story.
When AI moves into production, the perimeter disappears. Real users ask questions that span domains. They phrase things differently. They ask about concepts that exist in three tables with slightly different definitions. They want to compare metrics that two separate teams calculate in two separate ways, and both of them call it the same thing. The AI, without a human expert in the room to catch the ambiguity, picks an interpretation and runs with it. Sometimes it picks correctly. Often it does not. And the frustrating part is that you cannot always tell which is which from looking at the output.
This is where confidence erodes. Not because the technology failed. Because the expectations were built on a foundation that was never as solid as the demo made it appear.
The human buffer no one talks about
Here is the uncomfortable truth that most AI discussions sidestep: your analysts have always been compensating for this problem.
Every time a business user asks "what was our revenue last quarter," an experienced analyst does not just run a query and send back a number. They instinctively clarify intent. They know that "revenue" means three different things depending on who is asking. They know the CFO wants recognized revenue on the accrual schedule; the sales team wants booked orders net of cancellations; and the operations team wants billed invoices for the period. They know which dataset is authoritative for each use case. They know which edge cases to handle and which filters to apply. They run the query, sanity-check the result against a number they already have a rough expectation for, and only then do they send it.
That entire process is invisible to the business. It looks like "the analyst ran a query." In reality, a highly experienced person is serving as a translation layer between a messy, ambiguous data environment and a decision that needs to be made.
AI does not have that translation layer. It cannot. It sees the raw structure of your data, reasons over what it finds, and produces an answer. If the structure is ambiguous, the answer will be inconsistent. Not randomly inconsistent, which would at least be easy to catch, but defensibly inconsistent. Inconsistent in ways where every answer it gives is technically justifiable based on what the data says.
That is the hardest kind of wrong to catch, because nothing looks broken. The query ran. The numbers came back. The dashboard loaded. The output just happens to be answering a slightly different version of the question than the one the business was asking.
One question, five defensible answers
Let me make this concrete.
"What was our revenue last quarter?"
In a typical enterprise data environment, this question has multiple technically valid answers. Revenue might exist at the transaction level, the invoice level, or the recognition level. It might include or exclude returns, internal transfers, or credits depending on who configured the pipeline and when. There might be a table in the CRM that tracks bookings, a separate table in the ERP that tracks invoices, and a reconciliation table in the finance system that is the authoritative source for period-close reporting. All of them have a revenue column. All of them have a date field. All of them will give you a number.
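To make the ambiguity tangible, here is a minimal Python sketch. Every table name, column name, and number in it is invented for illustration and does not come from any real system. It computes "revenue last quarter" four ways over three small hypothetical tables, and each result can be defended.

```python
from datetime import date

# Hypothetical sample rows. All names and values are illustrative assumptions.
crm_bookings = [
    {"order_id": 1, "booked_on": date(2026, 1, 15), "revenue": 1200.0, "cancelled": False},
    {"order_id": 2, "booked_on": date(2026, 2, 10), "revenue": 500.0, "cancelled": True},
    {"order_id": 3, "booked_on": date(2026, 3, 28), "revenue": 800.0, "cancelled": False},
]
erp_invoices = [
    {"order_id": 1, "invoiced_on": date(2026, 1, 20), "revenue": 1200.0},
    # Booked in the quarter, but billed after the quarter ended.
    {"order_id": 3, "invoiced_on": date(2026, 4, 2), "revenue": 800.0},
]
finance_recognized = [
    # Order 1 is assumed to be recognized in equal monthly amounts.
    {"order_id": 1, "period_end": date(2026, 1, 31), "revenue": 100.0},
    {"order_id": 1, "period_end": date(2026, 2, 28), "revenue": 100.0},
    {"order_id": 1, "period_end": date(2026, 3, 31), "revenue": 100.0},
]

Q_START, Q_END = date(2026, 1, 1), date(2026, 3, 31)


def quarter_total(rows, date_field, keep=lambda row: True):
    """Sum the revenue column for rows dated inside the quarter that pass `keep`."""
    return sum(
        row["revenue"]
        for row in rows
        if Q_START <= row[date_field] <= Q_END and keep(row)
    )


answers = {
    "bookings, gross": quarter_total(crm_bookings, "booked_on"),
    "bookings, net of cancellations": quarter_total(
        crm_bookings, "booked_on", lambda row: not row["cancelled"]
    ),
    "invoices billed": quarter_total(erp_invoices, "invoiced_on"),
    "revenue recognized": quarter_total(finance_recognized, "period_end"),
}

for definition, total in answers.items():
    print(f"{definition}: {total:,.2f}")
```

Four different totals come back from the same question, and nothing in the tables themselves says which one the business means by "revenue."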
A human analyst knows which one to use. They know it because they were told, or because they learned it the hard way, or because they asked and someone explained it in a meeting two years ago that was never documented. That knowledge lives in their head, not in the data.
Now ask an AI to answer the same question across all of those tables. It will pick one interpretation based on the structure it can see, the column names it recognizes, and whatever contextual signals exist in the prompt. Ask the same question worded slightly differently and it may pick a different interpretation. Ask it twice on different days and you may get two numbers that cannot be reconciled without knowing exactly which path each query took through your schema.
The AI is not making mistakes. It is doing exactly what you would do if you were handed a schema with no documentation and asked to answer a business question as fast as possible. It is making reasonable inferences. The problem is that reasonable inferences are not the same as business-defined answers, and at scale, the gap between those two things becomes very expensive.
Centralized storage is not the same as centralized meaning
Most organizations spent the last decade centralizing data. They moved from distributed data marts to cloud warehouses. They built pipelines. They established governance frameworks. They invested in tooling. By most measures of data infrastructure maturity, they are in a strong position.
What they did not centralize is meaning.
Centralizing storage answers the question of where data lives. Centralizing meaning answers the question of what that data represents and how it should be used. These are completely different problems, and solving the first one does not automatically solve the second.
When you bring data from multiple source systems into a single warehouse without establishing a shared interpretation of that data, you have not created clarity. You have created a larger surface area for ambiguity. You have more tables, more join paths, more definitions of the same concept, and more ways for a system trying to reason over that data to arrive at a different answer than the one you expected.
In a human-driven analytics environment, this ambiguity gets resolved through people. It gets resolved in the meeting where two teams present conflicting numbers and someone explains which calculation is correct for this context. It gets resolved through tribal knowledge that experienced analysts carry around and apply every time they touch a dataset. It gets resolved through the dashboard filter that is always set to "exclude refunds" even though there is nothing in the underlying table that enforces that rule.
AI cannot attend those meetings. It cannot acquire that tribal knowledge. It cannot apply that filter unless someone has encoded it into the data structure itself. The ambiguity that humans have learned to work around for years does not disappear when AI arrives. It becomes visible. It becomes consequential. And it becomes your most important data problem, regardless of how good your LLM is.
What actually needs to change
There is a version of this problem that gets solved by better prompts. If the ambiguity is narrow and well-understood, you can often describe it in the prompt and get consistent outputs. But that approach has a ceiling. You cannot prompt your way to consistency across a data estate where meaning is systematically implicit. At some point, the only real fix is to encode the meaning into the data itself.
This is what dimensional modeling is actually for, and it is why people who have been doing data engineering for a long time have been saying for years that the fundamentals still matter.
Dimensional models are not a legacy pattern for old-school BI tools. They are the structural mechanism for making business meaning explicit. They organize data around business processes rather than source systems. They separate what happened (facts) from the context required to understand it (dimensions). They declare grain. They make relationships intentional rather than inferred.
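As a rough sketch of what "declaring grain" and "intentional relationships" can look like, the Python example below builds a tiny star schema in an in-memory SQLite database. The table names (`dim_customer`, `fct_order_line`), the columns, and the data are hypothetical. A primary key states the grain (one row per order line), and a foreign key states how the fact relates to its dimension.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when asked
conn.executescript("""
CREATE TABLE dim_customer (
    customer_key  INTEGER PRIMARY KEY,
    customer_name TEXT NOT NULL,
    segment       TEXT NOT NULL
);

-- Declared grain: exactly one row per (order_id, line_number).
CREATE TABLE fct_order_line (
    order_id     INTEGER NOT NULL,
    line_number  INTEGER NOT NULL,
    customer_key INTEGER NOT NULL REFERENCES dim_customer (customer_key),
    net_amount   REAL NOT NULL,
    PRIMARY KEY (order_id, line_number)
);

INSERT INTO dim_customer VALUES (1, 'Acme', 'Enterprise'), (2, 'Birch', 'SMB');
INSERT INTO fct_order_line VALUES
    (100, 1, 1, 250.0),
    (100, 2, 1, 50.0),
    (101, 1, 2, 75.0);
""")


def is_rejected(statement):
    """Return True if the database refuses the statement on integrity grounds."""
    try:
        conn.execute(statement)
    except sqlite3.IntegrityError:
        return True
    return False


# A second row at the same grain is refused, so totals cannot silently double count.
duplicate_rejected = is_rejected("INSERT INTO fct_order_line VALUES (100, 1, 1, 250.0)")

# A fact that points at a customer the dimension does not know is refused too.
orphan_rejected = is_rejected("INSERT INTO fct_order_line VALUES (102, 1, 99, 10.0)")

# With grain and relationships declared, there is one obvious join path.
revenue_by_segment = dict(conn.execute("""
    SELECT d.segment, SUM(f.net_amount)
    FROM fct_order_line AS f
    JOIN dim_customer AS d USING (customer_key)
    GROUP BY d.segment
""").fetchall())

print(duplicate_rejected, orphan_rejected, revenue_by_segment)
```

A production warehouse may not enforce constraints the way SQLite does, in which case the same guarantees would typically be expressed as uniqueness and relationship tests in the transformation layer.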
The reason structure works where documentation and prompts cannot is worth stating directly. A documented definition of revenue gets ignored by the analyst who never found the Confluence page, and it is invisible to the AI that never reads documentation at all. A prompt can describe which table to use, but only for the one query you thought to write the prompt for. A modeled definition of revenue is enforced at query time, for every query, automatically, whether or not anyone remembers the rule exists. AI cannot read intent. It can only navigate structure. That is why the fix lives in the data layer, not in the prompt layer.
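Here is a small illustration of the difference between a rule that lives in someone's head and a rule that lives in the data. It assumes a hypothetical `raw_payments` table and a business rule of "exclude refunds," and it uses a plain SQL view in SQLite as a stand-in for a modeled layer. The names and numbers are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE raw_payments (
    payment_id INTEGER PRIMARY KEY,
    amount     REAL NOT NULL,
    is_refund  INTEGER NOT NULL  -- 1 = refund row, 0 = sale row
);
INSERT INTO raw_payments VALUES (1, 100.0, 0), (2, 40.0, 0), (3, -40.0, 1);

-- The business rule is written down once, in the structure itself.
CREATE VIEW fct_sales AS
SELECT payment_id, amount
FROM raw_payments
WHERE is_refund = 0;
""")

# Querying the raw table only gives the intended number if the caller remembers the filter.
raw_total = conn.execute("SELECT SUM(amount) FROM raw_payments").fetchone()[0]

# Querying the modeled view applies the rule whether or not the caller knows it exists.
modeled_total = conn.execute("SELECT SUM(amount) FROM fct_sales").fetchone()[0]

print(raw_total, modeled_total)
```

The sketch only helps if consumers, human or AI, are pointed at the modeled object rather than the raw table, which is a design choice about what you expose.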
When data is modeled this way, the questions AI can reliably answer expand dramatically. Not because the model becomes smarter, but because it has fewer opportunities to be wrong. The structure communicates intent. The interpretation is constrained. The answer space is bounded in ways that align with how the business actually defines its processes.
This is not about going back to a rigid schema that cannot accommodate modern analytical needs. It is about recognizing that flexibility without structure is not an asset when AI is doing the reasoning. AI needs guardrails in the data layer that tell it what things mean and how they relate to each other. Dimensional modeling provides those guardrails.
A question worth asking before you buy anything else
Before you evaluate a new model, hire a prompt engineer, or stand up another AI platform, ask your team one question: Can you point to a single authoritative dataset for each of your core business processes?
Not a general answer. A specific one. If someone asks "which table is the source of truth for customer lifetime value," is there an answer that everyone agrees on? If someone asks "how is an active customer defined," does the data enforce that definition, or does it live in someone's head and get applied inconsistently?
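One way to turn that question into something you can check is a simple audit over whatever catalog you keep. The sketch below assumes a hypothetical catalog format in which each dataset is tagged with the business process it describes and a flag for whether it is considered authoritative. The dataset names are made up. It reports every process that has no authoritative dataset, or more than one.

```python
from collections import defaultdict

# Hypothetical catalog entries; the structure and names are illustrative assumptions.
catalog = [
    {"dataset": "finance.fct_recognized_revenue", "process": "revenue", "authoritative": True},
    {"dataset": "crm.bookings", "process": "revenue", "authoritative": False},
    {"dataset": "marketing.clv_scores", "process": "customer_lifetime_value", "authoritative": True},
    {"dataset": "analytics.clv_v2", "process": "customer_lifetime_value", "authoritative": True},
    {"dataset": "product.active_users", "process": "active_customer", "authoritative": False},
]


def audit_sources_of_truth(entries):
    """Return processes whose count of authoritative datasets is not exactly one."""
    owners = defaultdict(list)
    for entry in entries:
        authoritative = owners[entry["process"]]  # registers the process even with no owner
        if entry["authoritative"]:
            authoritative.append(entry["dataset"])
    return {process: names for process, names in owners.items() if len(names) != 1}


gaps = audit_sources_of_truth(catalog)
for process, names in gaps.items():
    problem = "no authoritative dataset" if not names else "competing: " + ", ".join(names)
    print(f"{process}: {problem}")
```

An empty result would not prove the definitions are right, only that there is one agreed place to look for each of them.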
If you cannot answer those questions with confidence, the problem is not your AI. It is the foundation the AI is trying to reason over. And no amount of model tuning or prompt engineering is going to fix a foundation that was never designed to communicate business meaning in the first place.
The good news is this is a solvable problem. Organizations that invest in getting the foundation right do not just get better AI results. They get better analytics, better reporting, and better alignment across teams. The AI becomes an accelerant rather than an amplifier of existing confusion.
The organizations that skip this step and keep tuning the model instead of the data will find themselves in the same meeting six months from now, still trying to explain why the numbers do not match. And that meeting (the one where two teams present the same metric with different values and neither of them is technically wrong) is worth understanding in its own right. Because that scenario is not an edge case. It is the default state of most enterprise data environments, and AI is about to make it impossible to ignore.
If you want to go deeper on the structural conditions that need to exist before AI can operate reliably on your data, I've written a full white paper on this topic. Building the Foundational Layer for Reliable AI on Structured Data covers why dimensional modeling functions as trust infrastructure, what it actually means for data to be AI-ready, and why organizations that skip this foundation keep struggling in the same ways.
Where dbt and phData fit
This is exactly why phData and dbt fit so naturally together. dbt provides the implementation home for the kind of intentional, process-oriented data modeling this argument is built on, with model-layer structure, testing, documentation, and the dbt Semantic Layer giving teams a practical way to encode business meaning directly into the transformation layer. phData’s role is to help teams operationalize that in practice: aligning on definitions, designing models around real business processes, and turning the principles described here into systems that can actually be built, governed, and trusted in production.