Understanding MCP: The missing glue between governed data and AI agents

Daniel Poppy

on Jul 14, 2025

Imagine asking your best data analyst to make a decision—without giving them access to the source data or its documentation. That’s how most AI agents operate today: disconnected from the governed, trusted data that real business decisions rely on.

The result? Misleading outputs and missed insights—not because the model lacks intelligence, but because it lacks context.

Enter the Model Context Protocol (MCP): a new open standard designed to close this gap. In this article, we’ll explore how dbt’s integration with MCP helps AI agents access structured, governed data and metadata—bringing transparency, accuracy, and trust to AI-powered workflows.

What is the Model Context Protocol (MCP)?

The Model Context Protocol (MCP) is an open-source standard that enables AI agents to access structured, trusted data in real time without requiring custom connections.

MCP was released by Anthropic in November 2024. It replaces fragile, one-off data integrations with a shared method for connecting AI to any data source, such as databases, APIs, or business tools.

AI agents with vs. without MCP:

  • Without MCP: hardcoded, brittle integrations per tool; no awareness of data models or relationships; hallucinated metrics or inconsistent queries; heavy human oversight for validation.
  • With MCP: one standard interface for all data sources; discovery of model relationships, lineage, and metadata; governed metrics from dbt’s Semantic Layer; autonomous, accurate, and auditable responses.

MCP solves the problem of models being isolated from siloed data and legacy systems. It defines a standardized protocol with discovery, semantic, and execution tools that connect AI agents to data sources. Using these tools, AI agents pull live data from databases, APIs, and business applications via MCP servers.

These tools enable AI agents to explore data models, understand relationships, and perform analytics tasks. This provides accurate context without relying on hardcoded connections.
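Under the hood, MCP clients and servers exchange JSON-RPC 2.0 messages; a tool invocation uses the `tools/call` method with a tool name and arguments. The sketch below builds such a message for a dbt MCP Server tool. The message envelope follows the MCP specification, but the exact argument schema for `get_model_details` is an assumption here, so treat the field names as illustrative.

```python
import json


def make_tool_call(request_id: int, tool: str, arguments: dict) -> str:
    """Build a JSON-RPC 2.0 message of the kind an MCP client sends to a server.

    The "tools/call" method comes from the MCP specification; the tool name
    is one exposed by the dbt MCP Server. Argument keys are illustrative.
    """
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    })


# Ask the server for metadata about a single model (argument name assumed).
msg = make_tool_call(1, "get_model_details", {"model_name": "customers"})
```

Because every server speaks this same envelope, an agent that can emit `tools/call` messages can talk to any MCP-compliant data source without a bespoke integration.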

Given its utility, MCP has garnered strong industry support, with major players such as Google, Microsoft, and OpenAI backing the protocol.

Why AI needs access to governed data

Without structured, validated data, AI agents generate unreliable outputs or require labor-intensive manual oversight. Common challenges include:

  • Custom pipelines: These are hand-coded data flows, such as Python scripts or ETL jobs, that connect systems. They’re fragile; a small schema change can break them, causing silent errors or inaccurate results for AI systems.
  • LLM hallucinations: When AI models don’t have access to clearly defined, trusted metrics, they guess. For example, "monthly revenue" might be interpreted in different ways, sometimes including tax, discounts, or refunds. This creates inconsistent and unreliable answers.
  • Operational friction: In many teams, AI agents depend on data specialists to explain models, clarify terms, or write SQL queries. This creates delays in delivering insights and increases the risk of errors, since every step requires manual involvement.

This raises a critical challenge: How can AI agents reliably understand and use the rich, governed context of enterprise data?

Moreover, agents need to do so without relying on brittle, custom connections. That’s where an MCP server, like the dbt MCP Server, functions as the missing glue between governed data and AI.

The dbt MCP Server gives AI agents direct, structured access to your dbt project. A set of tools built into the dbt MCP Server enables a Large Language Model (LLM) to gain a deep understanding of your data models, documentation, and underlying metadata.

With these capabilities, MCP addresses the above challenges. It gives AI agents direct access to dbt’s Semantic Layer, lineage, and documentation. This enables AI agents to understand the structure and business logic of your data without requiring manual intervention.

AI agents gain structured visibility into how your business logic is defined and implemented.

For example, an LLM can query monthly revenue by region, using the exact definition in dbt’s revenue metric, ensuring that AI outputs align with organizational truth.

Inside the dbt MCP Server architecture

The dbt MCP Server translates LLM requests into dbt-native operations and returns structured context. Its architecture rests on three pillars:

Discovery tools

This layer includes methods like get_all_models, get_model_details, get_model_parents, and get_mart_models. These tools enable autonomous metadata ingestion, allowing AI agents to automatically explore project structure, relationships, and column-level lineage. This gives agents a complete contextual understanding of your data without manual effort.
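An agent typically chains these discovery calls: starting from one model, it repeatedly asks for parents until it has the full upstream graph. The sketch below uses a hypothetical in-memory dictionary to stand in for `get_model_parents` responses (the real tool is invoked over the protocol), and walks it iteratively.

```python
# Hypothetical stand-in for get_model_parents responses; in practice each
# lookup would be a tool call to the dbt MCP Server.
PARENTS = {
    "fct_orders": ["stg_orders", "stg_payments"],
    "stg_orders": ["raw_orders"],
    "stg_payments": ["raw_payments"],
}


def upstream_of(model: str) -> set[str]:
    """Collect every ancestor of a model by walking parent links."""
    seen: set[str] = set()
    stack = [model]
    while stack:
        for parent in PARENTS.get(stack.pop(), []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen
```

The same traversal pattern answers questions like "what's upstream of the customer table?" without the agent ever hand-writing SQL against information schemas.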

Semantic layer

The Semantic Layer provides governed analytics through tools such as list_metrics, get_dimensions, and query_metrics. AI systems can query validated business metrics and dimensions, like monthly revenue, directly from the source of truth defined in dbt’s Semantic Layer. This eliminates guesswork and reduces hallucinations.

Execution engine

The Execution engine drives operational orchestration. It uses commands like dbt run, dbt test, dbt compile, and dbt build. These tools enable seamless automation, allowing AI agents to trigger dbt runs, tests, or compile operations directly through conversational interfaces.

How it works

These pillars activate in a coordinated workflow when requests arrive:

Request ingestion & tool selection

A user or AI agent asks a question, like "Run tests for customer models," in an MCP-enabled client. The request is structured into a standardized protocol message, which is routed to the dbt MCP Server.

Context-aware routing

The server’s protocol layer acts as an intelligent dispatcher. It analyzes the request to select the optimal tool based on intent:

  • For metric-focused questions, such as "Show last quarter’s revenue trends," it activates the Semantic Layer. It then uses the query_metrics tool to ensure that the results are governed.
  • For dependency investigations, such as "What’s upstream of the customer table?", it uses Discovery Tools like get_model_parents to map lineage.
  • For operational commands, including "Test payment models," it triggers the Execution Engine to run dbt test.
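The dispatch logic above can be caricatured as keyword matching, even though the real server performs richer intent analysis. In this sketch, the tool names `query_metrics` and `get_model_parents` come from the article; `execute_dbt_command` is a hypothetical name for the Execution Engine entry point.

```python
def route(request: str) -> str:
    """Naive keyword-based dispatcher mirroring the three routing rules.

    A real server would use proper intent analysis; this only illustrates
    the mapping from request type to tool family.
    """
    text = request.lower()
    if any(word in text for word in ("revenue", "metric", "trend")):
        return "query_metrics"            # Semantic Layer
    if any(word in text for word in ("upstream", "lineage", "depends")):
        return "get_model_parents"        # Discovery Tools
    if any(word in text for word in ("run", "test", "compile", "build")):
        return "execute_dbt_command"      # Execution Engine (name assumed)
    return "get_all_models"               # fall back to broad discovery
```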

Instant context hydration

The server loads pre-compiled project knowledge, such as lineage graphs and metric definitions, from memory before execution. This eliminates slow data warehouse queries.
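Much of that pre-compiled knowledge already exists in dbt's `manifest.json` artifact, where each node records its dependencies under `depends_on.nodes`. A sketch of hydrating a parent map into memory from that artifact (the manifest structure is real; whether the server loads it exactly this way is an assumption):

```python
import json


def load_lineage(manifest_path: str) -> dict[str, list[str]]:
    """Pre-compile a parent map from dbt's manifest.json so lineage
    lookups are served from memory instead of warehouse queries."""
    with open(manifest_path) as f:
        manifest = json.load(f)
    return {
        unique_id: node.get("depends_on", {}).get("nodes", [])
        for unique_id, node in manifest.get("nodes", {}).items()
    }
```

Once hydrated, a lineage question is a dictionary lookup, which is what makes the server's responses feel instant.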

Permission-bound execution

Every action runs within strict guardrails:

  • Metric queries run through dbt Cloud APIs for certified results.
  • Commands like dbt run operate in isolated sandboxes.
  • Read-only access is enforced by default.

Real-time result streaming

The dbt MCP Server streams outcomes incrementally. Instead of static reports, it delivers insights as they arrive:

  • Pipeline test logs appear line-by-line, showing failures during execution.
  • Metric results populate column-by-column, helping detect patterns before queries finish.
  • Lineage maps expand node-by-node, visually tracing dependencies from core models to upstream sources.
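Incremental delivery like this maps naturally onto a generator: each result is yielded as soon as it is known, so a failing test surfaces before the run finishes. A minimal sketch of the pattern (the log format is invented for illustration):

```python
from typing import Iterator


def stream_test_results(results: list[dict]) -> Iterator[str]:
    """Yield formatted log lines one at a time, as a streaming server
    would, so failures surface before the full run completes."""
    for result in results:
        status = "PASS" if result["passed"] else "FAIL"
        yield f"{status} {result['test']}"
```

A client consuming this iterator can render each line immediately, rather than waiting for a complete report object.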

Real-world use cases powered by the dbt MCP Server

These use cases demonstrate how the MCP Server transforms dbt from a static data tool into a dynamic control plane for AI-driven operations.

Self-service analytics

Non-technical users can now explore dbt projects conversationally using natural language interfaces. For example, a business stakeholder can ask, “What models do we have for customer behavior?” They instantly retrieve a list of relevant models, their descriptions, and lineage context. This works through Discovery Tools like get_all_models and get_model_details. These tools make dbt metadata accessible and explorable without SQL knowledge.

AI agent workflows

Agentic systems can autonomously discover and map model relationships using tools such as get_model_parents and get_mart_models. These tools help understand how models connect to each other and which ones are used for reporting and business decisions. This allows them to dynamically reason through dbt projects and understand dependencies. Based on this context, AI agents can take informed actions, such as investigating upstream changes or mapping column-level lineage, without human prompting.

AI-accelerated dbt development

Beyond simple queries, the dbt MCP Server enables advanced AI-driven development workflows. An AI agent can proactively identify and refactor dbt models, ensuring models align with best practices.

For example, an agent can refactor a model to reference intermediate models instead of raw staging tables. It can automatically generate new models or update existing ones, taking the project's dependencies into account.

If the agent runs a model and detects errors, it can automatically analyze the issues, such as join logic failures. It can then suggest or apply fixes directly. This capability transforms dbt into a dynamic partner in your analytics development lifecycle (ADLC).

Trusted data analysis

LLMs often hallucinate when disconnected from ground truth. The MCP Server enables AI systems to query canonical metrics and dimensions like monthly revenue or active users. The server uses dbt's Semantic Layer tools to ensure that outputs strictly reflect definitions within your dbt project, enhancing reliability and trust.

Accelerated dbt operations

AI agents can now drive operational workflows using familiar dbt command line interface (CLI) commands: run, test, compile, and build. The MCP Server acts as a secure bridge between prompt-driven interactions and backend operations. For example, an agent can be triggered to “run daily models” or “compile staging layer,” running safely within scoped environments. That means they run in isolated, permission-controlled contexts that protect production systems.

Across these use cases, the dbt MCP Server ensures:

  • Governance at scale: AI agents operate using standardized semantic definitions, maintaining consistency across reporting workflows.
  • Operational efficiency: Automating builds, tests, and queries removes the need for manual intervention, saving time and reducing human error.
  • Collaboration: Both humans and AI systems interact with the same trusted data context, eliminating silos and improving decision-making.

Together, this multi-pronged value proposition illustrates why the MCP Server is indeed the “missing glue” connecting governed data with AI agents in a trusted and scalable manner.

Limitations and security in using the dbt MCP Server

As an experimental release, the dbt MCP Server presents certain limitations and requires careful security considerations for effective and safe deployment.

Tool selection requires human oversight

During testing, AI models occasionally cycle through unnecessary tools. They might call get_all_models before narrowing to get_mart_models, or pick incorrect tools for requests.

While users can correct this with specific prompts, it currently undermines fully autonomous operation. This behavior stems from the protocol's immaturity. However, it will improve through community feedback and updates.

Uncontrolled SQL execution poses risks

Freeform SQL in the MCP Server supports flexible exploration. However, it bypasses dbt’s semantic safeguards. Uncontrolled use can lead to incorrect results and costly warehouse queries.

dbt Labs recommends limiting SQL tools to sandbox environments. For trusted, production-grade insights, always prefer Semantic Layer tools like query_metrics that enforce certified logic.
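That recommendation can be enforced in code: route metric requests to the Semantic Layer, and refuse freeform SQL outside a sandbox. In the sketch below, `query_metrics` and `list_metrics` are the tools named in this article, while `execute_sql` is a hypothetical name for a freeform-SQL tool.

```python
def choose_tool(request: dict, production: bool) -> str:
    """Prefer governed Semantic Layer tools; only permit freeform SQL
    outside production, per the guidance above."""
    if request.get("metric"):
        return "query_metrics"          # governed, certified logic
    if request.get("raw_sql"):
        if production:
            raise PermissionError("freeform SQL is restricted to sandboxes")
        return "execute_sql"            # hypothetical freeform-SQL tool name
    return "list_metrics"               # default to governed discovery
```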

Prototype before scaling

The current dbt MCP Server is experimental. Begin with limited, low-risk use cases, such as metadata queries, sandboxed metric checks, or lineage tool testing in development environments.

Only scale after proving value. This isn’t production-ready—prototype first, then expand.

Strict permission scoping is essential

Running tools like dbt run carries inherent risks. Mitigate them through:

  • Command disabling: Block risky operations via flags like DISABLE_DBT_CLI=true.
  • Ephemeral environments: Isolate runs in containers that self-destruct post-run.
  • Granular role-based access control (RBAC): Restrict dbt tokens to specific projects and environments.
  • Read-first policy: Enforce read-only access until safety is proven.

Always audit tool call logs to detect prompt injection or misuse.

How the dbt MCP Server shapes the future of AI-driven data access

The dbt MCP Server is the missing glue that will fundamentally change how AI interacts with your data. It paves the way for AI to drive both business intelligence and data engineering directly. This means AI will deeply understand your dbt projects, transforming how data is used and built.

Data teams will increasingly focus on creating this rich, governed context that feeds into the server. This shifts their role towards making data highly understandable and trustworthy for AI. Giving AI systems access to structured context through dbt establishes a solid and lasting part of the modern data stack.

A key promise of the dbt MCP Server is enabling safe and reliable data access. It offers built-in security features and access controls. Ultimately, dbt is becoming the central data control plane for AI. This ensures AI agents can access structured data reliably, making every AI-driven insight consistent and aligned with organizational truth.

Ready to get started with the dbt MCP Server?

The dbt MCP Server is an experimental release that shapes the future of AI-driven data access. It’s available now on GitHub for prototyping AI agents that will benefit from a deep understanding of how your data is structured and used.

To get started:

  • Explore the repository on GitHub for installation instructions.
  • Connect your dbt project to begin experimenting with AI-powered data workflows.
  • Join the conversation in the dbt Community Slack's #tools-dbt-mcp channel to share your findings.

Published on: Jun 16, 2025
