How AI is changing the analytics stack

last updated on Nov 05, 2025
Today’s cloud-native data architectures have enabled cost-efficient warehousing, real-time processing, and a new level of self-service analytics.
But most enterprises find themselves running up against a data ceiling. Deriving insights from today’s data stack takes a huge productivity toll on everyone involved.
AI is about to change everything for the analytics stack.
Our fragmented reality
For years, extracting insights from enterprise data has meant wrestling with a patchwork of tools: BI dashboards, SQL editors, data warehouses, ETL pipelines, and governance systems, each with its own interface and learning curve.
Most business users lack the technical skills to use these tools independently. That forces a dependency on analysts or engineers. The result is reports that arrive days or weeks later, often with insights that have already gone stale.
The numbers paint a stark picture. Enterprise analytics teams now work across an average of 400 data sources. Nearly one in five enterprises juggle more than 1,000. A 2024 industry survey revealed that more than 70% of data teams rely on 5 to 7 different tools just to get through daily workflows. About 10% juggle more than 10.
The productivity toll is measurable. In the 2025 State of Analytics Engineering Report, 57% of analytics and data professionals said they spent most of their time maintaining or organizing datasets. That’s the same level as the prior year, despite 70% using AI to help write code and documentation.
Data scientists spend about 60% of their time cleaning and organizing data, and another 19% gathering datasets. This leaves roughly 20% for actual analysis and insight generation.
Between 60% and 73% of all enterprise data never gets used for analytics. Some 83% of organizations suffer from data silos, and 97% believe those silos hurt performance. Many users don't even know what data exists within their organization, let alone how to access or apply it.
This fragmented reality is where most enterprises find themselves today. And this is where AI can help.
From dashboards to dialogue
For much of the modern data era, business intelligence has been defined by static dashboards and the technical expertise required to navigate them. Pulling meaningful insights often meant waiting for scarce data analysts to run SQL queries or create tailored reports. This system empowered only those with the tools and training to interpret raw datasets.
This bottleneck is now easing. AI-powered conversational analytics lets users ask questions in plain English and iterate in real time while preserving definitions and controls behind the scenes. The global conversational AI market hit $13.2 billion in 2024 and is projected to reach $49.9 billion by 2031, growing at nearly 25% annually.
Gartner predicts that by 2025, natural language will be the main way people interact with data systems. That change alone is expected to drive a 100x surge in data usage across organizations. As access improves, the value of data doesn't just rise—it multiplies.
Beyond conversational analytics
Beyond one-off questions, agentic AI coordinates multi-step work: planning, writing code or SQL, running checks, and proposing changes. Research from Capgemini found that 50% of enterprises plan to implement AI agents in 2025, with adoption expected to reach 82% by 2028.
User expectations have shifted with tools like ChatGPT, Claude, Gemini, and Microsoft Copilot. The natural language interface has become not only a standard for AI systems but also a key feature for many traditional applications.
As major AI developers roll out new capabilities to enormous user bases, expectations are normalizing around systems that don't just answer—they act within guardrails.
A glimpse into the future with agentic development
Modern AI IDEs point to where interfaces are heading. Consider what this could look like for data engineering. You've been assigned to build a weekly ETL pipeline to aggregate customer activity, enforce data quality standards, calculate summary metrics, and push the final output into production.
In an agentic system, you define your goal in natural language: "Create a weekly customer activity ETL pipeline. Include data quality checks for nulls and duplicates, calculate weekly active users, and push summary tables to the analytics warehouse."
From that point, the AI agent gets to work. It scans your project, considering schema definitions, naming conventions, current pipeline structures, and warehouse configuration. It outlines a detailed plan—creating a new model, drafting Python scripts for anomaly detection, and preparing orchestration configs aligned with your tech stack.
Once you approve, the agent transitions into automated development. It writes SQL aggregation logic, test scripts, continuous integration/continuous deployment (CI/CD) configurations tailored to your stack, and optional README updates. Everything is formatted to match your project's style guidelines.
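To make the development step concrete, here is a minimal sketch of the kind of artifact such an agent might generate for the weekly-active-users aggregation. The table and column names (`events`, `user_id`, `event_at`) are illustrative assumptions, and an in-memory SQLite database stands in for the analytics warehouse; an agent working in your project would emit SQL shaped by your own schemas and conventions.

```python
# Illustrative sketch only: the schema, table names, and the SQLite stand-in
# are assumptions, not the output of any specific agent or product.
import sqlite3

WEEKLY_ACTIVE_USERS_SQL = """
    SELECT
        strftime('%Y-%W', event_at) AS activity_week,   -- year-week bucket
        COUNT(DISTINCT user_id)     AS weekly_active_users
    FROM events
    WHERE user_id IS NOT NULL                           -- basic quality guard
    GROUP BY activity_week
    ORDER BY activity_week
"""

def build_weekly_summary(conn: sqlite3.Connection) -> list[tuple[str, int]]:
    """Run the aggregation and return (week, weekly_active_users) rows."""
    return conn.execute(WEEKLY_ACTIVE_USERS_SQL).fetchall()

if __name__ == "__main__":
    # In-memory database with a few sample events, standing in for the warehouse.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (user_id TEXT, event_at TEXT)")
    conn.executemany(
        "INSERT INTO events VALUES (?, ?)",
        [("u1", "2025-01-06"), ("u2", "2025-01-07"), ("u1", "2025-01-13"), (None, "2025-01-13")],
    )
    for week, wau in build_weekly_summary(conn):
        print(week, wau)
```

In practice the agent would also generate the matching tests and orchestration configs described above, not just the query itself.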
Next comes execution and validation. The agent runs the full pipeline in a sandboxed environment, executes the SQL, initiates data quality scripts, and runs your test suite. This real-time feedback loop ensures problems are caught before human review.
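The validation step might look something like the following: lightweight null and duplicate checks run against the pipeline's output before anything is promoted. The function names, column names, and sample data here are hypothetical, and pandas is assumed purely for illustration.

```python
# Hypothetical quality checks of the kind an agent might run in a sandbox
# before opening a pull request. Column names and sample data are assumptions.
import pandas as pd

def check_nulls(df: pd.DataFrame, columns: list[str]) -> list[str]:
    """Return a failure message for every listed column that contains nulls."""
    return [
        f"column '{col}' has {int(df[col].isna().sum())} null value(s)"
        for col in columns
        if df[col].isna().any()
    ]

def check_duplicates(df: pd.DataFrame, key_columns: list[str]) -> list[str]:
    """Return a failure message if the key columns do not uniquely identify rows."""
    dupes = int(df.duplicated(subset=key_columns).sum())
    return [f"{dupes} duplicate row(s) on key {key_columns}"] if dupes else []

if __name__ == "__main__":
    activity = pd.DataFrame(
        {"user_id": ["u1", "u2", "u2", None], "activity_week": ["2025-01"] * 4}
    )
    failures = check_nulls(activity, ["user_id"]) + check_duplicates(
        activity, ["user_id", "activity_week"]
    )
    # A real agent would surface these failures instead of promoting the run.
    print("\n".join(failures) or "all checks passed")
```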
Finally, when the pipeline passes all checks, the agent handles deployment—opening a pull request or directly pushing updates to your orchestration layer.
Why structured context is critical
For agentic systems to operate safely and autonomously in complex environments, you need more than instructions and advanced AI. You need structured data and metadata—the schemas, semantics, relationships, permissions, and lineage that describe how your data works.
Generative AI (GenAI) has earned its spotlight largely for what it can do with unstructured data. But the emergence of agentic AI has pushed enterprise companies to rethink their entire data foundation.
In this shift, structured context moves from "nice to have" to "non-negotiable": without it, agents can't function safely or effectively, and enterprise automation stalls.
The two critical integrations
An AI system must integrate with:
- Structured data: The rows in your data warehouse or lakehouse from customer relationship management (CRM), enterprise resource planning (ERP), human capital management (HCM), and other enterprise systems.
- Structured metadata: Data about models, sources, lineage, dependencies, tags, and governance rules, which together form a map of your data ecosystem.
With structured data and metadata, AI agents can discover tables, understand relationships, check quality, and plan safe actions. This enables powerful conversational analytics powered by planning, reasoning, and autonomous decision-making. But it also includes governance—policies and permissions embedded in metadata that determine who can access which datasets, which transformations are permitted, and how sensitive fields must be handled.
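As a rough illustration, structured metadata can be pictured as a machine-readable record like the one below: sources, lineage, column-level tests, and governance rules attached to a single model. Every field name and value here is an assumption for illustration, not the schema of any particular catalog or tool.

```python
# Illustrative sketch of a metadata record an agent could consult.
# Field names and values are assumptions, not a specific catalog's format.
CUSTOMER_ACTIVITY_METADATA = {
    "model": "weekly_customer_activity",
    "sources": ["crm.contacts", "product.events"],          # upstream systems
    "lineage": {
        "upstream": ["stg_crm_contacts", "stg_product_events"],
        "downstream": ["exec_dashboard"],
    },
    "columns": {
        "user_id": {"tests": ["not_null", "unique_per_week"], "pii": False},
        "email":   {"tests": ["not_null"], "pii": True},
    },
    "governance": {
        "owners": ["analytics-engineering"],
        "allowed_roles": {"read": ["analyst", "agent"], "write": ["agent"]},
        "pii_handling": "mask_before_read",
    },
    "freshness_sla_hours": 24,
}
```

With a record like this, an agent can answer "what feeds this model, who owns it, and what am I allowed to do with it" without guessing.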
Structured context equips agents with three key capabilities:
- Memory via metadata, so they know what assets exist and how they relate
- Boundaries via clear definitions, permissions, and rules so they don't wander outside guardrails
- Actions via validated tools, so they can read and write safely
When you combine these, agents evolve from chatbots into reliable teammates. They can plan, reason, and execute tasks at scale. We're already seeing agents autonomously modify data pipelines, fix errors, manage migrations, and spin up new data products—all driven by structured inputs and aligned with business logic.
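One way to picture how boundaries and validated actions fit together is a guarded tool: the agent never queries data directly, only through functions that consult governance metadata first. The function, exception, role names, and masking behavior below are hypothetical; this is a minimal sketch under those assumptions, not a reference implementation.

```python
# Hypothetical guarded "read" tool: the agent acts only through functions that
# enforce the permissions and PII rules recorded in metadata.
from typing import Any

class BoundaryViolation(Exception):
    """Raised when a requested action falls outside the agent's boundaries."""

def read_model(metadata: dict[str, Any], rows: list[dict[str, Any]], role: str) -> list[dict[str, Any]]:
    """Return rows from a model, masking PII columns, if the role may read it."""
    if role not in metadata["governance"]["allowed_roles"]["read"]:
        raise BoundaryViolation(f"role '{role}' may not read {metadata['model']}")
    pii_columns = {col for col, spec in metadata["columns"].items() if spec.get("pii")}
    return [
        {col: ("***" if col in pii_columns else value) for col, value in row.items()}
        for row in rows
    ]

if __name__ == "__main__":
    meta = {
        "model": "weekly_customer_activity",
        "columns": {"user_id": {"pii": False}, "email": {"pii": True}},
        "governance": {"allowed_roles": {"read": ["analyst", "agent"]}},
    }
    sample = [{"user_id": "u1", "email": "u1@example.com"}]
    print(read_model(meta, sample, role="agent"))   # email comes back masked
    # read_model(meta, sample, role="marketing")    # would raise BoundaryViolation
```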
When AI gets it wrong
Weak governance and poor inputs have predictable consequences.
Studies estimate that between 70% and 80% of AI projects don't succeed. That's nearly double the failure rate of traditional IT projects.
In most cases, it comes down to bad data. Gartner puts the average cost of poor data quality at around $12.8 million per year per organization. Some companies lose as much as 6% of annual revenue from flawed AI outputs.
High-profile examples show the impact. A major airline was taken to court after its chatbot promised a bereavement fare refund that didn't exist. The airline had to pay the customer over $600.
Lawyers have faced sanctions for submitting briefs that cite fictional cases. News sites have published AI-generated travel content directing readers to unsafe destinations. And OpenAI's newer models have been reported to hallucinate more than their predecessors, with rates reaching 33% and 48% on some benchmarks.
Regulatory considerations
Regulators are making guardrails explicit. Under the EU AI Act, especially Articles 10 and 27, organizations face serious compliance risks.
Article 10 requires that high-risk AI systems be trained on datasets that are relevant, representative, and, to the best extent possible, complete and free of errors. Organizations must document everything: data sources, annotation methods, quality checks, and bias mitigation.
Article 27 requires Fundamental Rights Impact Assessments that examine fairness, dignity, and non-discrimination. Companies must map data flows, retention policies, oversight mechanisms, and risk mitigation steps, sometimes reporting to regulators.
The EU AI Act makes clear that data quality and governance are legal obligations that need to be baked into every layer of the AI pipeline. Real compliance means engineering a governance framework that is automated, auditable, and built to scale.
The industry response
The technology industry is actively working to fix what's broken. Vendors are converging on a context-first, governance-forward model. Quality and policy controls are moving closer to where queries run and transformations execute.
Data quality and observability are receiving an AI boost. Vendors are adding AI and ML capabilities to detect anomalies or quality issues in real time. New features in cloud data platforms can automatically monitor freshness, null spikes, or other data health metrics and alert teams before issues propagate downstream.
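As a minimal sketch of what such monitoring involves, the checks below flag a stale table and a sudden rise in null rates. The thresholds, metric names, and alerting behavior are assumptions; real platforms compute these signals from warehouse metadata and query logs rather than hand-passed values.

```python
# Minimal sketch of freshness and null-spike monitoring. Thresholds, names,
# and the alerting behavior are assumptions for illustration.
from datetime import datetime, timedelta, timezone

def check_freshness(last_loaded_at: datetime, max_lag_hours: float = 24.0) -> str | None:
    """Alert if the table has not been loaded within its freshness SLA."""
    lag = datetime.now(timezone.utc) - last_loaded_at
    if lag > timedelta(hours=max_lag_hours):
        return f"freshness alert: last load was {lag.total_seconds() / 3600:.1f}h ago"
    return None

def check_null_spike(null_rate_today: float, null_rate_baseline: float, tolerance: float = 0.05) -> str | None:
    """Alert if today's null rate rises more than `tolerance` above the baseline."""
    if null_rate_today - null_rate_baseline > tolerance:
        return f"null-spike alert: {null_rate_today:.1%} vs baseline {null_rate_baseline:.1%}"
    return None

if __name__ == "__main__":
    alerts = [
        check_freshness(datetime.now(timezone.utc) - timedelta(hours=30)),
        check_null_spike(null_rate_today=0.12, null_rate_baseline=0.02),
    ]
    for alert in filter(None, alerts):
        print(alert)   # a real system would page a team or open an incident
```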
The changing role of data engineers
This deeper integration of intelligence signals dramatic changes for data engineers. Tasks that once defined the profession—building ingestion pipelines, wrangling schemas, writing ETL logic, monitoring data flows—will increasingly be handled by intelligent systems.
Yet this evolution doesn't mean data engineers are becoming obsolete. As routine responsibilities fade, engineers are shifting focus to more strategic concerns. They're designing resilient and adaptive data architectures, validating the integrity and semantics of AI-driven pipelines, maintaining rigorous standards for data quality, and embedding ethical and compliance principles into core infrastructure.
Data engineers are evolving from system operators to system stewards. The work now demands fluency in AI-native tools, semantic data modeling, governance strategy, and the supervision of autonomous agents. The shift isn't about doing less—it's about doing more of what truly matters.
For this new era to become reality, there must be a solid foundation where LLMs effectively interact with structured data and metadata. Otherwise, capabilities remain shallow, not grounded in the relevant data, processes, and workflows of the enterprise.