How Zscaler cut PR review time by 90% using dbt context and multi-agent AI (OpenAI)
Zscaler built an AI-powered, multi-agent PR review system that uses dbt’s structured context (metadata + lineage + CI signals) to automate governance at scale. The result: 90% less reviewer time and a projected 2,100 engineering hours saved annually.
Zscaler is a leading cloud-based cybersecurity company and a pioneer of the “Zero Trust” security architecture, protecting thousands of organizations from cyberattacks and data loss.
As the company expanded, so did its enterprise data platform (built on Snowflake, dbt, and Matillion). One area was hit hardest by that scale: pull request reviews. Governance was essential, but manual review became the bottleneck.
We'll explore how Zscaler's data team built PRISM (PR Review Intelligence System Mentor), a multi-agent PR review system powered by OpenAI and MCP tools (dbt, GitHub, and Snowflake). PRISM turns governance into an automated agent workflow, transforming Zscaler's pull request (PR) process and reducing reviewer time by 90%.

The double-edged sword of self-service
Like many companies, Zscaler started with a centralized model. Every data request flowed through a single data team to ensure consistency and quality control.
But as Zscaler grew, the data team struggled to keep up with data requests. Simple data pulls took weeks — a frustrating experience that undermined stakeholder trust.
To help the team scale, Zscaler transitioned to a self-service analytics model with dbt. The central data team became a center of enablement focused on building foundational data layers.

Initially, the shift significantly increased data velocity. Business teams were empowered to create their own transformations and took greater ownership of their analytics.
But self-service created a new problem: governance didn’t scale.
Two forces collided:
- PR volume exploded. As the number of contributors and dbt models grew, the data team was inundated with 900–1,000 PR reviews every quarter, each requiring careful evaluation.
- Review complexity grew. Every manual review was time-consuming and required a deep understanding of pipelines that the reviewers hadn’t built: were freshness and data quality tests defined? Did models include proper documentation? Would changes break downstream dashboards? The data team found itself constantly context-switching and devoting hours to educating contributors on best practices.
“We thought we had built a Self-Service Paradise. But enabling self-service can be a double-edged sword,” reflects Rahan Raman, Head of Enterprise Data Platform at Zscaler. “It turned out we had turned the data team into a help desk for peer reviews.”
Self-service had solved the velocity problem. But without a sustainable way to enforce standards, governance had become the new bottleneck.

Building a multi-agent PR review system that runs on dbt’s structured context
To build their PR agent, Zscaler used a LangGraph-based multi-agent orchestrator to automate code reviews for dbt models.
Unlike generic coding agents, their PR agent works because it’s a domain-aware reviewer: it understands Zscaler’s warehouse, dbt project structure, dependencies, and standards because dbt exposes that context.
“AI can help automate governance, reduce review burden, and educate contributors, but without real context, it’s just noise,” says Rishi Varahagiri, Senior Data Engineer at Zscaler. “When AI has the right dbt context (lineage, CI performance metrics, and validated compilation), it can give targeted, meaningful PR review feedback instead of generic suggestions.”
The context layer: what dbt gives the agents to reason with
The system gathers dbt structured context by pulling from four key sources:
- dbt Discovery APIs, which expose downstream lineage for every model in a PR and establish dependency context (what’s upstream, what’s downstream, what could break).
- dbt CI jobs, which automatically validate changes on every PR and provide performance signals like execution time, bytes scanned, and partitions scanned—turning CI into a baseline for optimization decisions.
- dbt Cloud APIs, which compile and validate AI-generated suggestions before they’re surfaced to developers—reducing the risk of “hallucinated” changes.
- Snowflake Query Insights (query plans and operator statistics), which show how queries execute and where time and resources are spent—ground truth for performance tuning.
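Conceptually, the assembly of these four sources looks something like the following. This is a minimal, dependency-free sketch; the `PRContext` shape and the lookup names are illustrative assumptions, not Zscaler's actual schema or the real API clients.

```python
from dataclasses import dataclass, field

@dataclass
class PRContext:
    """Structured context handed to the review agents (assumed shape)."""
    lineage: dict = field(default_factory=dict)         # model -> downstream deps
    ci_metrics: dict = field(default_factory=dict)      # runtime, bytes scanned
    query_insights: dict = field(default_factory=dict)  # Snowflake plan stats

def gather_context(pr_models, lineage_lookup, ci_metrics, query_profiles):
    """Merge per-model signals from the dbt APIs, CI job, and Snowflake."""
    ctx = PRContext()
    for model in pr_models:
        # dbt Discovery API: what's downstream and could break?
        ctx.lineage[model] = lineage_lookup.get(model, [])
        # dbt CI job: performance baseline (execution time, bytes scanned)
        ctx.ci_metrics[model] = ci_metrics.get(model, {})
        # Snowflake query insights: where time and resources are spent
        ctx.query_insights[model] = query_profiles.get(model, {})
    return ctx
```

The point of collapsing everything into one context object is that every downstream agent reasons over the same validated inputs, rather than each agent fetching its own.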
The end-to-end workflow: specialized agents that review, enforce, and improve PRs
With context in hand, let’s look at how it comes together inside their multi-agent system.
Everything starts with the pull request. When a developer opens a PR for a dbt model change, the context collection process begins immediately. Their agent pulls:
- The file diffs (what changed)
- The dbt CI job execution results (did it run end-to-end? What were the metrics?)
- Query execution insights from Snowflake (what does the plan show? where are costs concentrated?)
If the dbt CI job fails, the process stops there. No “AI review” layered on top of a failing build—CI is the gate.
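The gating rule is simple but important, and can be stated in a few lines (the status string is an assumption for illustration):

```python
def should_run_agents(ci_status: str) -> bool:
    """CI is the gate: no AI review layered on top of a failing build."""
    return ci_status == "success"
```

Everything downstream of this check can then trust that it is reasoning about code that at least compiles and runs end-to-end.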
Once that context is assembled, the LangGraph-based multi-agent orchestrator takes over. Each agent is specialized and performs a specific task:
- Linter agent: Automatically checks structural best practices: naming conventions, SQL & Python formatting, folder structure.
- Governor agent: Enforces governance requirements: documentation, tags, metadata policy, owner, groups, and checks for critical patterns like missing freshness configuration or incomplete documentation.
- Impact analyzer: Maps downstream lineage and shares exactly what’s impacted—including dependent models and dashboards.
- Optimizer-tester-healer trio: refactors long-running queries only when necessary and offers performance improvements based on CI and warehouse execution signals.
- Test reviewer agent: Ensures before-and-after results match when optimized code is proposed.
- Self-healing agent: Fixes code errors and bugs in generated or refactored SQL and hands it back to validation.
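Zscaler orchestrates these agents with LangGraph; the dependency-free sketch below shows the same shape without that library. All agent names, rules, and the state dictionary are illustrative placeholders, not the production logic.

```python
def linter(state):
    # Structural best practices; here just a placeholder file-type check
    offenders = [f for f in state["diffs"] if not f.endswith((".sql", ".py"))]
    state["comments"] += [f"linter: unexpected file {f}" for f in offenders]
    return state

def governor(state):
    # Governance requirements; here just one placeholder documentation rule
    if not state.get("has_docs"):
        state["comments"].append("governor: missing model documentation")
    return state

def impact_analyzer(state):
    # Surface downstream blast radius from the lineage context
    downstream = state.get("lineage", [])
    state["comments"].append(f"impact: {len(downstream)} downstream dependencies")
    return state

# Optimizer, test reviewer, and self-healing agents omitted for brevity
PIPELINE = [linter, governor, impact_analyzer]

def review(state):
    """Run each specialized agent over shared review state, in order."""
    for agent in PIPELINE:
        state = agent(state)
    return state["comments"]
```

Each agent reads and writes one shared state object, which is the core idea a graph orchestrator like LangGraph formalizes with explicit nodes and edges.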
Finally, their agent logs every action into an audit table, capturing recommendations, outcomes, and workflow behavior for observability and adoption metrics. That log also acts as “memory,” so when a PR gets updated multiple times, their agent can understand what it already recommended and when, instead of repeating itself.
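The "memory" behavior described above amounts to filtering proposed feedback against the audit table. A sketch, assuming a hypothetical row shape with `pr_id` and `recommendation` fields:

```python
def new_recommendations(audit_log, pr_id, proposed):
    """Drop recommendations already posted on earlier revisions of this PR."""
    already = {row["recommendation"] for row in audit_log if row["pr_id"] == pr_id}
    return [rec for rec in proposed if rec not in already]
```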
Then everything shows up where developers already work: as GitHub PR comments.
Zscaler’s multi-agent workflow doesn’t just “review.” It acts.
It can propose fixes, validate them, and present changes that developers can accept quickly.

What developers experience
PR reviews move from slow, manual, and inconsistent to fast, contextual, and mostly automated—without leaving GitHub.
When a developer opens a PR for a dbt model change, dbt CI runs and the multi-agent review workflow kicks off.
If CI fails, it stops. If it passes, the system posts review feedback directly as GitHub comments—so developers get guidance immediately, in the same place they already work.
What changes day-to-day:
- Human-in-the-loop adoption: When the workflow proposes a refactor, developers can merge the optimized code into their branch by simply commenting “Accept.” Less back-and-forth, less waiting, fewer long review cycles.
- Automated reviews with guardrails: Their multi-agent system auto-approves PRs that follow best practices, pass CI checks, and don't need optimization. It falls back to targeted comments when logic is too complex, so it doesn’t produce risky code.
- Impact visibility: It surfaces downstream lineage and exposure context so teams can see what’s affected—models, dashboards, and key metrics—before changes ship.
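The guardrail logic reduces to a small decision function. The inputs and outcome labels below are assumptions for illustration, not Zscaler's actual signals:

```python
def review_outcome(ci_passed: bool, follows_best_practices: bool,
                   needs_optimization: bool) -> str:
    """Decide what the workflow does with a PR after the agents run."""
    if not ci_passed:
        return "blocked"        # CI is the gate; nothing else runs
    if follows_best_practices and not needs_optimization:
        return "auto-approved"  # clean PRs merge without a human reviewer
    return "commented"          # targeted feedback, human in the loop
```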
In the “happy path,” the workflow can deliver meaningful performance gains.
“You can see a 30% improvement in runtime—and this is fully vetted code with before-and-after results of the optimization,” says Rishi Varahagiri. “When a developer raises a pull request, two checks kick off immediately: the dbt CI check and the agent check. Once they complete, it automatically posts review comments on the PR, and developers can merge the optimized code by simply commenting ‘Accept’—so there’s less back-and-forth and no long review cycles.”
This balance, automation where it’s safe and human-in-the-loop where it’s uncertain, is what makes agentic automation scalable.
Quantifiable time savings, faster reviews, and higher data quality
Zscaler’s multi-agent workflow has proven itself as a strategic enabler, and the team has seen measurable efficiency gains:
- 90% reduction in reviewer time, freeing data engineers to focus on higher-impact work.
- High-volume PR handling: In a single quarter, the system reviewed 956 PRs.
- 2,100 engineering hours saved annually: Zscaler projects 2,100 hours of annual time savings, the equivalent of one full-time engineer.
“With our multi-agent PR reviewer and dbt context, we reduced review time by up to 90%. We handle about 900–1,000 PRs per quarter. We actually had 956 PRs reviewed by the agent last quarter,” says Raman. “And even if you assume 30 minutes per PR, that projects to about 2,100 hours saved per year, basically one full-time engineer. AI isn’t just a tool anymore, it’s a collaborator that turns bottlenecks into opportunities.”
Perhaps most importantly, the central data team is no longer a bottleneck. Contributors get faster feedback, governance is consistently enforced, and overall data quality improves as the organization ships faster.
For data teams facing similar governance challenges and wanting to build agentic automation that can truly operate inside a real system, the lesson is clear: feed agents dbt’s structured context so they can make trustworthy decisions. When agents can see lineage, tests, metadata, CI results, and warehouse behavior, governance becomes automatable—and self-service becomes sustainable.
Watch Zscaler’s full session to see how a context-driven, multi-agent PR reviewer reduces reviewer back-and-forth and automates governance in the PR itself.
Explore how dbt turns structured context into AI-powered workflows. Whether you’re accelerating development with dbt Copilot and Agents or building your own agents with the dbt MCP Server, learn more here or book a demo today.