From prompt to PR: sandbox-validated dbt changes before human review

Breakout session | Building agentic data workflows | Basic, Intermediate | Data leaders, Data practitioners | All industries

A five-person data warehouse team. 100+ concurrent requests. 5,000+ active ETLs. We tried the shortcut: let Claude generate SQL, have engineers review it. It failed: code generation without validation just shifts the burden. Engineers still had to mentally execute every model.

The real answer was full-stack sandbox validation. From prompt to PR, every AI-generated model clears four layers before a human sees it: SQL compilation, dbt tests (uniqueness, range, relationships), orchestration via auto-generated Airflow DAG, and lineage integrity in OpenMetadata. All four must pass. Failures loop back to the agent, not the engineer.
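The gate-and-retry loop described above can be sketched in a few lines. This is a minimal illustration, not the speakers' implementation: `LayerResult`, `run_layers`, `validate_until_green`, the `generate` callback, and the `max_attempts` cap are all hypothetical names, and stopping at the first failing layer is an assumption (the abstract only says all four must pass and that failures go back to the agent).

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class LayerResult:
    layer: str          # e.g. "compile", "dbt tests", "orchestration", "lineage"
    passed: bool
    detail: str = ""    # error text handed back to the agent on failure


# A layer takes the candidate model SQL and reports pass/fail.
Layer = Callable[[str], LayerResult]
# The agent takes the prompt plus optional failure feedback and returns SQL.
Generator = Callable[[str, Optional[str]], str]


def run_layers(sql: str, layers: List[Layer]) -> List[LayerResult]:
    """Run layers in order, stopping at the first failure (an assumption)."""
    results: List[LayerResult] = []
    for layer in layers:
        result = layer(sql)
        results.append(result)
        if not result.passed:
            break
    return results


def validate_until_green(
    prompt: str,
    generate: Generator,
    layers: List[Layer],
    max_attempts: int = 3,  # hypothetical cap so the loop always terminates
) -> Optional[str]:
    """Return SQL that cleared every layer, or None if attempts ran out."""
    feedback: Optional[str] = None
    for _ in range(max_attempts):
        sql = generate(prompt, feedback)
        results = run_layers(sql, layers)
        if len(results) == len(layers) and all(r.passed for r in results):
            return sql  # only now would a PR be opened for human review
        failed = results[-1]
        # The failure goes back to the agent, not to an engineer.
        feedback = f"{failed.layer}: {failed.detail}"
    return None
```

In a real setup each layer would wrap an actual check (a compile step, a test run, a DAG run, a lineage lookup); here they are left as plain callables so the control flow stays visible.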

Guardrails are structural: read-only prod service account, macro-enforced dev schema isolation, ephemeral Kyuubi compute per run. In production, 45% of new models carry the "ai-generated" tag.
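To make "macro-enforced dev schema isolation" concrete, here is a rough sketch of the kind of rule such a macro could encode, written in Python rather than dbt's Jinja so it can be run in isolation. The session does not say which macro is used or what the rule is; the target name `prod`, the function `resolve_schema`, and the prefixing scheme are assumptions for illustration only.

```python
from typing import Optional

PROD_TARGET = "prod"  # hypothetical name of the production target


def resolve_schema(
    target_name: str,
    target_schema: str,
    custom_schema: Optional[str],
) -> str:
    """Decide which schema a model is built in.

    Assumed rule: only the production target may use a model's custom
    schema verbatim; every other target is confined to schemas prefixed
    with its own sandbox schema.
    """
    if target_name == PROD_TARGET:
        return custom_schema.strip() if custom_schema else target_schema
    if custom_schema:
        # Non-prod runs land in "<sandbox>_<custom>", never in "<custom>".
        return f"{target_schema}_{custom_schema.strip()}"
    return target_schema


# Usage: an agent run against a sandbox target cannot resolve to "marts".
print(resolve_schema("dev", "sbx_run42", "marts"))
print(resolve_schema("prod", "analytics", "marts"))
```

Because the rule lives in schema resolution rather than in the agent's prompt, generated SQL has no way to opt out of it, which is the sense in which such a guardrail is structural.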

You'll leave with the full-stack sandbox architecture, the four-layer validation blueprint, structural guardrails, and a model for governed AI promotion via explicit PR reviews.
