AI-assisted analytics engineering: Docusign’s framework for scaling dbt unit testing

last updated on May 18, 2026
This guest post comes from Sundar Subramanyam, Lead Data Engineer at Docusign.
At Docusign, we support millions of customers worldwide in managing critical agreement workflows. As our analytics platform scaled to support new product launches and features, ensuring data quality before production data existed became a key challenge.
Traditional dbt data tests (e.g., not_null, unique) rely on existing datasets. However, for new features and evolving pipelines, we needed dbt unit tests—tests that validate logic by mocking input data and asserting expected outputs.
While powerful in theory, unit testing in dbt introduced a practical problem: the effort required to author tests manually did not scale with the complexity of our models.
To address this, we explored a focused question:
Can AI systematically reduce the friction of dbt unit testing?
This led to the development of a structured approach using GitHub Copilot (GPT-4+) that significantly improved both testing efficiency and adoption. We used our own AI tooling to speed up unit test drafting by ~90%, while dbt provided the structure to govern and enforce those tests in CI, making data quality reliable even before production data existed.
The unit testing bottleneck
Unit testing in analytics engineering is critical for validating:
- Complex CASE logic
- Join conditions and fan-out scenarios
- Filtering rules and edge cases
However, the real challenge lies in the setup. For each model, engineers must:
- Analyze SQL logic
- Create mock input datasets
- Manually compute expected outputs
- Debug YAML syntax
In practice, this process took up to 5 hours per complex model, making comprehensive testing difficult to prioritize.
Introducing the AI-assisted dbt unit testing framework

To address this, I developed the AI-Assisted dbt Unit Testing Framework — a structured, human-in-the-loop methodology that leverages generative AI to automate the creation of dbt unit tests.
Rather than treating AI as a replacement for engineers, this framework positions AI as an accelerator for repetitive tasks, while preserving human validation for correctness.
Framework workflow
The framework follows a repeatable, multi-step process:
1. Model input
Engineers provide a dbt model (e.g., dim_customer) containing SQL transformations.
2. AI interpretation
A custom AI workflow parses:
- Column-level transformations
- Joins and filters
- Logical branches (e.g., CASE conditions)
3. Logic summarization
The system generates a structured understanding of:
- Source tables and references
- Output columns
- Transformation rules
4. Human validation
Engineers review and confirm the interpretation before proceeding, ensuring correctness and trust.
5. Test case generation
The framework generates:
- Positive test cases
- Negative test cases
- Edge-case scenarios
The AI focuses heavily on generating synthetic mock data, including:
- Null handling
- Boundary conditions
- Join anomalies
- Temporal edge cases
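To make the null and boundary categories concrete, here is a minimal Python sketch of the kind of mock rows involved. It is a deterministic illustration, not the framework's own generator: `boundary_dates` and `mock_rows` are hypothetical helper names, and the column names are borrowed from the subscription example later in this post.

```python
from datetime import date, timedelta


def boundary_dates(reference: date) -> list:
    """Return the day before, the day of, and the day after a reference date, plus None."""
    one_day = timedelta(days=1)
    return [reference - one_day, reference, reference + one_day, None]


def mock_rows(column: str, reference: date, base_row: dict) -> list:
    """Build one mock row per boundary value, copying the other columns from base_row."""
    rows = []
    for value in boundary_dates(reference):
        row = dict(base_row)
        # Dates are written as ISO strings; None would become a YAML null.
        row[column] = value.isoformat() if value is not None else None
        rows.append(row)
    return rows


# Hypothetical usage: probe a date comparison around 2024-03-01.
rows = mock_rows("renewal_date", date(2024, 3, 1), {"subscription_status": "Active"})
for row in rows:
    print(row)
```

Each returned dictionary maps onto one entry under `rows:` in a unit test fixture, so a reviewer can see at a glance which side of the boundary every row sits on.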
6. YAML output
The system produces a valid dbt unit test file (*_unit_test.yml) with:
- Mock input datasets
- Expected outputs
- dbt-compliant structure
7. Iterative refinement
Engineers refine the generated tests and commit them into the dbt CI/CD pipeline.
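Some of the review in the last two steps is mechanical and can be scripted. The sketch below is one possible pre-commit check, not part of the framework as described: it assumes the generated YAML has already been parsed into a Python dictionary and that fixtures use the dictionary row format. `check_unit_test`, the `stg_customer` source, and the `customer_tier` column are hypothetical names used for illustration.

```python
def check_unit_test(test: dict, model_columns: set) -> list:
    """Return a list of human-readable problems found in one parsed unit test."""
    problems = []
    # A unit test needs a name, a target model, mocked inputs, and expected output.
    for key in ("name", "model", "given", "expect"):
        if key not in test:
            problems.append(f"missing key: {key}")
    # Flag mocked inputs that carry no rows at all.
    for fixture in test.get("given", []):
        if not fixture.get("rows"):
            problems.append(f"no mock rows for {fixture.get('input')}")
    # Every expected column must be a real output column of the model.
    for row in test.get("expect", {}).get("rows", []):
        unknown = set(row) - model_columns
        if unknown:
            problems.append(f"expected columns not in model: {sorted(unknown)}")
    return problems


# Hypothetical draft with one misspelled expected column.
draft = {
    "name": "test_customer_tier",
    "model": "dim_customer",
    "given": [{"input": "ref('stg_customer')", "rows": [{"plan": "Enterprise"}]}],
    "expect": {"rows": [{"customer_tier": "Gold"}, {"customer_teir": "Silver"}]},
}
problems = check_unit_test(draft, {"customer_tier"})
print(problems)
```

A check like this only catches structural slips. Whether "Gold" is the right tier for an Enterprise plan is still a question for the engineer.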
Prompt pattern behind the framework
The core of this workflow was a structured prompt pattern rather than a one-off AI request. The prompt guided the AI through a repeatable sequence:
- Interpret the dbt model logic.
- Identify source references and output columns.
- Summarize the logic and ask the engineer to validate the understanding.
- Generate positive and negative unit test scenarios.
- Create mock input data and expected outputs.
- Ensure the `expect` section matches the model output columns.
- Output the result as a dbt-compliant `<model_name>_unit_test.yml` file.
- Allow the engineer to refine the test cases through feedback.
This structure helped make the workflow repeatable and reviewable, while keeping the engineer responsible for validating business logic and expected outcomes.
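As an illustration, the sequence above could be assembled into a reusable prompt with a few lines of Python. This is a sketch under stated assumptions: the exact wording of the production prompt is not reproduced here, and `build_prompt` is a hypothetical helper, not part of dbt or GitHub Copilot.

```python
def build_prompt(model_name: str, model_sql: str) -> str:
    """Assemble a step-by-step prompt for drafting dbt unit tests for one model."""
    steps = [
        "Interpret the dbt model logic.",
        "Identify source references and output columns.",
        "Summarize the logic and ask the engineer to validate the understanding.",
        "Generate positive and negative unit test scenarios.",
        "Create mock input data and expected outputs.",
        "Ensure the `expect` section matches the model output columns.",
        f"Output the result as a dbt-compliant `{model_name}_unit_test.yml` file.",
        "Allow the engineer to refine the test cases through feedback.",
    ]
    numbered = "\n".join(f"{i}. {step}" for i, step in enumerate(steps, start=1))
    return (
        f"You are helping write dbt unit tests for the model `{model_name}`.\n"
        "Work through these steps in order and pause after step 3 for confirmation:\n"
        f"{numbered}\n\n"
        f"Model SQL:\n{model_sql}\n"
    )


# Hypothetical usage with a placeholder query.
prompt = build_prompt("dim_subscription", "select 1 as id")
print(prompt)
```

Keeping the steps in a list makes the prompt itself reviewable in version control, in the same way as the tests it produces.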
From SQL to test case
The value is easiest to see in how the AI mocks data for a concrete piece of logic.
Model SQL (Snippet):
SQL
CASE
    WHEN subscription_status = 'Active' AND renewal_date < current_date THEN 'Overdue'
    WHEN subscription_status = 'Active' THEN 'Current'
    ELSE 'Inactive'
END AS derived_status
AI-generated unit test:
The framework generates test scenarios such as:
YAML
unit_tests:
  - name: test_derived_status_logic
    model: dim_subscription
    given:
      - input: ref('stg_salesforce')
        rows:
          - {subscription_status: 'Active', renewal_date: '2023-01-01'}  # Scenario 1: Overdue
          # Far-future date so this row stays 'Current' whenever current_date is evaluated
          - {subscription_status: 'Active', renewal_date: '2099-01-01'}  # Scenario 2: Current
          - {subscription_status: 'Pending', renewal_date: '2099-01-01'}  # Scenario 3: Inactive
    expect:
      rows:
        - {derived_status: 'Overdue'}
        - {derived_status: 'Current'}
        - {derived_status: 'Inactive'}
The key advantage: the AI identifies each logic branch and automatically generates test data to validate it.
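When reviewing the expected rows, it can help to mirror the branch logic in a few lines of Python and compute the expected values independently. The sketch below is a hand-written mirror of the CASE expression, not something dbt produces. `derived_status` as a Python function is a hypothetical helper, and `None` stands in for SQL NULL.

```python
from datetime import date


def derived_status(subscription_status, renewal_date, today):
    """Python mirror of the CASE expression; None stands in for SQL NULL."""
    is_active = subscription_status == "Active"
    # In SQL, NULL < current_date is not true, so a NULL renewal_date falls through.
    if is_active and renewal_date is not None and renewal_date < today:
        return "Overdue"
    if is_active:
        return "Current"
    return "Inactive"


# Pin "today" explicitly so the results do not drift as the calendar moves on.
today = date(2024, 6, 1)
scenarios = [
    ("Active", date(2023, 1, 1)),   # past renewal date
    ("Active", date(2099, 1, 1)),   # future renewal date
    ("Active", today),              # boundary: equal to today is not "less than"
    ("Active", None),               # NULL renewal date
    ("Pending", date(2099, 1, 1)),  # any non-Active status
]
results = [derived_status(status, renewal, today) for status, renewal in scenarios]
print(results)
```

Passing `today` as a parameter is a deliberate choice. Because the model compares against `current_date`, a fixture date only counts as "in the future" until the calendar catches up with it, so time-dependent scenarios deserve extra attention during review.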
Impact: 10x productivity and test coverage
The results of this small experiment were immediate and measurable:
- 90% reduction in cycle time: Writing a comprehensive unit test suite dropped from 5 hours to roughly 30 minutes. Engineers no longer start from a blank file; they start with a working draft.
- Increased test coverage: Because testing became easier, engineers tested more. We closed the gaps on edge cases that used to slip through manual review.
- Shift-left quality: We caught complex logic bugs (mismatched joins, bad filters) locally, long before they reached the production dashboards. Implementing unit tests helped us catch at least 5–10 data defects that would otherwise have gone unnoticed.
- Scalable trust: Whether refactoring legacy code or building net-new models for new product and feature launches, we established a consistent baseline of quality without burning out the team.
What worked and what didn’t
Where AI excelled:
- Parsing Jinja and SQL syntax to map logic branches.
- Generating tedious mock data (rows of CSVs) in valid YAML format.
- Identifying edge cases a human might overlook (e.g., "What if this date is null?").
Where humans remain essential:
- Validating the business intent of the logic.
- Ensuring the "expected output" aligns with domain knowledge, not just code patterns.
Industry relevance and adoption potential
The challenges addressed by this framework are not unique to a single organization. Many data teams struggle with:
- Low adoption of unit testing
- High manual effort
- Inconsistent data validation practices
This framework provides a reusable and scalable approach that can be applied across dbt projects and analytics engineering teams.
Looking ahead: From a win to a workflow
This initiative began as a focused experiment but has evolved into a repeatable pattern for integrating AI into analytics engineering workflows.
Future directions include:
- Integrating test generation into CI/CD pipelines
- Generating tests from business requirements (e.g., Jira tickets)
- Expanding the framework to other areas of data engineering
Conclusion
AI does not need to be complex to be impactful.
By addressing a specific bottleneck—unit test creation in dbt—this framework demonstrates how targeted AI applications can deliver measurable improvements in productivity, reliability, and scalability.
The broader takeaway:
“Identify one friction point in your workflow—and use AI to systematically eliminate it.”