Creating reliable data products with analytics engineering

Jun 12, 2025
The need for a mature analytics workflow
Even with cloud scale, technological advances, and growing use of tools, many teams remain stuck with point solutions or fragmented workflows. Consider how, at most organizations:
- Version control for ingestion pipelines is rare.
- Testing and Service Level Agreements (SLAs) for dashboards are nearly non-existent.
- Collaboration often depends on informal, manual processes.
- Incident handling is ad hoc rather than structured.
The result is "data products"—dashboards, models, reports—whose reliability and business fitness are difficult to prove or maintain. For analytics to create value at scale, mature workflows, not just point technologies, are required.
Example
Suppose a marketing manager needs customer conversion metrics updated daily. The data engineering team relies on ad hoc SQL scripts that are run manually and emailed as spreadsheets to the analytics team, who then build dashboards. If a schema change or data quality issue arises upstream, errors propagate: conversion rates may be reported incorrectly for days before anyone notices. There is no auditable process to roll back, test, or track changes, and the business loses trust in the analytics system.
Requirements of a mature analytics workflow
A truly reliable data product arises from a workflow embodying the following characteristics:
- Data and collaboration scale: Can handle growing data volumes and more contributors without process breakdown.
- Accessibility: Enables all relevant personas (engineers, analysts, decision-makers) to contribute.
- Velocity and agility: Supports fast, iterative analysis with minimal process overhead.
- Correctness and validation: Embeds automated tests to ensure data accuracy.
- Auditability: Every change and result is reproducible and traceable.
- Governance: Access, compliance, and usage policies are integrated from the outset.
- Criticality, reliability, resilience: Data products can scale from experiments to business-critical use with confidence; error detection and recovery are systematic, not ad hoc.
Example
Imagine a retailer launching a flash sale campaign. The product team needs rapid insight into which channels are driving sales in real time. With a mature workflow, campaign data ingestion can scale up quickly, analysis can be distributed across multiple analysts, findings are validated by automated tests, and any errors or schema changes are automatically detected, triaged, and communicated. Confidence in the insights stays high, enabling decisive action during the high-stakes campaign.
The Analytics Development Lifecycle (ADLC)
The ADLC is a structured workflow modeled on proven software engineering practices. It emphasizes the continuous, collaborative, and iterative nature of analytics work and applies equally to every "artifact"—data pipelines, models, dashboards, or derived datasets.
The ADLC consists of the following stages:
- Plan
- Develop
- Test
- Deploy
- Operate
- Observe
- Discover
- Analyze
Each stage interacts in a loop, creating a cycle of improvement and adaptation, not a one-way flow.
1. Plan
Planning sets the foundation for reliability and value. It includes:
- Clarifying the business need or hypothesis driving the change.
- Involving relevant stakeholders early.
- Assessing downstream impacts of any changes (e.g., which dashboards depend on a specific model).
- Designing for maintainability, data security, and stakeholder access.
- Chunking large projects into small, manageable iterations.
Example
A financial analyst wants to introduce a "customer lifetime value" metric. The plan involves identifying required data sources, reviewing existing models for possible reuse, evaluating privacy implications, and setting up a process so marketing and product teams can provide feedback before the metric is finalized.
2. Develop
Development is not just about writing code; it also builds collaboration and best practices into the work:
- All business logic is captured as code, regardless of interface (SQL, Python, visual tools).
- Development environments are flexible; contributors can use tools suited to their workflow.
- A style guide enhances consistency, making code maintainable by others.
- Functionality and clarity are prioritized over premature optimization.
- Code review by peers ensures robustness and knowledge sharing.
- Vendor lock-in is minimized by favoring open standards.
Example
Two data modelers, using different interfaces, collaborate on the same data model representing "active users." Thanks to standardized code in source control, reviews, and a shared style guide, the team can efficiently merge improvements without confusion.
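As a minimal sketch of what capturing business logic as code can look like, the shared "active users" definition might live in a version-controlled SQL file; the file, table, and column names here are hypothetical, and the ref() call follows dbt's convention for referencing upstream models:

```sql
-- models/active_users.sql (hypothetical file, table, and column names)
-- Defines "active users" once, in version control, so every contributor
-- builds on the same logic regardless of the interface they use.
with events as (

    select
        user_id,
        event_timestamp
    from {{ ref('stg_events') }}  -- hypothetical upstream staging model

)

select
    date_trunc('day', event_timestamp) as activity_date,
    count(distinct user_id) as active_users
from events
group by 1
```

Because the definition lives in source control, both modelers propose changes through the same review process and merge them without ambiguity.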
3. Test
Testing is the ADLC's backbone for reliability:
- Unit tests: Ensure each function or model behaves as intended.
- Data tests: Validate that actual data conforms to logic and assumptions.
- Integration tests: Catch issues arising from the combination of upstream and downstream components.
Testing is mandatory before promotion to production and is run automatically as part of continuous integration.
Example
An update to the sales pipeline model triggers automated tests that check for referential integrity, realistic sales values, and non-breaking schema changes in downstream dashboards. Any failures block deployment, avoiding disruptions.
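As a sketch of how such a data test can be expressed, the "realistic sales values" assumption can be encoded as a query that returns violating rows; in dbt's singular-test convention, the test passes only when the query returns zero rows. The model name and thresholds below are hypothetical:

```sql
-- tests/assert_sale_amounts_are_realistic.sql (hypothetical singular test)
-- Returns any order whose sale amount is negative or implausibly large;
-- a non-empty result fails the test and blocks promotion to production.
select
    order_id,
    sale_amount
from {{ ref('fct_sales') }}  -- hypothetical sales fact model
where sale_amount < 0
   or sale_amount > 1000000
```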
4. Deploy
Deployment is automated, transparent, and safe:
- Triggered by merging code into a main branch.
- Handles environment promotion (dev → staging → production).
- Does not create user-facing downtime.
- Supports automated, safe rollback in case of unseen errors.
Example
When a new product categorization model is approved, deployment happens automatically upon merge. If an error is detected in production (e.g., a missing category), it is quickly reverted, minimizing business disruption.
5. Operate and 6. Observe
Once live, the system is actively operated and observed:
- Production systems are always on, or have clearly communicated, minimal planned downtime.
- Resilience is designed in: error handling, monitoring, and automated remediation.
- Incidents—data load failures, slow queries, stale reports—are detected and investigated before users notice.
- Key metrics (such as uptime, freshness, latency) are tracked and guide process improvements.
Example
The analytics team for an e-commerce site is notified by monitors that yesterday's transactions failed to load due to an upstream API change. Automated incident tracking and on-call procedures ensure rapid diagnosis and resolution, with zero impact on downstream sales reporting.
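A freshness monitor of the kind that catches this failure can be as simple as a scheduled query comparing the latest load time against the agreed SLA. The table name and 24-hour threshold are hypothetical, and exact date functions vary by warehouse; dbt also supports declaring freshness checks directly on sources:

```sql
-- Hypothetical freshness check: flag the raw transactions table as stale
-- when nothing has loaded within the agreed 24-hour window.
select
    max(loaded_at) as latest_load,
    case
        when max(loaded_at) < current_timestamp - interval '24 hours'
            then 'stale'
        else 'fresh'
    end as freshness_status
from raw.transactions  -- hypothetical source table
```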
7. Discover and 8. Analyze
These final (and looping) stages are where business value is extracted. Discovery involves making all data assets (datasets, dashboards, metrics) easily searchable, accessible, and understandable—removing friction for both analysts and decision-makers.
Analysis builds on these assets, using governed, trusted data to conduct investigations, answer ad hoc questions, iterate on hypotheses, and produce shareable, maintainable outputs.
Key requirements:
- Search and access to all governed data artifacts without bottlenecks.
- Direct feedback, annotation, and improvement loops embedded in the tools.
- Analysis outputs can themselves cycle back into the Plan stage for continued improvement.
- Environments (development, staging, production) are transparent and selectable by users based on their needs.
Example
A supply chain analyst discovers a new anomaly in warehouse returns using a standardized, documented dataset. Her exploratory notebook, once validated and reviewed, becomes a maintained dashboard, with lineage and reproducibility guaranteed by the ADLC workflow.
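A minimal sketch of such an exploratory query, assuming a hypothetical governed returns model, might look like the following before it graduates into a maintained dashboard:

```sql
-- Hypothetical exploration: weekly returned-order counts by warehouse,
-- built on a documented, governed model so results are reproducible.
select
    warehouse_id,
    date_trunc('week', returned_at) as return_week,
    count(*) as returned_orders
from {{ ref('fct_warehouse_returns') }}  -- hypothetical governed model
group by 1, 2
order by return_week desc, returned_orders desc
```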
Stakeholders: collaboration across personas
A mature analytics workflow recognizes that roles are flexible—individuals may put on different "hats" depending on the need:
- Engineer: Builds reusable data pipelines and models.
- Analyst: Explores data, validates hypotheses, produces recommendations.
- Decision-maker: Consumes insights and acts upon them.
The ADLC's greatest value emerges when these roles collaborate seamlessly within the same workflow and tooling. Organizational agility is maximized when hand-offs disappear, and individuals can transition between hats as projects demand.
Example
A small SaaS startup's product leader creates a quick usage metric, validates initial results, and, after peer review, productionizes it for quarterly board reporting—all within the ADLC framework, skipping no quality gates.
Instituting the ADLC: principles and long-term value
To create reliable data products, organizations must treat analytics systems as software systems—inherently collaborative, modular, testable, and auditable. This means:
- Every artifact is versioned, tested, and documented.
- Feedback loops are explicit, encouraging continuous improvement.
- Errors are anticipated; processes for detection, mitigation, and communication are built-in.
- SLAs for data products (availability, correctness, freshness) are defined, measured, and met.
- Governance is not an afterthought, but intrinsic.
Long-term, this creates data products with high trust, low maintenance overhead, and scalability as business stakes rise.
Conclusion
The analytics development lifecycle (ADLC) offers a definitive, end-to-end workflow for building mature, reliable, and value-generating data products. By adopting its principles—drawn from decades of software engineering experience—organizations can align people, processes, and tools, achieving both agility and governance.
The path to data maturity is ongoing—a shared endeavor among practitioners, leaders, and technology providers. By consistently applying the ADLC, companies can close the gap between the promise of analytics and practical, dependable delivery of insights, driving better outcomes for every stakeholder.
Learn more about analytics engineering best practices at getdbt.com/blog
Analytics engineering FAQ
What is analytics engineering?
Analytics engineering is a discipline that combines software engineering principles with data analytics. It focuses on creating reliable, maintainable data products through structured workflows. Analytics engineering implements processes like version control, testing, automation, and governance to ensure data products (dashboards, models, reports) are accurate and trustworthy. It bridges the gap between raw data and actionable insights by applying software development best practices to analytics workflows.
What do analytics engineers do?
Analytics engineers build and maintain data pipelines, models, and other analytics infrastructure while ensuring reliability and scalability. Their key responsibilities include:
- Creating reusable data pipelines and models using code
- Implementing testing frameworks to validate data accuracy
- Setting up version control for analytics assets
- Automating deployment processes
- Establishing monitoring systems to detect issues
- Collaborating with analysts and decision-makers
- Building discoverable, well-documented data assets
- Ensuring governance and compliance requirements are met
- Supporting the entire Analytics Development Life Cycle (ADLC)