When and why to upgrade to dbt Cloud
Here at dbt Labs, we build, maintain, and iterate on two products:
- dbt Core: an open-source framework for transforming data
- dbt Cloud: a managed service that provides Git-integrated code editing, job orchestration, and data quality controls on top of dbt Core’s transformation engine
If you’re an open-source dbt Core user, you may be thinking, “Why would I upgrade to dbt Cloud?” We’re here to tell you when and why to consider a change.
When to upgrade to dbt Cloud
We often work with data engineers who move comfortably through the initial setup of dbt Core and complementary infrastructure. They’re capable of standing up a separate cloud compute service, an orchestration tool to run dbt jobs, and custom scripts to fill other gaps.
The problems start when teams scale and the backlog of requests and incidents begins to grow. Below are common scenarios we see in the field.
Data quality issues
A team of data engineers manages every change in the ELT pipeline. The engineers are highly technical but distant from product, operations, finance, and other business team contexts. On the flip side, the business teams who rely on the data pipeline to make decisions are removed from how the pipeline process actually works.
All of these factors contribute to an endless cycle:
- The engineering team learns of new data use cases (a new product surface, changes to a pricing model, a new website experiment) only when a stakeholder files a ticket or reports a bug.
- Each new issue requires an engineer to learn and gather critical business context, slowing down development.
- The team falls behind on an always-growing backlog of data modeling and transformation requests.
- Data in the pipeline becomes stale, out of step with analytics needs.
Data literacy & scaling up the team
The data engineering team wants to enable other teams (analytics, operations, product) to create or validate data transformations — but finds it slow and resource-intensive to onboard new contributors. The team may struggle due to a combination of these factors:
- User provisioning is complex: Engineers need to support each new user with installs and configuration of locally-stored passwords and environment variables.
- Users are unfamiliar with Git: Engineers need to train contributors who are new to Git, and guide them through a new way of working.
- Documentation is sparse: Pipeline documentation is hard to share or missing altogether, so engineers don’t have reusable materials to help new teammates learn where and how to plug in.
- Lack of automation: Engineers need to develop and maintain a custom CI/CD process to check and integrate new code programmatically — or get slowed down with a manual process to coordinate and merge changes across multiple developers.
Why use dbt Cloud?
dbt Cloud delivers several improvements out of the box to address the scenarios above.
Built-in Git guardrails for collaboration
dbt Cloud’s built-in Git integration makes version control intuitive, enabling teams to coordinate and commit code changes via a user-friendly IDE.
Faster onboarding and less context switching
dbt Cloud simplifies onboarding to dbt by eliminating the need for local installs and configurations.
The platform also consolidates role-based user access, environments, continuous integration, and job deployment into one application and UI. This reduces context-switching for developers and administrators alike, and eases routine maintenance work (like coordinating an update to the latest version of dbt Core).
CI to prevent issues from reaching production
- Automate code checks before merge
- Automate job runs upon pull request
- Limit test runs to modified code only, which shortens the test-and-fix feedback loop and avoids the cost and delay of running unchanged code.
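To make the last point concrete, here is a minimal Python sketch of the idea behind running only modified code: start from the models changed in a pull request and walk the dependency graph to collect everything downstream of them. This is an illustration only, not dbt's implementation; the model names and the `children` mapping are hypothetical. In dbt itself, this kind of selection is expressed with state-based selectors such as `state:modified+`.

```python
from collections import deque


def select_modified_and_downstream(children, modified):
    """Return the modified models plus every model downstream of them.

    children: dict mapping a model name to the models that depend on it.
    modified: iterable of model names changed in the pull request.
    """
    selected = set(modified)
    queue = deque(modified)
    while queue:
        node = queue.popleft()
        for child in children.get(node, ()):
            if child not in selected:
                selected.add(child)
                queue.append(child)
    return selected


# Hypothetical project graph (names are made up for illustration):
#   stg_orders -> orders -> revenue
#   stg_customers -> customers
children = {
    "stg_orders": ["orders"],
    "orders": ["revenue"],
    "stg_customers": ["customers"],
}

to_run = select_modified_and_downstream(children, ["orders"])
print(sorted(to_run))  # ['orders', 'revenue']
```

Only `orders` and `revenue` are selected; the untouched staging models and the `customers` branch are skipped, which is where the time and cost savings come from.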
Logging, alerting, and visual indicators for data quality
dbt Cloud comes with multiple features to ease data governance and improve data quality and transparency across multiple stages in the pipeline:
- The Model Timing dashboard displays the run time for every job run in dbt Cloud, and highlights the top 1% of run durations — making it easy to identify bottlenecks.
- Outbound webhooks trigger real-time notifications to other applications and tools (such as Slack, PagerDuty, or Jira) when certain events take place in dbt Cloud.
- The Metadata API powers integrations with data-monitoring tools, and supports GraphQL queries to diagnose the health of your pipeline.
- Dashboard status tiles give downstream dashboard users a quick, visual way to gauge the freshness and reliability of underlying data.
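If you plan to receive outbound webhooks in your own service, it is common practice to verify that each request really came from the sender before acting on it. The sketch below assumes the sender signs the raw request body with HMAC-SHA256 using a shared secret and sends the hex digest alongside the request. The function name, secret, and payload are hypothetical; check the dbt Cloud webhook documentation for the exact header and signing scheme before relying on this.

```python
import hashlib
import hmac


def is_valid_signature(secret: str, body: bytes, received_signature: str) -> bool:
    """Check a webhook signature (assumed scheme: hex HMAC-SHA256 of the raw body)."""
    expected = hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()
    # compare_digest avoids leaking information through timing differences
    return hmac.compare_digest(expected, received_signature)


# Hypothetical secret and payload, for illustration only
secret = "example-shared-secret"
body = b'{"eventType": "job.run.completed"}'

# Simulate the signature a sender would attach under the assumed scheme
signature = hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()

print(is_valid_signature(secret, body, signature))          # True
print(is_valid_signature("wrong-secret", body, signature))  # False
```

A receiver would typically reject any request that fails this check before forwarding the event to Slack, PagerDuty, or another tool.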
Metrics governance for consistent analytics
Teams can cut down on inconsistent reports and “my number doesn’t match yours” fire drills by using the dbt Semantic Layer to:
- define and govern key metrics in one central place
- use a BI tool integration to propagate metrics to dashboards, enabling stakeholders to filter, slice, and dice on demand.
Advanced security controls & dedicated support
At the Enterprise level, dbt Cloud customers can work with a dedicated account team to structure and optimize their dbt project for their specific needs.
They can also meet higher security and compliance requirements with:
- SSO and role-based access control for fine-grained permissioning in dbt Cloud, using an identity provider of your choice
- Custom deployment regions to ensure compliance with data residency requirements
- Audit logs that provide exportable records of changes to users, groups, jobs, and projects
Data catalog for efficient knowledge management
dbt Cloud provides a shareable website that documents everything in the dbt project — an auto-updating, visual resource for new teammates to learn about pipeline components and dependencies. The website comes out of the box with every dbt Cloud account (no configuration or separate hosting solution needed), and includes:
- Side-by-side text and code documentation for every view, table, and column
- An auto-updating DAG that makes data lineage and dependencies in your project visible and transparent
How to learn more
If we’ve convinced you to give dbt Cloud another look, find out more from the resources below:
- Compare plans and pricing
- Talk to a dbt expert about your specific needs
- Join a bi-weekly demo to get a tour of dbt Cloud features
Last modified on: Sep 22, 2023