dbt was developed in 2016 as a way for anyone who knows SQL to deploy trusted data, faster. Inspired by software engineering best practices like modularity, testing, version control, and documentation, dbt became synonymous with the complete analytics engineering toolkit.
Over the years, as adoption of the open-source solution grew, so too did demand for a faster and more accessible way to develop dbt code in production. dbt Labs released dbt Cloud in 2019, complete with a job scheduler, hosted documentation, and an all-in-one IDE for centralized development and testing. Two years, 1,500 customers, and more than 21,000 community members later, we’re setting our sights on the next frontier – to make dbt a universal standard in the enterprise.
Making dbt Cloud an enterprise standard
Partnering with some of the world’s largest data-driven organizations like JetBlue, NASDAQ, Domain, and HubSpot has revealed the unique needs enterprise data teams face. Through these and many more conversations, we’ve learned critical requirements fall into six main buckets:
- Security and compliance that works
- Accessible tooling to operate, scale, and productionize workflows quickly
- Enterprise-grade performance and reliability
- Trusted insights about data health and architecture
- Platform extensibility for customization
- Interoperability with top tooling in the modern data stack
In support of these six areas of focus, we’ve made significant investments in dbt Cloud. Here’s a look at what we’ve accomplished recently, and what’s in store over the next year:
Security and compliance that works
In the last 18 months, we’ve added new security functionality to dbt Cloud, and expanded our own internal security program:
- Single Sign-On (SSO) and User Provisioning: dbt Cloud supports SSO via SAML. This helps integrate dbt Cloud with your organization’s Identity Provider (IdP) while also giving you an automated way to provision users from your IdP.
- Authorization and Access Control: Group users based on role to enforce least-privilege access and meet compliance requirements.
- [NEW] SOC 2 Type II: In addition to undergoing regular internal security audits and third-party penetration testing, dbt Cloud recently underwent a SOC 2 Type II examination that concluded we meet the required standard of operational effectiveness.
- [NEW] ISO 27001 and ISO 27701 certified: In order to achieve these certifications, dbt Labs had to demonstrate an ongoing and methodical approach to managing and protecting company and customer data.
As we look ahead into 2022 and beyond, we will build upon our foundational security offerings while expanding into more advanced security functionality:
- Customer managed keys: Bring your own encryption key to encrypt any data in dbt Cloud. Control your data while meeting compliance requirements.
- Audit Logging: Maintain a centralized audit trail in dbt Cloud to track changes, troubleshoot issues, and meet compliance requirements.
Tools to operate, scale, and productionize dbt quickly
Over the last 18 months, we’ve worked to make dbt Cloud both faster and more accessible. A big part of that work was a focus on adding more functionality to the dbt Cloud IDE and job scheduler for fast and flexible deployments, but we’ve also expanded our training and support programs to shorten ramp-up time for new users.
- Integrated Development Experience (IDE): Write, run, test, and version control dbt project code all within the browser, with no command-line knowledge required. Visualize and contextualize project lineage with an embedded Directed Acyclic Graph (DAG).
- Job Scheduler: Set up custom schedules to run your production dbt jobs on a certain day, time, or recurring interval right within the dbt Cloud UI or via the API.
- [NEW] Environment Isolation and Promotion: Maintain different environments (staging, testing, production), and treat each differently. Environment variables in dbt Cloud enable teams to build context-aware code for greater control and flexibility.
- [NEW] dbt Rapid Onboarding: We just launched a new training option for our largest deployments. This program helps users ramp up on dbt quickly, using their own data, through dedicated private instruction.
- [NEW] 5 new courses on dbt Learn: We’ve expanded our free, self-guided dbt course offerings to include video tutorials on dbt fundamentals, macros and packages, materializations, analyses and seeds, and project refactoring.
The dbt Slack Community is now 22,000 members strong, and incredibly active. Here you’ll find advice on building data models, data teams, and data trust.
For 2022 and beyond, we plan to invest in the following functionality:
- Code-driven configuration for data workloads: Standardize and automate job scheduling for speed and compliance by expressing job definitions as code (through third-party plugins like Terraform or natively in dbt Cloud) instead of through a UI.
- Expanded integration support: dbt Cloud provides out-of-the-box support for integrations with the top version control tools like GitHub and GitLab, and we’ll be adding support for other Git providers in the coming year.
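To make the idea of code-driven job configuration concrete, here is a minimal sketch in Python. It is not a dbt Cloud or Terraform API: the `JobDefinition` class, its field names, and the `render_jobs` helper are all hypothetical, chosen only to show how job definitions kept in version control could be validated and rendered deterministically before being applied by whatever tooling a team uses.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Iterable, List

# Environment names taken from the examples earlier in this post.
ALLOWED_ENVIRONMENTS = {"staging", "testing", "production"}


@dataclass(frozen=True)
class JobDefinition:
    """Hypothetical job definition; the field names are illustrative only."""

    name: str
    environment: str
    schedule_cron: str  # five-field cron expression, e.g. "0 6 * * *"
    steps: List[str] = field(default_factory=list)

    def validate(self) -> None:
        """Raise ValueError if the definition is obviously malformed."""
        if self.environment not in ALLOWED_ENVIRONMENTS:
            raise ValueError(f"unknown environment: {self.environment!r}")
        if len(self.schedule_cron.split()) != 5:
            raise ValueError(f"expected a five-field cron: {self.schedule_cron!r}")
        if not self.steps:
            raise ValueError(f"job {self.name!r} has no steps")


def render_jobs(jobs: Iterable[JobDefinition]) -> str:
    """Validate every job, then render a stable JSON document for review."""
    jobs = list(jobs)
    for job in jobs:
        job.validate()
    ordered = sorted(jobs, key=lambda job: job.name)
    return json.dumps([asdict(job) for job in ordered], indent=2, sort_keys=True)


if __name__ == "__main__":
    nightly = JobDefinition(
        name="nightly_build",
        environment="production",
        schedule_cron="0 6 * * *",
        steps=["dbt run", "dbt test"],
    )
    print(render_jobs([nightly]))
```

Because the rendered output is sorted and deterministic, changes to a schedule or a step show up as small, reviewable diffs in a pull request, which is the main appeal of defining jobs in code rather than in a UI.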
Enterprise-grade performance and reliability
Performance, reliability, and stability were our biggest product initiatives for 2021, and they will remain an active focus so that dbt Cloud exceeds the performance, scale, and reliability standards of our most complex customers.
- [NEW] dbt Core v1.0 release: The latest version of dbt Core, which powers the dbt Cloud experience, offers 100x faster parsing and easier upgrades with no breaking changes. This is an enormous improvement for organizations with large-scale deployments.
For 2022 and beyond, we are looking to make significant performance and scale improvements in the following areas:
- Job Scheduler: We will continue to make performance improvements to the job scheduler to deliver enterprise-grade speed.
- IDE: Our goal is to ensure IDE performance in dbt Cloud scales as your project and model complexity increase. Even with thousands of models, dbt Cloud will provide responsive and real-time interactions.
- dbt Core: Up next for dbt Core is a focus on even better support for dbt deployments that span multiple projects or packages, along with dedicated resources for each database adapter.
Insights for data health and architecture
dbt provides both a framework for building analytics projects and insight into how those projects and models can be improved for the sake of time, resources, and extensibility. Our investments in this area over the last year have proven to be the most exciting for current enterprise customers:
- [NEW] dbt Metadata API: We launched the dbt Metadata API to ensure every user has the information they need pertaining to job run accuracy, recency, configuration, and table and view structure. This data is exposed via a GraphQL API enabling users and partners to build custom experiences on top.
- [NEW] Dashboard Status Tiles: The dbt Metadata API enables users to drop a tile directly into their dashboard that indicates data freshness and quality.
- [NEW] Model Bottlenecks (Beta): Quickly identify long-running models that are ripe for refactoring or rescheduling. This helps you prioritize refactoring work, or reduce run cadence, to save time and resources and speed up development.
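The idea behind spotting model bottlenecks can be sketched in a few lines of Python. This is an illustration only, not how dbt Cloud implements the feature: the input shape (pairs of model name and execution time in seconds) and the `find_bottlenecks` function are assumptions, and the model names are made up. In practice the timings would come from run metadata.

```python
from typing import List, Sequence, Tuple

# Hypothetical input shape: (model name, execution time in seconds).
ModelTiming = Tuple[str, float]


def find_bottlenecks(
    timings: Sequence[ModelTiming], top_n: int = 3
) -> List[Tuple[str, float, float]]:
    """Return the top_n slowest models with their share of total runtime.

    Each result is (model name, seconds, fraction of total seconds).
    """
    total = sum(seconds for _, seconds in timings)
    ranked = sorted(timings, key=lambda item: item[1], reverse=True)[:top_n]
    return [
        (name, seconds, seconds / total if total else 0.0)
        for name, seconds in ranked
    ]


if __name__ == "__main__":
    sample = [
        ("stg_orders", 12.0),
        ("fct_orders", 240.0),
        ("dim_customers", 48.0),
    ]
    for name, seconds, share in find_bottlenecks(sample, top_n=2):
        print(f"{name}: {seconds:.0f}s ({share:.0%} of run time)")
```

Ranking by share of total runtime, rather than by raw seconds alone, makes it easier to judge whether refactoring a single model would noticeably shorten the whole run.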
The Metadata API provides a rich and powerful framework for understanding your data and your dbt Cloud usage. In 2022 and beyond, we will build on this foundational work to enable new experiences:
- Metadata API functionality: We have been investing in building out our Metadata API functionality. More NodeTypes, fields, and historical information are coming in 2022!
- Data Discoverability: dbt Docs are used by organizations today to promote transparency into data assets. In 2022, we will be further investing in helping teams discover and understand the state of their data.
Platform extensibility for customization
Our goal is to provide you with a rich set of platform tools to help you customize dbt Cloud deployments to suit the unique needs of your organization:
- dbt Cloud APIs: The dbt Cloud API allows you to enqueue runs from a job, poll for run progress, and download artifacts after jobs have completed running.
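The enqueue, poll, and download flow described above can be sketched as follows. This is a minimal outline under stated assumptions rather than a client for the real API: the three callables (`trigger_job`, `get_run`, `fetch_artifact`) are hypothetical stand-ins for HTTP calls you would write against the dbt Cloud API documentation, and the string status labels are illustrative, not the API's actual status values.

```python
import time
from typing import Any, Callable, Dict

# Illustrative status labels; check the API docs for the real values.
TERMINAL_STATES = {"success", "error", "cancelled"}


def wait_for_run(
    get_run: Callable[[int], Dict[str, Any]],
    run_id: int,
    poll_seconds: float = 10.0,
    timeout_seconds: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll get_run(run_id) until the run reaches a terminal state."""
    waited = 0.0
    while True:
        run = get_run(run_id)
        if run["status"] in TERMINAL_STATES:
            return run
        if waited >= timeout_seconds:
            raise TimeoutError(f"run {run_id} did not finish in {timeout_seconds}s")
        sleep(poll_seconds)
        waited += poll_seconds


def run_job_and_fetch(
    trigger_job: Callable[[int], int],
    get_run: Callable[[int], Dict[str, Any]],
    fetch_artifact: Callable[[int, str], Any],
    job_id: int,
    artifact_path: str = "manifest.json",
    **poll_options: Any,
) -> Any:
    """Enqueue a run, wait for it to finish, then download one artifact."""
    run_id = trigger_job(job_id)  # hypothetical: returns the new run's id
    run = wait_for_run(get_run, run_id, **poll_options)
    if run["status"] != "success":
        raise RuntimeError(f"run {run_id} finished with status {run['status']}")
    return fetch_artifact(run_id, artifact_path)
```

Passing the HTTP operations in as callables keeps the orchestration logic independent of any particular HTTP library, and makes it straightforward to exercise the polling and error handling with fakes before pointing it at a real account.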
For 2022 and beyond, we will continue to invest in adding more platform services to dbt Cloud:
- API Coverage and Docs: In 2022 we plan to bring more stability and consistency to our API endpoints while expanding support for more objects and operations.
- Development flexibility: Make dbt Cloud the preferred development environment for your entire team (analysts, engineers, and data scientists) whether they prefer the command line or IDE.
- Make dbt metrics-aware: Define metrics in dbt projects and encode crucial business logic in tested, version-controlled code. Further, these metrics definitions can be exposed to downstream tooling to drive consistency and precision in metric reporting.
- Webhooks: Share data across disparate systems with webhooks that allow you to subscribe to and act on specific events that happen in dbt Cloud. For example, receive a Slack notification when a new job is triggered.
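As a rough illustration of the subscribe-and-act pattern behind webhooks, here is a small Python sketch. Since webhooks are described here as planned functionality, everything about the event is hypothetical: the `job.triggered` event type, the `job_name` field, and the `EventRouter` class are assumptions made for the example. The only external detail relied on is that Slack incoming webhooks accept a JSON body with a `text` field.

```python
from collections import defaultdict
from typing import Any, Callable, DefaultDict, Dict, List

Event = Dict[str, Any]
Handler = Callable[[Event], Any]


class EventRouter:
    """Route incoming webhook events to handlers subscribed by event type."""

    def __init__(self) -> None:
        self._handlers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, event_type: str, handler: Handler) -> None:
        self._handlers[event_type].append(handler)

    def dispatch(self, event: Event) -> List[Any]:
        """Call every handler registered for this event's type."""
        handlers = self._handlers.get(event["type"], [])
        return [handler(event) for handler in handlers]


def slack_message_for_job(event: Event) -> Dict[str, str]:
    """Build a Slack incoming-webhook payload for a (hypothetical) job event."""
    return {"text": f"dbt Cloud job '{event['job_name']}' was triggered."}


if __name__ == "__main__":
    router = EventRouter()
    # "job.triggered" is a made-up event name used for illustration.
    router.subscribe("job.triggered", slack_message_for_job)
    print(router.dispatch({"type": "job.triggered", "job_name": "nightly_build"}))
```

In a real receiver, the payload returned by the handler would be posted to a Slack webhook URL; the sketch stops short of that so it stays runnable without network access.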
Interoperability with top tooling in the modern data stack
The modern data ecosystem is a complex array of interdependent tools. We recognize that for dbt to work well at scale, its ability to integrate effectively with the rest of the modern data stack is key.
That’s why we’ve worked closely with our partners to build out a considerable breadth of product integrations. Increasingly, dbt has emerged as the de facto standard for data transformation in the modern data stack.
Today, dbt integrations or adapters exist for all the following:
- Data Platforms: Snowflake, Databricks (Spark), BigQuery, Redshift. This year, dbt Labs was recognized as a Snowflake premier partner, launched on Snowflake Partner Connect, ran a joint workshop with Snowflake that drew over 2,000 attendees, and more. Meanwhile, a new, dedicated dbt-databricks adapter is now in public preview, and dbt will be available in the coming months on Databricks Partner Connect.
- Business Intelligence: Mode, ThoughtSpot, Hex. Integrations with our BI partners enable a new way of working for data practitioners, increasing the amount of context they have access to within their analysis tooling of choice.
- Operational Analytics: Hightouch, Census. dbt equips more than just data practitioners with higher quality data; with the help of Reverse ETL tools, business users across the organization have access to reliable, transformed data in the tools they use every day.
In addition, dbt Labs has established partnerships with over 50 data consulting and service providers across the world. These partnerships include training, enablement, and more so that when customers choose to work with external service providers they’re able to do so with experts providing the highest possible level of dbt support.
We’re grateful to every user who has joined our mission to create and disseminate organizational knowledge. Thanks to their guidance and support, we truly believe dbt is well on its way to becoming the default choice for every modern data team. It’s been an exciting few years, and 2022 has much more in store!
Last modified on: May 19, 2022