Organizations are standardizing on dbt as a central data transformation framework that allows the entire data team to collaborate more effectively. It’s proven to be a robust and flexible workflow – we’ve seen increasing adoption of dbt to support data science use cases in addition to analytics ones.
Databricks, in turn, offers an open, unified lakehouse on which data teams can execute the entirety of their data analytics, engineering, and data science workloads, removing the data silos that traditionally separate and complicate them. Used together, the two let teams develop, test, and deploy data models in dbt Cloud while centralizing their data on the highly performant Databricks platform and executing against it with Databricks SQL’s Photon engine.
This complementary fit has been a strong foundation for our partnership with Databricks since last year, and we’re thrilled to share some new developments in that partnership, including new product and community experiences.
dbt Cloud is now on Databricks Partner Connect
The experience of getting started with dbt Cloud on Databricks just got even simpler.
Through Databricks Partner Connect, Databricks customers can now automatically connect and experience a trial of dbt Cloud in just a few clicks directly through the Databricks interface. This allows users to quickly get going by significantly reducing the amount of configuration required – all while still benefiting from a secure and governed connection.
It has never been easier or faster for Databricks users to try dbt Cloud on the lakehouse.
This exciting product development follows closely after the Databricks engineering team led the development and release of a new, dedicated dbt-databricks adapter in close collaboration with dbt Labs. Since its initial release, we’ve received invaluable feedback from the dbt Community and Databricks customers, allowing the teams to further improve the end-to-end experience.
The open source dbt-databricks release, built on the existing dbt-spark adapter, brings several important benefits, including a simplified installation and connection process. Recent optimizations include using Delta tables by default and using merge for incremental models and snapshots. Over time, users will benefit from a roadmap of features that leverage Databricks-specific capabilities, such as faster metadata and docs generation via the Databricks catalog features.
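For readers new to merge-based incremental models, here is a rough conceptual sketch in plain Python of what a merge does on each run: incoming rows that match an existing row on a unique key replace it, and the rest are appended. This is only an illustration of the idea, not the adapter's implementation; the function name `merge_rows`, the `order_id` key, and the sample rows are all hypothetical.

```python
from typing import Dict, Iterable, List

Row = Dict[str, object]


def merge_rows(target: List[Row], updates: Iterable[Row], unique_key: str) -> List[Row]:
    """Upsert `updates` into `target`, matching rows on `unique_key`.

    Hypothetical helper: mimics merge semantics in memory and returns a
    new list, leaving `target` untouched.
    """
    # Index existing rows by key (copies, so the input is not mutated).
    merged = {row[unique_key]: dict(row) for row in target}
    for row in updates:
        # Matched key: the existing row is replaced. New key: the row is added.
        merged[row[unique_key]] = dict(row)
    return list(merged.values())


# Hypothetical data: the table as it stands, and a new batch to process.
existing = [
    {"order_id": 1, "status": "placed"},
    {"order_id": 2, "status": "placed"},
]
new_batch = [
    {"order_id": 2, "status": "shipped"},  # matches an existing row
    {"order_id": 3, "status": "placed"},   # brand-new row
]

result = merge_rows(existing, new_batch, unique_key="order_id")
```

The practical upshot is that only the new batch needs to be processed on each run, rather than rebuilding the whole table.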
We’re happy to share that Databricks customers will be able to use the latest release with dbt Cloud by the end of this quarter.
Data teams are buying in
Since introducing Databricks support in dbt, we’ve seen hundreds of organizations across industries such as technology, financial services, media, manufacturing, and industrials use dbt with Databricks for their analytics workloads.
Felippe Felisola Caso, Business Analytics Manager at Loft, a Brazilian prop-tech company, had this to say:
“dbt running on Databricks has made modeling accessible directly to business analysts. It all lives in one place and it’s all access controlled, so we don’t have to worry about writing to a separate data warehouse or a separate cloud… Having everyone in the same environment and accessing the same version of the same data, every time, is huge.”
An important indicator we pay close attention to is what members of the dbt Community find exciting, and on that front we’re pleased to say we have seen over 1,000 members join the Databricks and Spark channel within the dbt Community Slack.
Strategic investment in dbt Labs
Databricks Ventures participated in dbt Labs’ recent funding round as one of our strategic investors. This investment is a testament both to the proliferation of dbt in the ecosystem and to how the two companies have aligned on a vision to enable data teams working with the modern data stack.
“Like the dbt Labs team, we believe that all data users should have access to the same powerful tools and workflows used by engineers,” said Ali Ghodsi, CEO of Databricks. “With Databricks SQL, we’re now enabling data warehouse workloads. And with dbt, analysts are able to access Databricks machine learning and data science capabilities. We’re looking forward to deepening our partnership with dbt Labs over the coming years as, together, we expand what’s possible in the modern data stack.”
Getting started with dbt Cloud and Databricks
On April 20th, the Databricks and dbt Labs teams are hosting a joint hands-on workshop. It will give practitioners a detailed look at what a workflow using these two solutions together looks like in practice, allowing them to follow along and set up their own projects. There will also be space to interact directly with experts from dbt Labs and Databricks.
Sign up here to take part.
Showcasing a shared vision
Meanwhile, Databricks Data & AI Summit 2022 in June is set to be one of the largest data-oriented conferences of the year, and dbt Labs is excited to be a Diamond Sponsor. We look forward to engaging with the data community at our booth and at sessions around the conference. Be sure to keep an eye on the agenda, and catch some of dbt Labs’ several talks – including CEO Tristan Handy’s at the Day 1 Keynote.
In addition, Databricks will have an active presence at Coalesce 2022, the annual analytics engineering conference hosted by dbt Labs. Details on this will be forthcoming in the next few weeks.
Last modified on: Oct 11, 2022