Why dbt and Databricks
dbt works on top of your Lakehouse to provide analytics teams access to a central environment for collaborative data transformation. Now anyone on your data team who knows SQL can collaborate on end-to-end transformation workflows in Databricks.
Maintaining separate architectures for data analytics, data engineering, and ML workflows multiplies complexity and compounds cost. With dbt, Delta Lake, and Databricks SQL, your entire data org can now work out of the same platform, eliminating the need for redundant infrastructure.
A familiar analytics experience
dbt and Databricks SQL provide a SQL-based development experience your analytics team is already familiar with, so they can bring their workflow to Databricks without missing a step. Data engineers and data scientists can continue using their preferred tooling, powered by Delta Lake, in the same Lakehouse.
Reduce data bottlenecks
dbt empowers analytics teams to build and troubleshoot production-grade transformation pipelines on their own, within clear guardrails. This frees data engineers to focus on higher-leverage projects, confident that the underlying architecture remains secure.
Automate documentation and testing
dbt uses Apache Spark SQL commands to automatically populate documentation and depict data lineage, which is hosted in dbt Cloud and accessible to anyone in the organization. Pre-configured and custom tests within dbt help verify data assumptions and prevent broken code from ever reaching production.
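As an illustration, such tests are typically declared alongside models in a YAML file; a minimal sketch (the model and column names here are hypothetical):

```yaml
# models/schema.yml -- hypothetical model and column names, for illustration
version: 2

models:
  - name: orders
    description: "One row per order, materialized on Delta Lake"
    columns:
      - name: order_id
        description: "Primary key for the order"
        tests:
          - unique        # built-in test: no duplicate keys
          - not_null      # built-in test: no missing keys
      - name: status
        tests:
          - accepted_values:
              values: ['placed', 'shipped', 'completed', 'returned']
```

Running `dbt test` executes these checks as SQL against the warehouse, and `dbt docs generate` builds the browsable documentation and lineage graph from the same metadata.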
All the benefits of open source
dbt, Delta Lake, and Apache Spark are all open-source projects with broad community adoption and support. This ensures ongoing innovation, eliminates the risk of lock-in, and provides an invaluable source of learning opportunities for data practitioners.
The Analytics Engineering Workflow
With dbt, analytics teams work directly on the Lakehouse to produce trusted datasets for Business Intelligence use cases.
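For a sense of what that workflow looks like, a transformation in dbt is simply a SQL model; a minimal sketch, assuming hypothetical upstream staging models:

```sql
-- models/orders_enriched.sql -- hypothetical model, for illustration
-- dbt compiles this SELECT and materializes the result as a table or view
-- on Databricks; {{ ref() }} wires up dependencies and powers lineage.

select
    o.order_id,
    o.ordered_at,
    c.customer_name,
    o.amount
from {{ ref('stg_orders') }} as o
join {{ ref('stg_customers') }} as c
    on o.customer_id = c.customer_id
```

`dbt run` executes the compiled SQL on Databricks SQL, and downstream BI tools query the resulting dataset directly.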