Databricks + dbt

Close the gap between data analysts and data scientists with the dbt Cloud <> Databricks Spark integration. This integration enables data analysts to build, test, and deploy data transformations for structured and unstructured datasets within a single unified analytics platform.

Why Databricks + dbt

1. Unify Teams & Tech: Maintaining separate data workflows multiplies infrastructure and divides teams. Databricks + dbt lets analysts model complete datasets on the same platform trusted by their data science counterparts.

2. Go Big: Machine learning and AI datasets can be extremely large, making them difficult to clean and query consistently. By combining Databricks and dbt Cloud, organizations can apply analytics best practices like version control, testing, scheduling, and documentation without sacrificing speed or reliability.

3. Stay Flexible: dbt and Spark are both open source projects with a number of unique contributors. By putting open source solutions at the heart of your ecosystem, you benefit from continuous innovation, community support, and a much higher degree of flexibility than closed-source alternatives.

Learn more about Databricks

dbt and Databricks are a great couple. Spark has the power I need to process ridiculous volumes of data while dbt helps structure pipelines using software development best practices. Together they improve data quality and confidence in a way that’s much more accessible for data analysts everywhere.

Fokko Driesprong
Principal Code Connoisseur, GoDataDriven