
Getting started with Data Vault on AutomateDV and dbt Cloud

Jun 21, 2024


This is a guest post authored by Alex Higgs, AutomateDV product manager at Datavault.


Learn how to build a Data Vault in an on-demand walkthrough for scalable data architecture.


If you are looking for a trusted and reliable way to streamline your data warehouse development, the combination of AutomateDV, dbt Cloud, and Data Vault might be your solution.

The Data Vault method is a proven data warehousing methodology for building reliable and integrated data warehouses on your data platform. It is an open-source method available to anyone. One of the key advantages of Data Vault is that it is pattern-based and therefore lends itself to automation.

AutomateDV is a dbt package providing Data Vault automation to dbt.

In this AutomateDV quickstart guide, you will learn what is required for a successful project with AutomateDV, the Data Vault automation tool built on dbt.

What is a Data Vault?

Data Vault provides an approach for deploying enterprise data warehouses that rests on three core pillars: architecture, modeling, and method. Importantly, it is well suited to Agile implementation, which is vital for the rapid scalability required by modern organizations.

Successful Data Vault implementations focus on integration at the business concept level rather than being source-system driven, so developing some understanding of Data Vault modeling and standards helps prevent wasted time and false starts.

Why use Data Vault?

Long-term historical storage: Data Vault supports storage of data in an integrated way from multiple operational systems for the history of your data and business. All the data, all the time, within scope.

Auditing and Traceability: Data Vault provides built-in auditing and traceability of data over time, which is vital for sectors such as finance, healthcare, education, and more. Get access to historical insights and understand what you knew, and when you knew it, as a result of Data Vault’s carefully considered design.

Optimized for load: The Data Vault model is designed to optimize loading time by supporting highly-parallel loading, which is crucial for handling large datasets, multiple source systems and real-time feeds. This also reduces the time to insight for the business.

Scalability: The modular structure of Data Vault lets it scale as your business grows and adapt rapidly to changing business requirements. Build a data warehouse with agility.

Single version of truth: By separating business rules from the raw data, Data Vault ensures a single version of truth, enhancing data trustworthiness and auditability. This reduces the risk of business transformations being calculated inconsistently and increases understanding of, and trust in, how metrics are calculated.

How do dbt Cloud and Data Vault work together?

Enhanced Visibility and Governance: dbt and AutomateDV work well together. dbt Mesh features such as data security, lineage, contracts, and model versioning support many of the paradigms that are also built into Data Vault. All of this is defined in files, so it is tracked in version control as well.

Scalability and Efficiency: AutomateDV on dbt Cloud supports scalable Data Vault components and has been tried and tested in larger companies and production environments. Combined with enterprise features in dbt Cloud, it provides effective management of extensive Data Vault projects.

Higher Standards and Reliability: The dbt ecosystem of packages and features provides a solid foundation for delivering Data Vault projects consistently and to a high standard, with less maintenance overhead. dbt Cloud streamlines development, ensuring speed without sacrificing quality. Paired with Data Vault's pattern-based approach, it lets developers focus on business needs, not just technical tasks.

What is AutomateDV?

AutomateDV is an open-source tool that, when combined with dbt Cloud, provides Data Vault templates to your dbt project. The dbt package allows developers to automate many of the tasks involved in setting up and maintaining a Data Vault, making it easier for data engineers to manage their projects.

Don’t just take our word for it, though. Organizations like McDonald’s Nordic, Betway, and NHS Digital have used AutomateDV to develop and maintain large-scale data warehouse platforms, allowing them to integrate many disparate systems and meet business needs faster than with a traditional approach.

AutomateDV is a package for dbt, available on the dbt package hub.
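To make "pattern-based" a little more concrete, here is a minimal Python sketch of the idea behind a hub load: the same insert-only routine works for any business concept, and only the metadata (which column holds the business key, where the data came from) changes. This is an illustration of the pattern only, not how AutomateDV is implemented. The function name `load_hub`, the in-memory dictionary standing in for a hub table, and the field names are all hypothetical.

```python
from typing import Any


def load_hub(
    hub: dict[Any, dict[str, str]],
    staged_rows: list[dict[str, Any]],
    key_column: str,
    load_ts: str,
    record_source: str,
) -> list[Any]:
    """Insert-only hub load (hypothetical sketch).

    Adds one hub entry per business key that is not already present and
    returns the keys inserted on this run. Existing entries are never
    updated, so their original load timestamp and source are preserved.
    """
    inserted = []
    for row in staged_rows:
        business_key = row.get(key_column)
        # Skip rows without a business key and keys the hub already holds.
        if business_key is None or business_key in hub:
            continue
        hub[business_key] = {"load_ts": load_ts, "record_source": record_source}
        inserted.append(business_key)
    return inserted


# The same routine serves two different business concepts;
# only the metadata passed in differs.
customer_hub: dict[Any, dict[str, str]] = {}
order_hub: dict[Any, dict[str, str]] = {}

customers = [{"customer_id": "C1"}, {"customer_id": "C2"}, {"customer_id": "C1"}]
orders = [{"order_id": 1001}, {"order_id": 1002}]

new_customers = load_hub(customer_hub, customers, "customer_id", "2024-06-21", "crm")
new_orders = load_hub(order_hub, orders, "order_id", "2024-06-21", "shop")
```

Because the routine only ever inserts keys it has not seen, running it again over the same staged rows adds nothing. That repeatability is what makes this kind of load a good candidate for a reusable template.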

Recommended content

Here are several books that can help you get started with Data Vault and AutomateDV, including:

We also recommend the following online resources:

Recommended training

Before diving into AutomateDV, we recommend that you familiarize yourself with the core concepts of Data Vault and undergo AutomateDV training. This will ensure you have a solid foundation to work from.

AutomateDV documentation

Once you’ve gone through the recommended content and training, it’s time to put your knowledge into practice. A worked example can help you understand how to apply what you’ve learned.

The AutomateDV ReadTheDocs site is a comprehensive resource that provides detailed information about the tool.

What’s included in the AutomateDV documentation?

  • Worked example – Demonstrates AutomateDV by guiding you step by step through developing a Data Vault based on the Snowflake TPC-H dataset, using pre-written dbt models built on AutomateDV templates.
  • Best practices – Outlines the standards for using AutomateDV, including hashing, loading, and NULL handling.
  • Tutorials – Provides a detailed understanding of the Data Vault concepts which AutomateDV supports, and how to use AutomateDV to create a Data Vault.
  • Much more – Explore additional resources to deepen your understanding and proficiency with AutomateDV and Data Vault.
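As a rough illustration of why the best practices cover hashing and NULL handling together, the Python sketch below builds a hash key from one or more business key parts. It is a conceptual sketch, not AutomateDV's implementation. The MD5 algorithm, the `||` delimiter, the `^^` placeholder, and the function name `hash_key` are assumptions chosen for illustration, so check the AutomateDV documentation for the actual options and defaults.

```python
import hashlib

# Assumed values for illustration only; not taken from the article.
NULL_PLACEHOLDER = "^^"  # stands in for a missing or empty key part
DELIMITER = "||"         # separates the parts of a composite key


def hash_key(*parts: object) -> str:
    """Return an upper-case MD5 hex digest for the given business key parts.

    Each part is trimmed and upper-cased so that cosmetic differences
    between source systems do not produce different keys. Missing or
    empty parts are replaced with a placeholder so they still occupy
    their position in a composite key.
    """
    normalised = []
    for part in parts:
        text = "" if part is None else str(part).strip().upper()
        normalised.append(text if text else NULL_PLACEHOLDER)
    joined = DELIMITER.join(normalised)
    return hashlib.md5(joined.encode("utf-8")).hexdigest().upper()


# Whitespace and casing differences collapse to the same key.
same_key = hash_key(" abc ") == hash_key("ABC")

# The delimiter keeps ("A", "B") distinct from ("AB",).
distinct_key = hash_key("A", "B") != hash_key("AB")
```

Without a delimiter and an explicit placeholder, different combinations of key parts could concatenate to the same string and therefore the same hash, which is why these details are usually standardised once rather than decided model by model.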

Get started with Data Vault in this on-demand walkthrough by AutomateDV and dbt Labs.

Last modified on: Oct 15, 2024
