
60 sources and counting: Unlocking microservice integration with dbt and Data Vault from Coalesce 2023

Team members from Guild Education discuss their implementation of Data Vault, a data warehousing model.

"Data Vault brought us that process that we needed to help our employees build the objects that they needed in a common way..."

- Brandon Taylor

Brandon Taylor and Rebecca DeBerry from Guild Education discuss the implementation of Data Vault, a data warehousing model, for managing microservice integration in an organization. They explore the benefits of Data Vault in handling schema drift, tracking point-in-time data, and managing data from multiple sources. Brandon and Rebecca also outline the steps to implementing Data Vault and share their experience of managing the changeover to this new architecture.

Adopting the Data Vault model increases data reliability and flexibility

Guild Education faced several challenges in managing a microservice environment with rapidly changing schemas. To address them, the company moved to the Data Vault model, which gives it more flexibility and reliability even as its source systems keep changing.

Brandon explains, "Data Vault really helps with trust because we can show the business if a data point changes…We can show them upstream: ‘Well this is what the data looked like yesterday.’" By using Data Vault, Guild Education was able to keep track of the changes in their data over time, increasing credibility and accountability.
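The "what the data looked like yesterday" lookup that Brandon describes depends on history never being overwritten. Below is a minimal Python sketch of that idea, assuming an in-memory satellite keyed by a hashed business key. The names (`Satellite`, `SatelliteRow`, `hash_diff`, `load_ts`) are illustrative assumptions, not Guild's implementation, which the talk describes only as dbt models on Snowflake.

```python
from dataclasses import dataclass
from datetime import datetime
import hashlib
import json


def _hash(value: str) -> str:
    # MD5 is used purely for illustration; any stable hash would do.
    return hashlib.md5(value.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class SatelliteRow:
    hub_key: str        # hash of the business key
    load_ts: datetime   # when this version was loaded
    hash_diff: str      # hash of the descriptive attributes
    attributes: dict


class Satellite:
    """Insert-only history of descriptive attributes (hypothetical sketch)."""

    def __init__(self):
        self._rows = []

    def as_of(self, business_key, ts):
        """Return the latest version loaded at or before `ts`, or None."""
        hub_key = _hash(business_key)
        candidates = [
            r for r in self._rows if r.hub_key == hub_key and r.load_ts <= ts
        ]
        return max(candidates, key=lambda r: r.load_ts, default=None)

    def load(self, business_key, attributes, load_ts):
        """Append a new version only if the attributes changed.

        Assumes loads arrive in timestamp order. Returns True when a row
        was inserted. Existing rows are never updated or deleted.
        """
        hash_diff = _hash(json.dumps(attributes, sort_keys=True))
        latest = self.as_of(business_key, load_ts)
        if latest is not None and latest.hash_diff == hash_diff:
            return False
        self._rows.append(
            SatelliteRow(_hash(business_key), load_ts, hash_diff, dict(attributes))
        )
        return True
```

Because `load` only appends, calling `as_of` with yesterday's timestamp returns the version that was current then, even after newer versions have been loaded.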

Rebecca adds, "Data Vault allowed us to incorporate ever-changing upstream data without needing to rebuild our models... We can compare future changes and be a little bit more future-proof, which has been huge for us." Upstream changes can be absorbed as they arrive instead of triggering a rebuild.
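One common way a Data Vault absorbs a new upstream field is to leave existing satellites untouched and land the new attributes in an additional satellite. The sketch below illustrates that pattern in Python under stated assumptions: the function names, the satellite names, and the idea of tracking column ownership in a dictionary are all hypothetical, and the talk does not say how Guild handles this step.

```python
def route_attributes(record, satellites):
    """Split one source record across satellites by column ownership.

    `satellites` maps a satellite name to the set of columns it already
    holds. Columns that no satellite owns are returned separately.
    """
    routed = {name: {} for name in satellites}
    unassigned = {}
    for column, value in record.items():
        owner = next(
            (name for name, cols in satellites.items() if column in cols), None
        )
        if owner is None:
            unassigned[column] = value
        else:
            routed[owner][column] = value
    return routed, unassigned


def absorb_drift(record, satellites, new_satellite_name):
    """Register unseen columns under a new satellite.

    Existing satellite definitions are not modified, so models built on
    them do not need to be rebuilt when the source adds a field.
    """
    routed, unassigned = route_attributes(record, satellites)
    if unassigned:
        satellites[new_satellite_name] = set(unassigned)
        routed[new_satellite_name] = unassigned
    return routed
```

For example, if a source record gains a `preferred_language` field, `absorb_drift` routes the known columns to the existing satellite and places the new field in a separately named one.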

Implementing Data Vault requires strong collaboration and open communication with business stakeholders

"The reality is we still need to ensure that our existing data warehouse can meet business needs, so it was really important to our business leaders to say we can do this while also not having any sort of issues." - Rebecca DeBerry

Implementing the Data Vault model isn't a straightforward process. It requires strong collaboration and open communication with stakeholders across the company, both of which are crucial to understanding the key business objects and processes that need to be modeled in the Data Vault.

Brandon explains, "I've been there, and it's really, really hard to get time with the higher-ups in the organization, like, ‘Hey! If we want to do [Data Vault] right, you need someone who can talk to you.’" Getting that time helps ensure that the Data Vault model accurately reflects the business's key objects and processes. He also described creating presentations and showing them to stakeholders and executives to build a grassroots understanding of Data Vault.

By keeping the business stakeholders informed, Brandon and Rebecca were able to maintain trust during implementation.

Using a combination of various tools and resources makes Data Vault implementation more effective

"We have dbt on every side of the orchestration pipeline and it's organized with a nice folder structure and the folder structure matches the schemas that we have in our warehouse."

- Rebecca DeBerry

Guild Education uses several tools to implement Data Vault: dbt for model development, Snowflake for data warehousing, and Amazon for automated data ingestion and upstream data contracts.

Brandon underscores the importance of these tools, saying, "We leverage dbt, and we have a dbt project that's managing our existing warehouse." These tools allow Guild Education to manage and process their data effectively.

Rebecca adds, "We have a data ingestion pipeline that runs data contracts upstream that's also automated and running in Amazon as well." This allows them to anticipate and manage the impact of upstream changes on their data.
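The talk summary does not describe what Guild's data contracts look like. As a rough illustration of the general idea, the Python sketch below checks an incoming record against an expected set of fields and types before it reaches the warehouse. The contract contents, field names, and the `check_contract` function are assumptions made for this example.

```python
# Hypothetical contract: field name -> expected Python type.
ENROLLMENT_CONTRACT = {
    "learner_id": str,
    "program_id": str,
    "enrolled_at": str,
}


def check_contract(record, contract):
    """Return a list of human-readable violations (empty if none)."""
    violations = []
    for field, expected_type in contract.items():
        if field not in record:
            violations.append(f"missing field: {field}")
        elif not isinstance(record[field], expected_type):
            violations.append(
                f"wrong type for {field}: expected {expected_type.__name__}, "
                f"got {type(record[field]).__name__}"
            )
    # Fields the contract does not know about signal upstream schema drift.
    for field in record:
        if field not in contract:
            violations.append(f"unexpected field: {field}")
    return violations
```

A pipeline could use the returned list to flag a change for review before it affects downstream models, rather than discovering it after a load fails.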

Key insights from Rebecca and Brandon

  • Data Vault helps Guild Education scale their analytics by allowing them to manage schema drift, track point-in-time data, and handle data from multiple sources
  • The implementation of Data Vault forced a conversation on defining business concepts and establishing standards
  • Data Vault's architecture is insert-only, which helps manage changes in upstream data sources
  • The number of tables in Data Vault can significantly increase, making it complex, but it provides tremendous value depending on the business case
  • The implementation of Data Vault requires business buy-in and collaboration
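To make the points about multiple sources and insert-only loading more concrete, here is a minimal Python sketch of how hubs and links are commonly keyed in a Data Vault: each business key is normalized and hashed, so the same entity arriving from two microservices lands on one hub row. The function names, service names, and the use of MD5 are assumptions for illustration and do not come from the talk.

```python
import hashlib


def hash_key(*business_keys):
    """Build a deterministic key from one or more business keys."""
    normalized = "||".join(str(k).strip().upper() for k in business_keys)
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()


def load_hub(hub, business_key, record_source, load_ts):
    """Insert the business key once; later sightings are ignored."""
    key = hash_key(business_key)
    if key not in hub:  # insert-only: never update an existing row
        hub[key] = {
            "business_key": business_key,
            "record_source": record_source,
            "load_ts": load_ts,
        }
    return key


def load_link(link, hub_keys, record_source, load_ts):
    """Record a relationship between hubs, again insert-only."""
    key = hash_key(*hub_keys)
    if key not in link:
        link[key] = {
            "hub_keys": tuple(hub_keys),
            "record_source": record_source,
            "load_ts": load_ts,
        }
    return key
```

Each business concept gets its own hub, each relationship its own link, and each group of descriptive attributes its own satellite, which is one reason the table count grows quickly as sources are added.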