dbt

Siemens' data evolution: dbt Cloud and the data mesh from Coalesce 2023

Team members from Siemens, Tobias Humpert and Nuno Pinela, share their data management and analytics strategies.

“Because we also do it in an automated way, we don't really need to have someone in the morning checking the data... And of course, it brings trust and keeps us from having sleepless nights.”

- Nuno Pinela, finance lead data architect at Siemens

In this Coalesce 2023 session, Tobias Humpert, product owner of the Siemens Data Cloud, and Nuno Pinela, finance lead data architect, describe Siemens' transition to dbt Cloud and its implementation of data mesh principles. They also give an overview of Siemens' business operations and scale and discuss their approach to creating and managing data products.

Siemens leverages dbt Cloud for data analytics and AI projects

Tobias and Nuno describe how Siemens adopted dbt Cloud for its many data analytics and AI projects. As a global company running a wide range of projects, Siemens needed an efficient and flexible data management system to support them.

Tobias explains, "We realized there was the need to evolve, to go into the next direction, to go into Cloud… In January 2022, we finally decided also to go for dbt Cloud." He adds that dbt Cloud has allowed Siemens' teams to work independently and scale their projects.

Nuno also reflects on their data mesh approach, stating, “We also decentralize data products across different teams, which can also be called the data domains themselves. And by doing so, we distribute the responsibility and ownership across those teams, and at the same time, we also make those teams accountable for the data.”

Siemens' implementation of dbt Cloud has significantly improved data management and reduced costs

Moving its data analytics and AI projects to dbt Cloud made Siemens' data management processes considerably more efficient. Nuno and Tobias point to two results in particular: shorter loading times and lower costs.

According to Nuno, "To put all of this into numbers...we were not able to do a daily load in less than 6 hours, whereas now, we are able to do so in 25 minutes. And when it comes to costs, we also saw a reduction of 90% of the costs." Taking six hours as the baseline, that is a daily load at least 14 times faster.
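The figures in that quote can be sanity-checked with a few lines of Python. This is only a back-of-the-envelope sketch: it assumes six hours as the old load time, although the quote gives that only as a lower bound, and the function names are illustrative rather than anything from the talk.

```python
def speedup(old_minutes: float, new_minutes: float) -> float:
    """Return how many times faster the new load is than the old one."""
    if old_minutes <= 0 or new_minutes <= 0:
        raise ValueError("durations must be positive")
    return old_minutes / new_minutes


def percent_reduction(old: float, new: float) -> float:
    """Return the drop from old to new as a percentage of old."""
    if old <= 0:
        raise ValueError("old value must be positive")
    return (old - new) / old * 100


# Assumption: 6 hours stands in for the old load time. The quote says the
# load could not finish in *less* than 6 hours, so these are lower bounds.
old_load = 6 * 60  # minutes
new_load = 25      # minutes

load_speedup = speedup(old_load, new_load)
load_time_cut = percent_reduction(old_load, new_load)

print(f"at least {load_speedup:.1f}x faster")
print(f"load time reduced by at least {load_time_cut:.1f}%")
```

Under that assumption, the load-time reduction is of the same order as the 90% cost reduction Nuno quotes, though the talk does not say the two are directly linked.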

They also improved their data governance and visibility through dbt Explorer. "The current status of our Siemens Data Cloud is… we are using a single dbt instance that is now serving 550 plus developers, working on those existing 85,000 different active models," Nuno adds, illustrating the scale at which Siemens runs dbt Cloud.

Siemens' approach to decentralizing data products and establishing single sources of truth aligns with the principles of a data mesh

Siemens' approach to data management with dbt Cloud follows the principles of a data mesh. Two aspects are central: decentralizing data products across different teams and establishing the data-as-a-product principle.

Nuno states, “We also established a data product to be a single and alternative source of consistency and accuracy that basically aims to reduce all the duplication (duplication of effort, duplication of records, everything), walking into the direction of embracing a culture of a single source of truth.”

Treating each data product as the single source of truth is meant to cut duplicated effort and duplicated records, which in turn makes data handling more efficient and accurate.

Tobias and Nuno’s key insights

  • Siemens has over 300,000 employees worldwide and operates in more than 200 countries
  • Siemens is in a phase of digital transformation and supports other companies in their digital transformations with its solutions
  • Siemens uses dbt Cloud for its data warehousing and has over 800 projects running on its ecosystem
  • Siemens follows the principles of data mesh, which includes self-service data, data as a product, clear ownership based on domains, and federated governance
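To make the ownership and single-source-of-truth ideas more concrete, here is a minimal Python sketch of a catalog in which each domain registers its own data products and a product name can have only one owner. It is not Siemens' implementation and not a dbt API; every class, field, and product name below is hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DataProduct:
    name: str    # hypothetical product name
    domain: str  # owning data domain (team)
    owner: str   # accountable contact within that domain


@dataclass
class Catalog:
    """Shared catalog that domains register their own products in."""

    products: dict[str, DataProduct] = field(default_factory=dict)

    def register(self, product: DataProduct) -> None:
        # Single source of truth: a product name may be registered only once.
        existing = self.products.get(product.name)
        if existing is not None:
            raise ValueError(
                f"{product.name!r} is already owned by domain {existing.domain!r}"
            )
        self.products[product.name] = product

    def owner_of(self, name: str) -> str:
        """Return the accountable owner of a registered product."""
        return self.products[name].owner

    def by_domain(self, domain: str) -> list[str]:
        """Return the sorted product names owned by one domain."""
        return sorted(p.name for p in self.products.values() if p.domain == domain)


catalog = Catalog()
catalog.register(DataProduct("general_ledger", "finance", "finance-data-team"))
catalog.register(DataProduct("open_invoices", "finance", "finance-data-team"))
catalog.register(DataProduct("plant_output", "manufacturing", "mfg-data-team"))

try:
    # A second domain trying to publish the same product is rejected.
    catalog.register(DataProduct("general_ledger", "sales", "sales-data-team"))
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True

print(catalog.by_domain("finance"))
print(catalog.owner_of("plant_output"))
```

Rejecting the duplicate registration is the design choice that matters here: ownership stays with one domain, and consumers always know which team is accountable for a given product.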