
Creating a data warehouse at Trade Me: Planning, operating models, and architecture from Coalesce 2023

Watch Lance Witheridge of Trade Me describe his team's migration to a cloud-based Snowflake data warehouse.

"The mission statement that we came up with was to build a data warehouse that analysts love to use."

Lance Witheridge, Data Modernization Lead at Trade Me, describes Trade Me's journey to modernize its data stack: migrating from an on-prem SQL Server environment to a cloud-based Snowflake data warehouse.

Trade Me's data migration project evolved from a lift-and-shift operation to a complete re-architecture and modernization

Lance initially describes the project as a straightforward migration from an on-prem SQL Server environment to a cloud-based Snowflake data warehouse. However, viewing the project through a product lens and considering the problems their customers were solving transformed its scope. "...this project morphed from being a kind of simple, lift-and-shift cloud migration to a…total modernization of the tech stack," he explains.

The shift towards a product-oriented mindset enabled Trade Me to redirect the project to better serve their customers. "The way this came to be was by applying some product thinking to what we were doing," he adds.

Trade Me's data architecture strategy was to build a data warehouse that analysts would love to use

Lance emphasizes the importance of designing a data warehouse that would be user-friendly for analysts. Instead of simply replicating the existing data warehouse, Trade Me’s team wanted to create something that analysts would find intuitive and efficient. "We came up with the next important question which is, ‘What product are we trying to build?’” he explains.

The focus on the end-user experience was a crucial part of their reengineering process. Lance elaborates, "Our product is the data platform from which those reports are built. Our customers are not the commercial team. They're not the finance team. Our customers are the insights analysts who are building those reports."

This user-centric approach ultimately informed the design and structure of Trade Me’s new data architecture. Instead of structuring data according to production systems, they structured it to meet end-user requirements. "The data was structured and named for the end user requirements...it's structured for how analysts want to see it," Lance concludes.
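
As a rough illustration of that idea, a dbt model can select from a staging layer and rename cryptic source-system fields into the terms analysts actually use. The model and column names below are hypothetical, not from Trade Me's project:

```sql
-- Hypothetical mart model, e.g. models/marts/listings.sql
-- Renames source-system fields into the terms analysts query by.
with listing_events as (

    -- {{ ref() }} builds on an upstream staging model
    select * from {{ ref('stg_listing_events') }}

)

select
    listing_id,
    cat_lvl_1               as category,            -- production name -> business name
    strt_dt                 as listing_start_date,
    rsv_price_cents / 100.0 as reserve_price_nzd
from listing_events
```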

Trade Me adopted dbt for their data migration due to its SQL base and built-in documentation capabilities

The choice of dbt for Trade Me's data migration was influenced by its SQL base, which appealed to the analysts. It let them write the SQL they were already comfortable with rather than navigate low-code or no-code tools. "Back when we were looking at getting insights analysts to do the data modeling, we looked at some sort of these ‘low code, no code’ tools… the data insights analysts were just saying to us, ‘Can we just write SQL?’”

dbt also had the advantage of consolidating everything in one place, which simplified the process. Its built-in documentation was another significant benefit, though it still takes some effort. "dbt does make it easier, but it still is a little bit painful… particularly if you've got a 40-column table…” says Lance.
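
For context on what that documentation work looks like, dbt lets you declare descriptions (and tests) column by column in a YAML properties file alongside the model, which is where the effort adds up on a wide table. This is a minimal hypothetical sketch, not Trade Me's actual project files:

```yaml
# Hypothetical properties file, e.g. models/marts/listings.yml
version: 2

models:
  - name: listings
    description: "One row per listing, structured and named for analysts."
    columns:
      - name: listing_id
        description: "Unique identifier for a listing."
        tests:
          - unique
          - not_null
      - name: category
        description: "Top-level marketplace category."
      # ...one entry per column, which is the painful part on a 40-column table
```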

Despite the challenges, the adoption of dbt has proven beneficial for Trade Me. It not only enhanced their data migration process but also fostered a culture of rapid iteration and experimentation.

Lance’s key insights

  • Trade Me's migration to Snowflake began as a simple lift-and-shift cloud migration but evolved into a complete re-architecture and modernization of the tech stack
  • The project shifted from an engineering mindset to product thinking, considering who the customers were, what problems they faced, and how to best serve them
  • The new data architecture focused on structuring data for end users rather than being tied to the production databases
  • dbt was chosen for its SQL basis; it encouraged modularity, built-in testing, and documenting as you go
  • Trade Me managed to reduce their Snowflake costs significantly by making changes in their data management and optimizing their processes