Car & Classic speeds up report load time by 10x with Metaplane, Snowflake, and dbt Cloud
This is the story of how Car & Classic ensured trust in their data with Metaplane, Snowflake, and dbt Cloud
Car and Classic is Europe’s leading classic car marketplace trusted by millions of buyers and sellers monthly. Because cars are bought and sold on the platform, often quickly and at high prices, the infrastructure needs to be reliable, robust, and secure.
James Sharwin, Head of Data, leads a nimble team of himself and one other data specialist. They’re responsible for collecting, organizing, and understanding Car and Classic’s influx of data, which is used by more than 100 stakeholders every day. This includes data in the core database, the CRM, marketing platforms, and even the finance platform.
“Our chief responsibility is helping all teams across the company use data to drive better decision-making,” said James. “That means organizing the data, providing dashboards and visualizations to our teams, and working hand-in-hand with our engineers so we have the data they need to drive the business forward.”
Even though widespread accessibility was a priority, it was difficult to make it a reality with their current tools. After thorough evaluation of different solutions, James leveraged Snowflake, dbt, and Metaplane to overcome the challenges the data team faced.
Challenges: database performance, lack of centralization, and long iteration cycles
James and his team faced a number of challenges when it came to making data accessible to the team.
Because the team was using a MySQL database and often ran multiple complex queries at the same time, data either took more than 30 seconds to load or was unusable because it never loaded at all.
The data was also neither centralized nor easily accessible. Because transformations were not defined in one place and analysis was slow, work was often repeated and redundant: teammates would create different definitions of the same metric without realizing it. Attribution questions in some areas, such as marketing spend, were virtually unanswerable.
Performance and a lack of centralization not only impacted how frequently data was used — it also introduced delays in the iteration cycles of the data team. It’s difficult to empower teammates to analyze data and find insights when the underlying database is slow and they have no consistent metrics to rely on.
“The amount of time it took to go from the inception of an idea to its implementation in our analytics was far too long.”
All of these challenges eroded trust in the quality of the data. “Our data lacked integrity and our team didn’t trust it. They were even skeptical of our raw data,” said James.
“You can have the best data setup and fanciest dashboards but if people don’t trust the data then it’s all for nothing.”
James knew he needed to redesign how Car and Classic approached data so that it could be trusted throughout the organization and be a driver in pushing the business forward. With trustworthy and accessible data, all teams at Car and Classic could make better-informed decisions.
Choosing Snowflake, dbt, and Metaplane
To solve his challenges, James implemented Snowflake, dbt Cloud, and Metaplane together. Each tool offered benefits on its own, but the best results came with all three tools working in sync.
Snowflake: seamless set-up with unmatched ease-of-use
James began looking for a data cloud platform and quickly landed on Snowflake. For James and his team, Snowflake was an easy decision because of the immediate performance gains, their relatively straightforward cost model, integrations with every tool they wanted to use, and their quick and easy setup process.
“Snowflake was ridiculously easy for us to set up. It took 2 hours, maybe less, for it to be ready to go.”
After adopting Snowflake, James and his team felt the benefits immediately. Given the data cloud’s ability to handle any analytics use case and the helpful built-in functions, the team improved load times of every report by 10x.
“Snowflake was easy to use for any analytical use case and has every function under the sun. Load times of reports became near-instant, with at least 10x performance improvements.”
dbt Cloud: engineering best practices, data quality, and velocity
For James and his team, there was no better tool than dbt Cloud to increase the speed at which data products were built, in addition to helping teammates have visibility into the data and gain an understanding of how things are derived.
Before dbt Cloud, the data team was creating massive nested queries using Metabase. After implementing dbt Cloud, James and his team created simpler modeling and orchestration all while introducing engineering best practices like opening pull requests, reviewing code, and adding tests. “In some ways, it’s going slow to move fast,” shared James. By defining models using code, the team avoided having to re-develop the same models 50 times—a problem that wasted time and resources, and caused bugs in production.
Tools like the dbt Cloud IDE made it simple for the team to get started and scale. The straightforward API made it easy to run more complex orchestration outside of the dbt Cloud environment when required. For example, James added a job to push metadata from dbt Cloud to Metabase, which provided documentation to teammates where they viewed the data.
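As a rough illustration of what running orchestration outside of dbt Cloud can look like, the sketch below builds a request that triggers a dbt Cloud job from an external script. It is not Car and Classic's actual job. It assumes the "trigger job run" endpoint of the dbt Cloud Administrative API v2 and token authentication; the host, `ACCOUNT_ID`, `JOB_ID`, and the token are hypothetical placeholders, and the details should be checked against the current dbt Cloud documentation.

```python
import json
import urllib.request

# Hypothetical placeholders: replace with values from your own dbt Cloud account.
DBT_CLOUD_HOST = "https://cloud.getdbt.com"
ACCOUNT_ID = 12345
JOB_ID = 67890


def build_trigger_request(api_token: str, cause: str) -> urllib.request.Request:
    """Build (but do not send) a POST request asking dbt Cloud to run a job."""
    url = f"{DBT_CLOUD_HOST}/api/v2/accounts/{ACCOUNT_ID}/jobs/{JOB_ID}/run/"
    # The request body carries a human-readable reason for the run.
    body = json.dumps({"cause": cause}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Token {api_token}",
            "Content-Type": "application/json",
        },
    )


request = build_trigger_request("dummy-token", "Triggered by external orchestrator")
print(request.get_method(), request.full_url)
# To actually send it (needs a real token): urllib.request.urlopen(request)
```

A follow-up step, such as pushing documentation into a BI tool, could then run once the job has finished.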
“With the dbt Cloud IDE, we have a consistent environment where everything is the same. You don’t need to deal with Python package dependencies or VS Code extensions”—simplicity essential to onboarding new members of the team.
James and the team felt the benefits of dbt Cloud by increasing the speed at which data products were built and reducing the number of data quality issues. “Before I joined and implemented dbt Cloud and Snowflake, it used to take ages to build data products because the data team was re-inventing the wheel with code each time. If you wanted to build a dashboard about auction sale rate, you’d need to copy and paste the same 50 lines of SQL and adjust it,” shared James.
Not needing to maintain hundreds of definitions scattered across different places also translated into fewer data quality issues. With dbt Cloud’s standardized model views of data, data engineers and analysts can adjust and re-use previous work in one place, reducing the time and complexity of developing dashboards. The result was game-changing; analysts now build dashboards with three lines of SQL.
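To make the "define once, re-use everywhere" idea concrete, here is a minimal sketch that uses Python's built-in `sqlite3` module, with a SQL view standing in for a dbt model. The `auctions` table, the `auction_sale_rate` view, and the sample rows are invented for illustration and are not Car and Classic's real schema or data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical raw table with made-up rows.
    CREATE TABLE auctions (id INTEGER PRIMARY KEY, month TEXT, status TEXT);
    INSERT INTO auctions (month, status) VALUES
        ('2023-01', 'sold'), ('2023-01', 'unsold'),
        ('2023-01', 'sold'), ('2023-01', 'sold'),
        ('2023-02', 'sold'), ('2023-02', 'unsold');

    -- The shared model: sale rate is defined once, in one place.
    CREATE VIEW auction_sale_rate AS
    SELECT month,
           AVG(CASE WHEN status = 'sold' THEN 1.0 ELSE 0.0 END) AS sale_rate
    FROM auctions
    GROUP BY month;
    """
)

# A dashboard query is now three short lines against the shared model.
dashboard_sql = """
SELECT month, sale_rate
FROM auction_sale_rate
ORDER BY month
"""
rows = conn.execute(dashboard_sql).fetchall()
print(rows)  # [('2023-01', 0.75), ('2023-02', 0.5)]
```

If the definition of a sale ever changes, only the view needs updating; every dashboard that reads from it picks up the new definition.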
The way dbt maps out complex transformations was also immediately helpful to the broader team. For example, when engineers have questions about a particular figure in a dashboard, the data team can point them to the specific models, SQL, and lineage behind it.
“With dbt, I can give someone access to the project; they can look through and see everything like metric definitions and lineage to see where every piece of data comes from and which reports it feeds.”
In terms of where dbt fits in the stack, James describes it as the “top of the Christmas tree.” Engineers can see everything from the tool. It provides a repeatable and consistent way to provide visibility to the broader organization.
Metaplane: trust across the data stack
For James, one of the worst recurring things he’s experienced is a downstream data consumer telling him that published data is incorrect. There have been numerous times in his career where someone mentions data that looks wrong in a dashboard, and the problem is obvious, but even with tests, it’s easy to miss given the scale of data he manages.
“You will never be able to write a test suite that covers every eventuality, and there is nothing worse as a data professional than when bugs and erroneous source data slip through your controls and you’re one of the last to notice. It erodes trust in both the technical architecture and you as a data practitioner.”
James wanted to make sure no issues slipped through the cracks. Metaplane integrated into Car and Classic’s existing stack better than any other tool. After a 15-minute setup, the platform began automatically monitoring their data.
Since implementation, the team has caught issues in production-level systems at least four times, saving hours of work necessary to identify the issue and weeks of work to fix affected data.
In one case, Metaplane proactively alerted James’ team about an anomalous decrease in the volume of website log data. This data included important metadata about auctions like auction statuses and timelines. After receiving the alert, James was able to identify a cron script that was accidentally deleting data older than a year. If Metaplane had not identified this issue, data would have been lost and difficult to recover.
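The general idea behind a volume alert like this one can be sketched with a simple statistical check. The example below is a deliberately simplified stand-in, not Metaplane's actual method: it flags a day whose row count falls more than a chosen number of standard deviations below the recent mean. The function name, the threshold, and the row counts are all hypothetical.

```python
from statistics import mean, stdev


def is_volume_drop(history: list[int], latest: int, threshold: float = 3.0) -> bool:
    """Return True when `latest` is more than `threshold` standard
    deviations below the mean of `history`."""
    if len(history) < 2:
        return False  # not enough history to estimate spread
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return latest < mu  # flat history: any decrease counts as a drop
    return (mu - latest) / sigma > threshold


# Made-up daily row counts for a log table.
daily_row_counts = [10_120, 9_980, 10_050, 10_200, 9_900, 10_075, 10_010]

print(is_volume_drop(daily_row_counts, 10_030))  # False: within normal range
print(is_volume_drop(daily_row_counts, 6_400))   # True: sharp decrease
```

A production monitor would also need to account for trend and seasonality, which a fixed threshold like this one ignores.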
Another interesting discovery came from Metaplane’s ML-based monitoring. Over time, the distribution of auction prices changed in a way that was not consistent with seasonal changes. The team would not have noticed this shift without active monitoring, and was able to explain it to the broader team.
In addition to receiving proactive alerts, James’ team leverages Metaplane’s downstream impact analysis to make every alert actionable. For example, James has received alerts regarding important data that is operationalized by the marketing team; he can immediately notify the team and show the organization that his team is on top of any data incident, further preserving trust. James is now the one catching the issues as opposed to having team members bring them to his attention. “Before anyone in the organization brings an issue to my attention, I know about it,” he said. “Thanks to Metaplane, I’m the one telling people, which puts me in the driver’s seat and fosters more trust.”
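Downstream impact analysis boils down to walking a lineage graph from the affected table to everything that depends on it. The sketch below shows that idea with a breadth-first traversal over a hand-written dictionary. It is an assumption-laden illustration rather than Metaplane's implementation, and every asset name in `LINEAGE` is hypothetical.

```python
from collections import deque

# Hypothetical lineage: each key feeds the assets listed in its value.
LINEAGE = {
    "raw.website_logs": ["stg_auction_events"],
    "stg_auction_events": ["auction_sale_rate", "marketing_attribution"],
    "auction_sale_rate": ["dashboard.auction_performance"],
    "marketing_attribution": ["dashboard.marketing_spend"],
}


def downstream_assets(lineage: dict[str, list[str]], start: str) -> list[str]:
    """Breadth-first walk returning every asset reachable from `start`,
    in the order it is first visited."""
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for child in lineage.get(node, []):
            if child not in seen:
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order


print(downstream_assets(LINEAGE, "stg_auction_events"))
```

With a list like this in hand, an alert on one table can be routed to the owners of the dashboards it ultimately feeds, such as the marketing team.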
- A scalable, trusted data stack built on Snowflake, dbt Cloud, and Metaplane that can serve various teams and over 100 stakeholders across the organization.
- Immediate performance improvements thanks to Snowflake; seamless integration with Car and Classic’s existing and future tooling.
- Dependable centralization and visibility provided by dbt, particularly in helping the team understand data lineage.
- A consistent development environment with near-zero engineering overhead via dbt Cloud.
- Increased data trust with Metaplane; the data team now catches issues before stakeholders do.
- 8 hours/week saved by catching issues proactively with Metaplane.
- Reduced time to identify data quality issues from weeks to hours.
James is planning on increasing the team size over the next six to twelve months to tackle new projects. He hopes to increase the use of event-level data to enable Car and Classic to better understand marketing attribution, provide recommendations, and more thoroughly test and measure changes to their platform.
“I believe investing in the right tools to do the job is often better than hiring two full-time engineers to do the same thing,” concluded James.