Driving scalability at Carsales: The democratization of our data platform for enhanced efficiency from Coalesce 2023
Adam Carbone, Data Engineering Lead at Carsales, discusses his team's approach to data engineering, silos, and data mesh.
"We moved into a centralized data team structure, so we kind of brought all the different data people from around the business into a single team."
Adam Carbone, Data Engineering Lead at Carsales, traces the data journey of Carsales, a company that operates an online marketplace for buying and selling cars. He covers the challenges the team faced and shares insights on their approach to data engineering, data silos, and the concept of data mesh.
The evolution of data teams in organizations and the challenges they face
Adam explains how the data teams at Carsales were initially separate entities embedded in different parts of the business, which led to data silos.
"We had these different data teams that were kind of embedded in different parts of the business, and what that situation led to was kind of these data silos. So, the teams operated pretty independently of each other," Adam says.
He highlights the changes that had to be made, such as centralizing the data team and grouping everyone by discipline. He explains, "About two and a half years ago, we moved into a centralized data team structure, so we kind of brought all the different data people from around the business into a single team."
The role of dbt in maintaining and enforcing standards in data transformation
Adam discusses the challenges faced by the data engineering team in maintaining various pipelines and how this led to the exploration and implementation of dbt for data transformation.
He explains the drawbacks his team experienced doing data transformation without dbt: "...we had a range of different Python code that was orchestrated in a number of different ways...a very fragmented system."
He emphasizes that dbt played a crucial role in enforcing standards and practices, alongside open and ongoing communication among the team members doing analytics engineering work. "dbt brings software engineering type practices to data...it's just about having a consistent approach to the way we're building our data models," says Adam.
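To make "a consistent approach" concrete, here is a minimal sketch of one kind of standard a team might automate: a check that model names follow a shared naming convention. It is not taken from the talk. The layer prefixes (`stg_`, `int_`, `fct_`, `dim_`), the function names, and the example model names are all assumptions chosen for illustration, and the check is plain Python rather than any dbt API.

```python
import re

# Assumed convention (not from the talk): every model name starts with a
# layer prefix and is written in lowercase snake_case.
ALLOWED_PREFIXES = ("stg_", "int_", "fct_", "dim_")
NAME_PATTERN = re.compile(r"^[a-z][a-z0-9_]*$")


def check_model_name(name: str) -> list[str]:
    """Return a list of convention problems for one model name (empty if none)."""
    problems = []
    if not NAME_PATTERN.match(name):
        problems.append("name must be lowercase snake_case")
    if not name.startswith(ALLOWED_PREFIXES):
        problems.append("name must start with one of: " + ", ".join(ALLOWED_PREFIXES))
    return problems


def check_models(names: list[str]) -> dict[str, list[str]]:
    """Map each non-conforming model name to its list of problems."""
    return {name: problems for name in names if (problems := check_model_name(name))}


if __name__ == "__main__":
    # Hypothetical model names, for illustration only.
    for name, problems in check_models(["stg_listings", "dim_dealer", "tmpSales"]).items():
        print(f"{name}: {'; '.join(problems)}")
```

A check like this could run in continuous integration so that a naming discussion happens once, when the convention is agreed, rather than in every review.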
The hybrid approach to data mesh implementation
Adam also touches on the topic of data mesh, acknowledging the risks and challenges in its implementation. He describes how they decided on a hybrid data mesh model to balance centralization and decentralization.
"We’re still somewhat centralized, but also still giving the platform as a self-service tool that other teams can still use for their own purposes," explains Adam.
He emphasizes that the choice does not have to be a binary one between total centralization and total decentralization; the model can evolve with the needs of the business. He adds, "It doesn't need to be one or the other. You can kind of have something in between, which is where we've landed."
Insights surfaced
- Carsales faced significant challenges with data silos due to different data teams operating independently of each other
- The company moved to a centralized data team structure to improve communication and eliminate data silos
- Carsales adopted Snowflake as a platform and dbt for data transformations to streamline their data processes
- The company implemented a hybrid data mesh model, where some core data pipelines are built and owned by the centralized data team, while other teams build and own their data pipelines for their specific purposes
- Regular communication and knowledge-sharing sessions are crucial to maintaining consistent standards and practices in data engineering