Scaling analytics engineering: Establishing a data-driven ecosystem within Virgin Media O2 using dbt from Coalesce 2023

Team members from Virgin Media O2, Oliver Burt, Lead Analytics Engineer, and Jason Jones, Analytics Engineering Manager, discuss scaling data at Virgin Media.

“We've noticed a 10x productivity increase compared to the legacy deployments... And we have 100+ hours of analyst technical debt and issues that we've now removed…”

- Jason Jones, Analytics Engineering Manager at Virgin Media O2

In their talk, Oliver and Jason describe their journey, explain how they’ve scaled their environment, and dive deep into their data modeling process.

Virgin Media's journey towards digital transformation and the role of modern data tools

Virgin Media's journey towards digital transformation started two years ago when the company decided to transition its traditionally on-premise infrastructure to a modern data stack. This shift involved using modern tools like dbt, GitLab, GCP, and Looker, among others. The company transitioned from a central warehousing team and multiple SQL servers to a hybrid model consisting of a centralized data team and various domain-specific teams.

"Two years ago, Virgin Media decided to go in a different direction as a large telecom business. Historically, it's been a large on-premises infrastructure. Going forward, they realized as a corporation that as a digital transformation, they needed to move to a modern data stack," Jason explains. He highlights the importance of centralizing their data, stating, "We look after the central, core, gold standard data models which are then fed to other spoke teams."

The importance of a gradual approach and choosing the right tools when scaling

In scaling its operations from zero to over 200 users, Virgin Media focused on three key areas: people, processes, and tooling. The team emphasizes the importance of not trying to build everything at once, but rather focusing on growth first and gradually adding maturity to processes and tools. They also explain that they chose dbt Cloud over dbt Core because of advantages such as easy setup, maintenance, and training.

Jason emphasizes the need for gradual scaling, saying, "Don't try to build everything at once… Instead, focus on the growth. Focus on the scale. Then, as you go along, start to add that maturity in." He expands on their choice of dbt Cloud, stating that "the obvious benefits are ‘software as a service.’ This allows us to get up and running straight away. It also means that we don't have to spend time turning cogs, looking after infrastructure…"

The role of denormalized data in scaling

For data modeling, Virgin Media adopted a denormalized approach rather than using the often-preferred star schema. They found that denormalized tables, although initially intimidating due to the large number of columns, could be more efficiently queried and processed, reducing costs and the time taken for analysts to write queries. This approach was particularly beneficial when used with structuring mechanisms and modular processes.

"We see a denormalized approach as optimized for big data. It definitely helps to standardize the approach… It reduces that barrier to entry for any kind of new starters," says Oliver. He also explains the cost and time benefits, stating, "We think it increases scalability and gives faster insights, again because those teams don't have to worry about building out their own models. They can take yours and just add to them, which gives you more agile decision-making, and then it helps to break down silos…"
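
To make the trade-off concrete, here is a minimal sketch in Python using the standard-library sqlite3 module. It is not Virgin Media's actual model: the table names (dim_customer, dim_product, fct_orders, orders_wide), the columns, and the sample rows are all hypothetical. It only illustrates the general idea that a denormalized table resolves joins once, upstream, so an analyst's query needs no joins.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical star schema: one fact table and two dimensions.
    CREATE TABLE dim_customer (customer_id INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, product_name TEXT);
    CREATE TABLE fct_orders (
        order_id INTEGER PRIMARY KEY,
        customer_id INTEGER,
        product_id INTEGER,
        amount REAL
    );

    INSERT INTO dim_customer VALUES (1, 'North'), (2, 'South');
    INSERT INTO dim_product VALUES (10, 'Broadband'), (20, 'Mobile');
    INSERT INTO fct_orders VALUES
        (100, 1, 10, 30.0),
        (101, 1, 20, 15.0),
        (102, 2, 10, 30.0);

    -- Hypothetical denormalized (wide) table: the joins happen once, here.
    CREATE TABLE orders_wide AS
    SELECT o.order_id, o.amount,
           c.customer_id, c.region,
           p.product_id, p.product_name
    FROM fct_orders AS o
    JOIN dim_customer AS c ON c.customer_id = o.customer_id
    JOIN dim_product AS p ON p.product_id = o.product_id;
""")

# Against the star schema, the analyst has to know the join keys.
star_query = """
    SELECT c.region, SUM(o.amount) AS revenue
    FROM fct_orders AS o
    JOIN dim_customer AS c ON c.customer_id = o.customer_id
    GROUP BY c.region
    ORDER BY c.region
"""

# Against the wide table, the same question is a single-table query.
wide_query = """
    SELECT region, SUM(amount) AS revenue
    FROM orders_wide
    GROUP BY region
    ORDER BY region
"""

star_result = conn.execute(star_query).fetchall()
wide_result = conn.execute(wide_query).fetchall()
print(star_result == wide_result)  # both queries answer the same question
```

In a dbt project the wide table would typically be a SQL model rather than a statement run from Python; the sketch only shows the shape of the two query styles.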

The role of continuous improvement, standardization, and respect for people in data modeling

Virgin Media’s team emphasizes the importance of continuous improvement, standardization, and respect for people in their data modeling process. They believe in being responsive to their customers' needs, catching errors early, and encouraging everyone on the team to speak up if they see an issue. They achieve standardization by using macros wherever possible, and they aim to reduce waste by removing redundant data and code from the system.

"We believe in continuous improvement, so we don't have such a thing as ‘sign off.’ Once a model is available for everyone to use, we will continue to develop upon that model, increase on the new features, and fix any bugs that come into it," explains Oliver. He also highlights the importance of respect for people, stating, "Anyone in the analytics engineering team should be able to stop anything. They're the ones who understand the data and the potential risks."
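
The talk summary does not show the team's macros, so the following is only an illustrative analogue. dbt macros are written in Jinja and SQL; this Python sketch mimics the same idea with a plain function that generates one shared SQL expression, plus a simple null check in the spirit of catching errors early. The names pence_to_pounds, count_nulls, and the payments table are hypothetical.

```python
import sqlite3


def pence_to_pounds(column: str, scale: int = 2) -> str:
    """Return one shared SQL expression, so every model converts the same way."""
    return f"ROUND({column} / 100.0, {scale})"


def count_nulls(conn: sqlite3.Connection, table: str, column: str) -> int:
    """Count NULLs in a column; a non-zero result flags a problem early."""
    query = f"SELECT COUNT(*) FROM {table} WHERE {column} IS NULL"
    return conn.execute(query).fetchone()[0]


conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical source table with one deliberately missing amount.
    CREATE TABLE payments (payment_id INTEGER, amount_pence INTEGER);
    INSERT INTO payments VALUES (1, 1999), (2, 500), (3, NULL);
""")

# The "model" reuses the shared expression instead of re-typing the logic.
model_sql = (
    f"SELECT payment_id, {pence_to_pounds('amount_pence')} AS amount_pounds "
    "FROM payments ORDER BY payment_id"
)
rows = conn.execute(model_sql).fetchall()

# The check surfaces the missing amount before anyone builds on the model.
null_amounts = count_nulls(conn, "payments", "amount_pence")
print(rows, null_amounts)
```

Keeping the conversion in one place means a fix to the expression reaches every model that uses it, which is the standardization benefit the team attributes to macros.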

Oliver and Jason's key insights

  • Scaling an enterprise requires focus on people, processes, and tooling
  • Adopting modern tooling such as dbt, GitLab, GCP, and Looker can help in digital transformation
  • It's important not to try and build everything at once when starting from scratch or scaling
  • dbt Cloud offers the benefits of “software as a service,” such as quick setup and reduced infrastructure maintenance
  • A modular approach to data modeling can help reduce complexity and improve efficiency
  • Denormalization can help optimize for big data and reduce the barrier to entry for new starters
  • Continuous improvement, standardization, and error-proofing are key principles in their team's work
  • Virgin Media has seen a 10x productivity increase and removed 100+ hours of analyst technical debt through its transformation and scaling efforts