Watercare builds a resilient and scalable data warehouse with dbt Cloud
This is the story of how Watercare went from manual processes to a modern analytics architecture using dbt Cloud.
From Water Pipelines to Data Pipelines
Watercare provides essential water and wastewater services to over 1.7 million people across the greater Auckland region in New Zealand. As a utility company responsible for critical water infrastructure worth billions of dollars, Watercare relies heavily on data to optimize its operations and assets.
However, Watercare’s legacy architecture made it difficult to get value from the vast quantity of data it was generating. Siloed systems led to unclear data lineage, hindering troubleshooting, while a lack of data modeling and transformation capabilities limited the team’s ability to perform analytics.
Diego Morales, former Analytics and Insights Practice Lead, said: “Watercare has terabytes and terabytes of data to manage. We easily fall into the ‘big data’ category. But when I stepped into the role of Analytics and Insights Practice Lead, there wasn’t a solid strategy on how to harness this data for insights.”
This lack of insight increased in severity as the business handled growing volumes of data from customers, facilities, and users. With no way to easily manage this information, it became clear that Watercare needed a modern data solution.
“We couldn’t drive trends, and we couldn’t understand when something changed in our system,” Diego explained. “We didn’t have any data lineage. The only way we could track data was by looking directly at the code…which made it hard to understand and trust the information it was producing.”
“It was clear we weren’t going to get much out of our existing set-up. We needed a tool that allows us to build reliable data models.”
Leveraging high-volume smart meter data
The team began exploring options for a modern data stack. After receiving a recommendation from a contact in the data sector, Diego evaluated dbt Cloud.
“After doing the research into the best tools for the job, it became obvious that dbt Cloud was the standard,” he noted. “If you speak to people in the industry, everywhere, everyone’s talking about dbt.”
Despite their enthusiasm for the tool, however, the team still needed to demonstrate its effectiveness to the rest of the business. To do this, they used dbt to power a proof of concept (POC) project to develop dashboards drawing on data from the business’ smart meters.
These meters are intelligent, networked devices that digitally measure and record water consumption in real time. Diego explained that the project turned out to be extremely useful, as the support team is regularly contacted by customers querying unusual billing patterns. When these unusual patterns include a sudden increase in usage, the Watercare team investigates potential leaks.
“With our new data model, we can offer a clear visual representation of their consumption trends and easily spot any spikes,” Diego said. “This data means we can swiftly notify our smart meter team about potential issues, such as leaks, tampering, or changes in pipe pressure. This is extremely valuable for teams that are deployed in the field.”
The project was, in Diego’s words, “incredibly successful.” Despite starting as a small proof of concept, the dashboard quickly became Watercare’s most frequently used data tool.
“Our customer service team uses it every day,” he said. “Customers might ask them to confirm they don’t have a leak, and with a quick check of the dashboard, they can provide answers. Out of the 600 dashboards deployed within the organization, it’s the one people access the most. That’s big.”
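To make the idea of spotting a consumption spike concrete, here is a minimal sketch in Python. It is not Watercare's model: the function name `find_spikes`, the 7-day window, the 2x threshold, and the sample readings are all assumptions chosen for illustration. It flags any day whose usage is more than twice the median of the preceding week.

```python
from statistics import median


def find_spikes(daily_litres, window=7, factor=2.0):
    """Return indices of days whose usage exceeds `factor` times
    the median of the preceding `window` days (hypothetical rule)."""
    spikes = []
    for i in range(window, len(daily_litres)):
        baseline = median(daily_litres[i - window:i])
        # Skip zero baselines so an idle property does not flag every reading.
        if baseline > 0 and daily_litres[i] > factor * baseline:
            spikes.append(i)
    return spikes


# Made-up daily readings in litres for one meter.
readings = [410, 395, 420, 405, 415, 400, 390, 398, 1250, 1300, 402]
print(find_spikes(readings))  # prints [8, 9]
```

A median baseline is used here rather than a mean so that one unusually high day does not immediately raise the threshold for the days that follow.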
Building Trust and Speed with dbt Cloud
The success of the POC project convinced the organization of the benefits of trusted analytics and responsive dashboards. The data team was entrusted with NZ$2.5 million in funding to expand the use of dbt Cloud. Over the next year, Watercare went from using just 25 data models across its business to being able to produce 175 with the help of dbt Cloud.
As the use of this data grew, a solid foundation of testing and documentation assured analysts and other users that they could trust the numbers they were seeing. For example, the team was able to integrate its CI/CD pipelines with the dbt Cloud API, which made both collaboration and governance much easier.
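The article does not describe how Watercare wired its pipelines to the API. As one possible shape, the sketch below builds a request against the dbt Cloud v2 job-trigger endpoint, which a CI step could send after a merge. The account ID, job ID, token, and host are placeholders, and the helper name `build_job_trigger` is hypothetical.

```python
import json
import urllib.request


def build_job_trigger(account_id, job_id, token, cause, host="cloud.getdbt.com"):
    """Build (but do not send) a POST request that asks dbt Cloud to run a job.

    All identifiers passed in are placeholders; the host varies by
    dbt Cloud region and plan.
    """
    url = f"https://{host}/api/v2/accounts/{account_id}/jobs/{job_id}/run/"
    body = json.dumps({"cause": cause}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Token {token}",
            "Content-Type": "application/json",
        },
    )


req = build_job_trigger(12345, 67890, "PLACEHOLDER_TOKEN", "Triggered by CI")
print(req.get_method(), req.full_url)
# A CI step would then send it with: urllib.request.urlopen(req)
```

Building the request separately from sending it keeps the example safe to run without credentials and makes the call easy to inspect in a pipeline log.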
“We have started documenting our dbt models in the open so different teams can discover and reuse existing logic. And we can work together on new transformations that meet multiple needs. On the data team side, adopting dbt has helped us work more efficiently and align us more tightly with the goals of the analytics users we aim to empower,” said Diego.
“It’s a complete game-changer.”
At the same time, Watercare has been making use of dbt’s community hub. The open-source community lets the team accelerate development by taking ready-made packages, supplied and documented by members of the dbt community, and adapting them to suit its own needs.
“All of the improvements driven by dbt Cloud are starting to add up,” explained Diego. “We can put testing in place and enforce different rules to ensure that the data going to users is going to be correct without having to spend time checking them.”
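In dbt itself, rules like these are usually declared as data tests such as `not_null` and `unique` alongside the model definitions and run against the warehouse. The Python below is only an illustration of what two such rules check; the function names and the sample rows are made up, and it is not how dbt implements its tests.

```python
from collections import Counter


def not_null_failures(rows, column):
    """Return the rows where `column` is missing or None."""
    return [row for row in rows if row.get(column) is None]


def unique_failures(rows, column):
    """Return the non-null values of `column` that appear more than once."""
    counts = Counter(row[column] for row in rows if row.get(column) is not None)
    return sorted(value for value, n in counts.items() if n > 1)


# Made-up rows standing in for a meter-readings model.
meter_reads = [
    {"meter_id": "M-001", "litres": 410},
    {"meter_id": "M-002", "litres": None},
    {"meter_id": "M-002", "litres": 395},
]

print(not_null_failures(meter_reads, "litres"))  # prints [{'meter_id': 'M-002', 'litres': None}]
print(unique_failures(meter_reads, "meter_id"))  # prints ['M-002']
```

A rule passes when its list of failures is empty, which is what lets a pipeline stop bad data before it reaches a dashboard.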
Delivering Value Across the Business: Machine Learning and Custom Analytics
Watercare successfully transitioned from a series of disjointed legacy data silos to an agile, analytics-focused data architecture. This modern foundation has prepared Watercare to support advanced use cases like working with IoT sensors and has allowed more and more areas of the business to start making the most of its data.
The business’ critical financial and work order reporting is now built on top of dbt models, and users are being encouraged to self-serve—using dbt Cloud to explore the data without the need for an expert to walk them through it.
This risk-free accessibility is helping Watercare educate teams on the value of data as an asset rather than a static means to an end.
“We’re bridging that gap because we’re providing self-service analytics,” Diego explained. “The POC project created a model and used it to generate reports, but we’ve also used it to provide capacity for self-service analytics.
“We are using dbt Cloud to produce and create many data models, but now people can take all this highly organized data in dbt and use it to create new use cases for themselves.”
Watercare’s trusted dbt models also feed machine learning pipelines, used to create tools that run automatically and spot potential errors.
“We managed to produce such a high-quality data model that now consuming it into machine learning became easy,” Diego explained. “Immediately after that, because the data was in good condition, we started a process to build an ML model to detect any anomalies in the data.”
Building for Better Water Quality
Looking ahead, Watercare has plans to further enhance analytics by integrating Databricks, a lakehouse platform for machine learning workloads. The team is also looking to integrate new data sources into its pipeline, pulling in information from sensors able to measure parameters like temperature and pollution levels.
“That data is, as you can imagine, incredibly fundamental for a company that works with water. We’re going to take all of that information and move it into dbt Cloud,” concluded Diego.