How SpotOn reduced time to actionable insights by 6x
This is the story of how SpotOn built a scalable data platform with Snowflake, dbt Cloud, and Metaplane
SpotOn is a rapidly growing business that offers mobile payment processing and management software for restaurants and small businesses. As the company has scaled, data has increasingly become a differentiator to drive the business forward. More than 500 team members rely on data on a daily basis to make decisions and SpotOn’s customers rely on data for merchant reporting and a recommendation engine to power better online ordering experiences.
With this widespread integration of data across the business came new challenges for the data team. Ben Cohen, the Data Engineering Team Lead at SpotOn, and his team were running into bottlenecks in the performance, accessibility, and engineering workflows for iterating on data.
The performance of their Postgres database was routinely slow, causing delayed ETL jobs and degraded BI reporting experiences. The database was undersized and tuning was difficult; “ingestion jobs would fail because of whatever the new Postgres error was,” explained Ben. Data was either missing or delayed.
These performance issues had a ripple effect—data was not accessible because it was often slow or broken. Users were unable to run queries because resources were being consumed by upstream ETL jobs for several hours every morning. New and advanced analytics use cases were impossible to create on top of the existing warehouse because Postgres couldn’t handle reprocessing large-scale data aggregations.
When the data team needed to address incoming requests or improve models, only a small subset of the team had the skills necessary to deploy changes in a timely manner. Ben and his team couldn’t keep up with the number of data requests, and they wanted to start using data for new use cases that could unlock more growth for the entire company. The data stack became a bottleneck, and the team needed to move quicker as the company scaled.
With these challenges in mind, Ben decided to implement Snowflake, dbt Cloud, and Metaplane to scale the analytics capabilities and create a new team culture, all without adding undue complexity or cost.
Solution: Scaling analytics capabilities with Snowflake, dbt Cloud, and Metaplane
How Snowflake improved performance, made data more accessible, and improved engineering efficiency
When Ben and his team migrated from Postgres to Snowflake, new advanced analytics use cases were immediately possible. For example, the recommendation engine for online ordering platforms is powered by large-scale pre-aggregated data augmented from multiple sources. Whereas Postgres could not support these types of aggregations, Snowflake handled them with ease thanks to the scaling capabilities provided by the separation of storage and compute.
“With the scale of data and infrastructure we had, we couldn’t even begin to solve these problems using Postgres. We knew we had to upgrade to a data cloud like Snowflake.”
Other work that used to take up much of the team’s time, like tuning Postgres instance sizes, re-indexing data, or designing physical table structures, simply went away with the power of the Snowflake data cloud.
Snowflake’s easy-to-use integrations with tools like Snowpipe made ETL’ing and operationalizing data fast and simple. It wasn’t necessary to build out complex pipelines that needed to be maintained and scaled. For example, the SpotOn data team used Snowpipe to ingest a weather data set from OpenWeather to augment order data. They were also able to seamlessly integrate with their product analytics platform, Heap, and use data sharing to repurpose data without needing to build new pipelines.
“There are pretty expansive, geospatial cross-joins that we need to do on a regular basis that Postgres would never be able to handle,” explained Ben.
Snowflake also became a key part of data integration after SpotOn acquired Appetize, an existing Snowflake customer. The data team could securely and instantaneously share data between SpotOn’s and Appetize’s Snowflake instances.
Migrating from Postgres to Snowflake was a game-changer for not only the data team, but the entire company. The performance improvements led to faster reporting load times and more advanced analytics use cases. Next, Ben turned to building on top of their new warehouse with dbt Cloud to make this data even more accessible and improve engineering workflows.
How dbt Cloud made data accessible and improved engineering contribution by 8.5x
When Ben joined SpotOn, there were only two engineers who could consistently contribute to their ETL project. Adding data sources took months, data models were built in siloes, and the testing and deployment process was painful. Their engineering workflows were centered around custom Python jobs in Airflow which required advanced engineering skills.
If data wasn’t already in Postgres, the team needed to update Python scripts to ingest this new data. The Airflow instance required testing and deployment, and if historical data was needed, the team had to backfill Postgres—which could take days or a week depending on the size of the dataset. Only then would they start the modeling process…which meant another Airflow deployment. After all of this, there was no guarantee that the data was modeled in the way the team needed. If a model was delivered, and then an additional piece of data was needed, the entire process would start over again.
The cumbersome workflow limited the team’s velocity and capacity. The only way to scale was to hire more people, which was not a reasonable path forward.
“Before dbt, only two engineers could contribute to modeling. Bringing in dbt Cloud changed all of that for us. The number of contributors has grown by 750% and there is no limit to scale.”
After Ben and his team adopted Snowflake to improve performance and scalability, they migrated all existing models to dbt Cloud: financial, operational, sales, and product analytics data were all transformed using dbt. With their core business logic in dbt, the data team’s workflow became much easier due to greater collaboration and accessibility.
“dbt Cloud enables more people to build models, self-serve, and feel empowered. That translates to people being more engaged all around.”
After SpotOn implemented dbt Cloud, creating data models went from taking days to hours. dbt quickly became the workhorse that powered the entire SpotOn warehouse, internal BI, and analytics. Ben and his team also operationalized the data to be used in the SpotOn product. By using dbt to pre-aggregate data, the product teams can power merchant-facing reporting as well as a recommendation engine. For example, they can augment ordering data with weather data to make better recommendations in the online ordering platform.
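As a rough illustration of the pre-aggregation described above, the sketch below joins order rows to a weather condition per store and day, then sums quantities per condition and item. In a dbt project this logic would normally live in a SQL model; Python is used here only to keep the example self-contained. The field names, sample rows, and the helpers `aggregate_by_weather` and `top_item` are hypothetical and are not taken from SpotOn's project.

```python
from collections import defaultdict

# Hypothetical sample rows; the field names are assumptions for illustration.
orders = [
    {"store_id": "s1", "date": "2023-01-05", "item": "soup", "qty": 3},
    {"store_id": "s1", "date": "2023-01-05", "item": "salad", "qty": 1},
    {"store_id": "s1", "date": "2023-01-06", "item": "salad", "qty": 4},
    {"store_id": "s1", "date": "2023-01-06", "item": "soup", "qty": 1},
]
weather = {
    ("s1", "2023-01-05"): "cold",
    ("s1", "2023-01-06"): "warm",
}


def aggregate_by_weather(orders, weather):
    """Join each order to its store/day weather and sum quantity per (condition, item)."""
    totals = defaultdict(int)
    for row in orders:
        condition = weather.get((row["store_id"], row["date"]))
        if condition is None:
            continue  # skip orders with no matching weather record
        totals[(condition, row["item"])] += row["qty"]
    return dict(totals)


def top_item(totals, condition):
    """Return the best-selling item for a weather condition, or None if unseen."""
    candidates = {item: qty for (cond, item), qty in totals.items() if cond == condition}
    return max(candidates, key=candidates.get) if candidates else None


totals = aggregate_by_weather(orders, weather)
print(top_item(totals, "cold"))  # soup
print(top_item(totals, "warm"))  # salad
```

A recommendation feature could read from a pre-aggregated table like `totals` instead of scanning raw orders on every request, which is the main reason to compute it ahead of time.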
With dbt Cloud’s analytics workflow in place, SpotOn went from two to 17 contributors to their ETL code. Product teams, engineers, and finance all contribute to their dbt project, including documentation.
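The "8.5x" and "750%" figures quoted in this story describe the same change from two to 17 contributors. A quick check shows how they relate: the first is a multiple, the second a percentage increase. The helper name `growth` is only illustrative.

```python
def growth(before: int, after: int) -> tuple[float, float]:
    """Return (multiple, percent_increase) for a before/after count."""
    multiple = after / before
    percent_increase = (after - before) / before * 100
    return multiple, percent_increase


multiple, percent_increase = growth(2, 17)
print(f"{multiple}x, {percent_increase:.0f}% increase")  # 8.5x, 750% increase
```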
“We use existing dbt packages to push all of our definitions directly to Snowflake and Metabase in an effort to make our documentation available where our internal stakeholders work every day. We even have some people on the finance team contributing to YAML files and building out documentation.”
Not only are more people contributing, but dbt Cloud bakes in software engineering best practices, challenging the team to learn new skills and implement testing and version control as a default.
“An analyst can work with our data pipeline and feel engaged and empowered. Stakeholders are happier and ask for help in creating new data use cases. It’s a virtuous cycle of improvement.”
How Metaplane increased trust in data and reduced time to identify issues from days to seconds
Ben and his team quickly noticed that as they improved performance and made the data more accessible, they created a data feedback loop: more capabilities allowed the organization to move more quickly, which resulted in more teammates asking for data and analytics. On the one hand, this feedback loop was driving SpotOn’s entire organization to leverage data and make more informed decisions. But with this came more attention and scrutiny of the data team’s work, and trust in data became top-of-mind for the data team.
“As stakeholders use more data and have new capabilities, they ask more from your team, and you need to move quickly. You can’t test everything yourself. The other way to grow is to hire more people that add data quality checks, but that doesn’t scale well from a cost and efficiency standpoint.”
Metaplane helped the SpotOn data team scale this feedback loop. By providing observability across their data stack, the team was able to build and retain trust; with Metaplane’s machine-learning-based testing approach and ability to automatically add hundreds of tests, they saved engineering time and always received context about potential root causes and downstream impact when data incidents arose.
For example, the data team ingests transactional data from payment processor partners. This data is critical in helping the company determine and report on key KPIs. The source systems can be legacy databases and don’t always deliver data on a regular schedule. At times, the data was incomplete or contained values outside the expected data contract. All of this fed data into reports that executives used every morning. By proactively catching data incidents like these, Metaplane helped Ben’s team get in front of any issues that would impact downstream stakeholders, helping the data team maintain trust in the data. After receiving data incident alerts, Ben and his team could pull back scheduled reports until they verified that the data was fixed after an issue.
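To make the two failure modes above concrete (incomplete loads and values outside the expected contract), here is a minimal sketch of the kind of checks involved. It is not Metaplane's method: the z-score rule, the threshold, the field names `status` and `amount`, and both function names are assumptions chosen for illustration.

```python
from statistics import mean, stdev


def is_volume_anomaly(history, today, threshold=3.0):
    """Flag today's row count if it is more than `threshold` standard
    deviations away from the mean of recent daily counts."""
    if len(history) < 2:
        return False  # not enough history to estimate spread
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return today != mu
    return abs(today - mu) / sigma > threshold


def contract_violations(rows, allowed_statuses, max_amount):
    """Return rows whose values fall outside a simple, hypothetical data contract."""
    return [
        r for r in rows
        if r["status"] not in allowed_statuses or not (0 <= r["amount"] <= max_amount)
    ]


daily_row_counts = [1000, 1020, 980, 1010, 990]  # made-up history
print(is_volume_anomaly(daily_row_counts, 400))   # True: likely an incomplete load
print(is_volume_anomaly(daily_row_counts, 1005))  # False: within normal range

rows = [
    {"status": "settled", "amount": 10.0},
    {"status": "unknown", "amount": 5.0},
    {"status": "settled", "amount": -1.0},
]
bad_rows = contract_violations(rows, {"settled", "pending"}, max_amount=10_000)
print(len(bad_rows))  # 2
```

If either check fires, a scheduled report could be held back until the data is verified, mirroring the workflow described above.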
“We were always behind the 8 ball in terms of communicating with the executive team when there was an issue. We were starting to lose trust and they weren’t going to use the reports. If they can’t use our data, that’s bad not only for our team but also for our business. Metaplane helped us get in front of those issues.”
Data quality issues don’t just impact executive reporting. SpotOn is core to their customers’ businesses because they process all of their transactions. When SpotOn experiences data quality issues, it affects their customers and potentially impacts how they earn money and serve their own customers. Being the first to know about data quality issues allows the data team to quickly triage and fix issues, preventing them from ever impacting downstream customers.
Ben’s team went from chasing down data bugs and data anomalies to proactively finding out about them and spending more time actually fixing the issues. Time-to-identify data quality issues went from hours or days to seconds.
“Metaplane is key to preserving trust in our data. You can spend all your time moving to this great modern stack, but if you lose people’s trust and they won’t use it, that work is for nothing.”
- After migrating from Postgres to Snowflake, Ben and his team no longer need to plan database changes like tuning and migration, nor do they have to build complex ETL pipelines for product analytics and clickstream data, saving the team at least 10 hours every week.
- With dbt Cloud, the SpotOn data team increased engineering contribution by 750% (from two to 17 contributors), and building models went from taking days or weeks to hours.
- After SpotOn adopted Metaplane, the time to identify data quality problems dropped from hours or days to seconds. The organization went from mistrusting their data to asking for more data capabilities to power advanced analytics use cases.
- With Snowflake, dbt Cloud, and Metaplane, SpotOn quickly integrated with Appetize’s data stack post-acquisition. Zero-second latency data sharing between databases, a shared skillset of dbt, built-in documentation, and guaranteed data quality helped the teams work together.
What’s Next For SpotOn
Over the next year, Ben and his team will continue to invest in Snowflake, dbt Cloud, and Metaplane.
To create more consistency, the data team plans on migrating some legacy pipelines that ingest and cleanse important analytical and operational data. In addition to this migration, Ben and his team will be looking into ways to continue to use dbt to power modeling and decrease latency so the data can be provided closer to real-time.
In an effort to reduce the number of data quality issues introduced by code changes, the SpotOn data team is adopting Metaplane’s CI/CD tooling to automate impact analysis and data test previews.
Lastly, the company’s organic growth and the Appetize acquisition mean a large number of new team members need to be incorporated into the larger data culture. Ben is focused on leveraging the dbt semantic layer to help bridge this gap and make analytical data more accessible and consistent across the company.