impact.com

impact.com scales a $100B+ data platform with dbt to power decision-making and AI readiness

2–3 days/month: engineering time reclaimed from pipeline maintenance
2x output, zero new hires: more models shipped without adding headcount
From 1-2 days to 1 hour: time to troubleshoot and find the right data

Before dbt, finding the right data could take a day or two of hunting around, but it now takes about an hour. When you’re working across more than $100B in partnership-driven commerce, that makes a real difference.

Paul Kotze, Head of Advanced Analytics

Operating a global partnership platform at massive scale

impact.com is a global partnership management platform, powering more than one million active partnerships and processing over $100B in partnership-driven commerce each year. The platform connects brands, publishers, and content creators across affiliate, influencer, mobile, B2B, and referral programs – with data at the center of how the business operates.

Internally, a central Data and Analytics Group based in Cape Town leads the company’s analytics function, reflecting its role as a growing hub for data and engineering talent. The group supports every value stream across the business, from finance and marketing to customer success and product.

A data platform that couldn't keep pace with the business

As impact.com scaled, so did the complexity and importance of its data. The organization needed a data architecture that could keep pace for reporting, day-to-day decision-making, and a growing set of AI initiatives.

Earlier data models had been built in Databricks notebooks with manually managed dependencies. This approach was difficult to maintain and increasingly fragile as the system expanded.

Their data team began self-hosting dbt Core. This brought structure and standards, but introduced new challenges as the team scaled.

Most analytics engineers came from SQL and BI backgrounds, not software engineering. Local environments made lineage opaque and modeling practices inconsistent — and there was no easy way to onboard someone new without weeks of ramp-up.

At the same time, operational overhead continued to slow progress. Self-hosting dbt Core meant orchestration lived in Jenkins, not dbt. Troubleshooting a failed job meant navigating multiple systems. Deployments required other teams, so analytics engineers could not operate independently. Together, these constraints slowed development, limited the team’s ability to iterate, and delayed the delivery of insights.

Data reliability also suffered. Without systematic testing or monitoring, issues often went unnoticed until stakeholders flagged them, sometimes during critical quarterly business review cycles.

At scale, this created a deeper problem. Teams did not always trust the data, and time was spent reconciling numbers instead of using them to make decisions.

From tickets and workarounds to self-service and trusted data

impact.com transitioned to dbt platform to improve how its data platform was built and operated, with a clear focus on developer experience and team enablement.

“The main problem we were trying to solve was having a team who could write SQL and understood the business, but didn’t yet have the fundamentals of building a data warehouse - dbt helped us close that gap,” explained Kotze.

One of the most immediate changes was improved visibility and discoverability: lineage, documentation, and dependencies are all accessible within dbt platform. Analytics engineers could finally see how their models fit together, and monitor and troubleshoot their jobs without filing a ticket, waiting on another team, or relying on external orchestration tools. Most importantly, the time taken to locate data fell from ~2 days to around an hour, supporting faster decisions across the organization.
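To make the idea of lineage concrete, here is a minimal Python sketch of a model dependency graph. It is an illustration only, not impact.com's project or dbt's internals: the model names and the `upstream` helper are hypothetical. It shows how declaring each model's parents is enough to derive a valid build order and to trace everything a report depends on.

```python
from graphlib import TopologicalSorter

# Hypothetical model names; each model maps to the models it depends on.
dependencies = {
    "stg_orders": set(),
    "stg_partners": set(),
    "fct_partner_revenue": {"stg_orders", "stg_partners"},
    "rpt_quarterly_review": {"fct_partner_revenue"},
}

# A valid build order places every model after all of its parents.
build_order = list(TopologicalSorter(dependencies).static_order())


def upstream(model, deps):
    """Return every model that `model` depends on, directly or indirectly."""
    found = set()
    stack = list(deps[model])
    while stack:
        parent = stack.pop()
        if parent not in found:
            found.add(parent)
            stack.extend(deps[parent])
    return found


print(build_order)
print(upstream("rpt_quarterly_review", dependencies))
```

When a report looks wrong, walking its upstream set narrows the search to the models that could have caused the problem.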

Another major shift was improved data quality. Previously, data quality issues stayed invisible until a stakeholder noticed them, and there was no systematic way to catch problems before they reached the business. That changed with a comprehensive testing framework spanning over 2,800 tests across the warehouse, covering freshness, uniqueness, and integrity. The team now catches issues before they reach stakeholders, and spends far less time firefighting and more time building.
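The three test categories named above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not impact.com's actual tests: the helper functions, table names, and sample rows are all hypothetical. In a dbt project, checks like these would typically be declared as tests on models and sources rather than hand-written.

```python
from datetime import datetime, timedelta, timezone


def check_unique(rows, key):
    """Uniqueness: return the key values that appear more than once."""
    seen, duplicates = set(), set()
    for row in rows:
        value = row[key]
        if value in seen:
            duplicates.add(value)
        seen.add(value)
    return duplicates


def check_relationship(child_rows, fk, parent_rows, pk):
    """Integrity: return foreign-key values with no matching parent row."""
    parent_keys = {row[pk] for row in parent_rows}
    return {row[fk] for row in child_rows if row[fk] not in parent_keys}


def check_fresh(rows, ts_field, max_age, now):
    """Freshness: True if the newest record is no older than max_age."""
    newest = max(row[ts_field] for row in rows)
    return now - newest <= max_age


# Hypothetical sample data with one duplicate key and one orphaned row.
now = datetime(2025, 1, 2, 12, 0, tzinfo=timezone.utc)
partners = [{"partner_id": 1}, {"partner_id": 2}]
orders = [
    {"order_id": 10, "partner_id": 1, "loaded_at": now - timedelta(hours=3)},
    {"order_id": 11, "partner_id": 3, "loaded_at": now - timedelta(hours=30)},
    {"order_id": 11, "partner_id": 2, "loaded_at": now - timedelta(hours=1)},
]

duplicate_ids = check_unique(orders, "order_id")
orphans = check_relationship(orders, "partner_id", partners, "partner_id")
is_fresh = check_fresh(orders, "loaded_at", timedelta(hours=24), now)
```

Here `duplicate_ids` flags order 11, `orphans` flags the reference to a missing partner 3, and `is_fresh` passes because the newest row was loaded within the 24-hour window.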

The third major shift was architectural. impact.com restructured its warehouse from a fragmented collection of reporting datasets built in Databricks notebooks with manually managed dependencies, into a layered, governed architecture that models how the business actually operates. Instead of different teams maintaining their own versions of the same logic, there is now a single source of truth used consistently across teams.

Empowering the team to build and problem-solve

The warehouse grew from ~800 to 1,700 models, without adding headcount. “We doubled the rate at which we could architect, conceptualise and deliver models,” said Kotze, adding, “We were spending two to three days a month maintaining pipelines but moving to dbt freed that time up so the team could focus on building instead. It gave the team capacity to focus on other important work and freed up their headspace, so they didn’t have to worry about running the warehouse.”

The visibility and structure included in dbt platform also changed how the team catches and resolves problems. With proactive testing and clear lineage, problems are detected before they surface in reports, reducing rework, cutting technical debt, and rebuilding trust in the numbers.

That trust has changed how the business uses data. Leadership operates from shared dashboards built on a single source of truth. Operational teams rely on data as part of their day-to-day workflows, not just occasional analysis. Ad hoc requests have dropped because insight is built directly into the platform.

What’s next

What started as a central data team initiative is now becoming infrastructure for the whole company.

Engineering teams are building customer-facing products directly on top of the warehouse. Others are experimenting with the semantic layer to support AI use cases. As more teams start to work with the data, maintaining consistent definitions becomes increasingly important across impact.com.

impact.com isn't waiting on AI readiness. They're building for it now. dbt’s Semantic Layer gives every agent and application a single governed interface into the warehouse, where metric definitions and relationships are explicit, not inferred. That means no hallucinated connections between tables and no inconsistent definitions as more teams build on top. Some product and engineering teams are already piloting it, with a broader rollout targeting the end of June 2026. Their ultimate goal is that every team, not just the data team, runs on trusted metrics.
