Optimize Costs with dbt Cloud's Defer to Production Feature

How to develop your data pipelines faster and save on storage and compute costs using dbt’s Defer to Production feature.

Are you being asked to do more with less when it comes to your data pipelines? (Join the club.) 

Cost optimization can cut across multiple dimensions. It can mean simply spending less on computing and storage costs. But it can also mean reducing operational overhead and gaining more time to focus on higher-value work. It can also mean reducing the complexity of your data pipelines, making them easier to understand and less error-prone. 

dbt Cloud’s Defer to Production feature helps save time and reduce costs across all three dimensions. In this article, I’ll discuss how Defer to Production works, how to use it, and how it saves your dev team both time and money. 

Saving data costs 

In general, there are three ways to reduce data costs: 

Champion cost-efficient data development. You can use tools that enable your teams to produce clean, optimized code that reduces duplication. Features like dbt Cloud’s Semantic Layer enable you to define standardized business metrics in one place, while dbt Explorer accelerates troubleshooting and reduces the time required to perform root cause analysis. 

Reduce data platform complexity and cost. Reducing computing time can save significantly on data pipeline development costs. dbt Cloud helps reduce costs by canceling stale or duplicative CI builds, intelligently running select jobs in parallel, and providing timing statistics to enable identifying and refactoring long-running jobs. 

Spend more time on high-value work. The easier you can make data pipeline development, the faster your team can produce new data products with significant business impact. dbt Cloud’s intuitive IDE means even non-data engineers can troubleshoot data pipeline issues and contribute to projects. Features like rich documentation generation mean users can self-service answers to their most burning questions about the data model. 

Defer to Production is one of several features in dbt Cloud that reduce overall costs in the data workflow. Let’s see in detail how it accomplishes this. 

Overview of Defer to Production 

Before diving into Defer to Production, let’s discuss the problem it’s intended to solve. 

Your dbt models connect to one another to form a Directed Acyclic Graph (DAG) that represents your full data model. Models that your current model references are considered upstream of it. 

You can build your models in different environments that represent the different stages in your development process. For example, you may have separate dev, staging, and production environments. You’ll typically make changes in a personal, dedicated dev environment so that your work doesn’t interfere with that of other team members. 

Let’s say you need to change your customers model by adding a new field. Your customers model references an upstream model, stg_customers, that represents customer data imported from another system. You would express this in dbt using the {{ ref() }} function: 

-- ref() tells dbt that customers depends on stg_customers
with customers as (
  select * from {{ ref('stg_customers') }}
)
select * from customers

Assume you’ve added the new field to customers and want to test it. To do that, dbt needs to build and run customers so it has these changes and the latest data. But it also needs the latest data for all the other models upstream of it. 

In other words, dbt has to rebuild not just the customers model but also the stg_customers model, along with any models that stg_customers references. If the chain of upstream models connected to the customers model is, let’s say, 12 models long, you’ll need to wait for all 12 to rebuild before testing this minor change. And you’ll need to wait again every time you revise your change and re-test it. 
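To make the cost of that chain concrete, here is a minimal Python sketch of the dependency walk. The graph and every model name other than customers and stg_customers are hypothetical, and this is only an illustration, not how dbt itself stores or traverses the DAG.

```python
# Hypothetical DAG: each model maps to the models it references via ref().
# Only customers and stg_customers come from the article; the rest are made up.
UPSTREAM = {
    "customers": ["stg_customers", "stg_orders"],
    "stg_customers": ["base_customers"],
    "stg_orders": ["base_orders"],
    "base_customers": [],
    "base_orders": [],
}


def upstream_models(model, graph):
    """Return every model reachable by following references from `model`."""
    seen = set()
    stack = list(graph[model])
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(graph[current])
    return seen


def models_to_build(model, graph, defer):
    """List what must be built in dev to test a change to `model`.

    Simplified assumption: with deferral only the selected model is built,
    without it the selected model and all of its ancestors are built.
    """
    if defer:
        return {model}
    return {model} | upstream_models(model, graph)
```

With this toy graph, testing customers without deferral touches five models, while a deferred build touches only customers.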

This is where Defer to Production comes in. By default, dbt builds your change against the current environment, so every upstream model has to exist there too. With Defer to Production, you rebuild only the model you’ve selected. For the other models, dbt will use the definitions and the latest data from your last successful production run. 

Using Defer to Production is simple. First, you toggle Defer to production on in the dbt Cloud UI. Then, you edit a single model and run it. dbt Cloud will automatically obtain the data for the non-edited models from the production environment instead. 
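One way to picture what the toggle changes is to look at where each reference ends up pointing. The Python sketch below is a deliberately simplified model of that idea: the Environment class, the resolve_ref helper, and the database and schema names are all hypothetical, and dbt’s real resolution rules cover more cases than this.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Environment:
    """A hypothetical target location for built models."""
    database: str
    schema: str


def resolve_ref(model, selected, dev, deferred):
    """Return the relation name a reference to `model` would point at.

    Simplified assumption: models selected for this run resolve to the
    developer's own environment; every other model resolves to the
    deferred (for example, production) environment.
    """
    env = dev if model in selected else deferred
    return f"{env.database}.{env.schema}.{model}"


# Hypothetical environments: a personal dev schema and a production schema.
dev = Environment(database="analytics", schema="dbt_dev_user")
prod = Environment(database="analytics", schema="prod")

# Only customers is being edited and run.
selected = {"customers"}
edited_relation = resolve_ref("customers", selected, dev, prod)
upstream_relation = resolve_ref("stg_customers", selected, dev, prod)
```

Here edited_relation lands in the dev schema, while upstream_relation points at the production copy of stg_customers, so nothing upstream has to be rebuilt in dev.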

Defer to Production offers additional flexibility beyond this. For example, you can configure it to use an environment for deferrals that differs from your production environment. This might be, for example, a pre-prod environment that doesn’t contain sensitive customer data. 

How Defer to Production reduces costs and frees up time

Defer to Production creates cost efficiencies in several different ways: 

It reduces model builds in dev. Without Defer to Production, you would need to rebuild all 12 models upstream of our customers model every time you tested a change. By using Defer to Production, you can eliminate that overhead. 

This speeds up development even in small schemas. It can lead to drastic performance increases when a model has dozens or hundreds of upstream dependencies. That’s time developers can use productively instead of waiting for a build job to finish. 

It reduces build times for extremely active developers. For devs who are regularly adding new objects, rebuilds can be painful. That’s because adding new objects involves rebuilding the entire model and any related models. With Defer to Production, you eliminate the need to rebuild the world, resulting in much shorter dev cycles. 

It removes the need to import lots of data into the dev environment. Testing your model changes requires data. That means you have to source data in your dev environments from somewhere. That takes time to set up, and more time for each dev to import into their own dev environment. 

With Defer to Production, however, there’s no need to load data outside of the current model you’re editing and testing. You can use production or another environment’s data instead. For example, you could have a single pre-prod environment loaded with mock data. Every team member could use Defer to Production to refer to this master data set during development instead of importing their own copies.  

This also saves you on storage. Because you only need to keep a single data set, you can eliminate the costs associated with storing n copies of your dev data. 
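The storage arithmetic is simple enough to sketch in a few lines of Python. The dataset size and team size below are hypothetical figures chosen only for illustration, and the helper ignores the small amount of storage used by the models each developer actually builds.

```python
def dev_storage_gb(dataset_gb, developers, shared):
    """Estimate dev data storage in GB.

    If `shared` is True, the team defers to one common data set;
    otherwise every developer keeps a private copy.
    """
    copies = 1 if shared else developers
    return dataset_gb * copies


# Hypothetical team: 8 developers working against a 50 GB dev data set.
per_developer_copies = dev_storage_gb(50, developers=8, shared=False)
single_shared_copy = dev_storage_gb(50, developers=8, shared=True)
```

Under these assumed numbers, private copies add up to 400 GB, while a single shared data set stays at 50 GB regardless of how many developers join.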

It reduces onboarding time. Finally, Defer to Production speeds up development for new arrivals to the team. Instead of importing test data and building the entire DAG from scratch, a new data engineer can use Defer to Production to run changes on a model immediately. 

Defer to Production is one of many dbt Cloud features that reduce computing and storage costs, reduce data development complexity, and save data developers countless hours with reduced build times.

New to dbt Cloud? Sign up for a free account and try it out for yourself.

Last modified on: Jan 11, 2024