Reverse ETL is the process of getting your transformed data from your data warehouse into end business platforms, such as sales CRMs and ad platforms. Once in an end platform, that data is often used to drive meaningful business actions, such as creating custom audiences in ad platforms, personalizing email campaigns, or supplementing data in a sales CRM. You may also hear reverse ETL referred to as operational analytics or data activation.
Instead of only exposing data in a business intelligence (BI) tool, reverse ETL pipelines allow data to be synced down to ad platforms (Facebook, LinkedIn), sales CRMs (Salesforce), customer service platforms (Zendesk, Kustomer), and other destinations. There are a variety of platforms, such as Hightouch, RudderStack, and Census, that help data teams harness the power of reverse ETL.
In this post, we’ll look at the practical advantages, use cases, and methods for implementing reverse ETL.
How reverse ETL works
In the reverse ETL process, transformed data is synced from a data warehouse to external tools in order to be leveraged by different business teams.
The power of reverse ETL comes from sending already transformed data down to business platforms. Raw data, while beautiful in its own way, typically lacks the structure, aggregations, and aliasing needed to be useful to end business users right off the bat.
After data teams transform data for business use in ELT pipelines, typically to expose in an end business intelligence (BI) tool, they can also send this cleaned and meaningful data to other platforms where business users can derive value using reverse ETL tools.
Data teams can choose to write additional transformations that may need to happen for end business tools in reverse ETL tools themselves or by creating additional models in dbt.
Why use reverse ETL?
There are a few reasons why your team may want to consider using reverse ETL:
Putting data where the work is happening
While most data teams would love it if business users spent a significant portion of their time in their BI tool, that’s neither practical nor necessarily the most efficient use of their time.
In the real world, many business users will spend some time in a BI tool, identify data that could be useful in a platform where they spend a significant amount of time, and work with the data team to get that data where they need it. Users feel comfortable and confident in the systems they use every day, so why not put the data in the places that allow them to thrive?
Manipulating data to fit end platform requirements
Reverse ETL helps you to put data your business users need in the format their end tool expects. Oftentimes, end platforms expect data fields to be named or cast in a certain way.
Business users shouldn’t have to worry about manually entering fields that can easily change; instead, that data should be populated by our version-controlled, QA’d, and peer-reviewed dbt models. To do this, we leverage reverse ETL pipelines to create automated workflows that send data down to a variety of end business tools, which we specify as exposures in dbt.
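As an illustrative sketch (the mart name, field names, and casts below are hypothetical, not a prescribed schema), this kind of reshaping can live in a plain dbt export model:

```sql
-- Hypothetical export model: rename and cast fields from a dbt mart into
-- the shape an email platform expects. Adjust names and types to match
-- your destination's requirements.
select
    customer_id as external_id,                 -- destination expects its own id field name
    lower(email) as email_address,              -- normalize casing for identity matching
    cast(signed_up_at as date) as signup_date,  -- destination wants a date, not a timestamp
    round(customer_lifetime_value, 2) as lifetime_value
from {{ ref('dim_customers') }}
```

Because this mapping is itself a dbt model, it stays version-controlled and reviewable instead of living only in a reverse ETL tool’s UI.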
Supporting self-service efforts
By sending down data-team approved data in reverse ETL pipelines, your business users have the flexibility to use that data however they see fit. Soon, your business users will be making audiences, testing personalization efforts, and running their end platform like a well-oiled, data-powered machine.
Reverse ETL use cases
Just as there are almost endless opportunities with data, there are many potential different use cases for reverse ETL. We won’t go into every possible option, but we’ll cover some of the common use cases that exist for reverse ETL efforts.
Personalization
Reverse ETL allows business users to access data that they normally would only have access to in a BI tool in the platforms they use every day. As a result, business users can now use this data to personalize how they create ads, send emails, and communicate with customers.
Personalization was all the hype a few years ago, and now you rarely see an email arrive in your inbox without some sort of personalization in place. Data teams using reverse ETL are able to pass down important customer information, such as location, customer lifetime value (CLV), tenure, and other fields, that can be used to create personalized emails, establish appropriate messaging, and segment email flows. All we can say is that the possibilities for personalization powered by reverse ETL are endless.
Sophisticated paid marketing initiatives
At the end of the day, businesses want to serve the right ads to the right people (and at the right cost). A common use case for reverse ETL is for teams to use their customer data to create audiences in ad platforms to either serve specific audiences or create lookalikes.
While ad platforms have gotten increasingly sophisticated at identifying high-value audiences with their algorithms, it rarely hurts to supplement those audiences with your own data to create more refined audiences or lookalikes.
Self-service analytics culture
We hinted at it earlier, but reverse ETL efforts can be an effective way to promote a self-service analytics culture. When data teams put the data where business users need it, business users can confidently access it on their own, driving even faster insights and action.
Instead of requesting a data pull from a data team member, they can find the data they need directly within the platform that they use. Reverse ETL allows business users to act on metrics that have already been built out and validated by data teams without creating ad-hoc requests.
"Real-time" data
We would be remiss if we didn’t mention reverse ETL and the notion of “real-time” data. While you can have the debate over the meaningfulness and true value-add of real-time data another time, reverse ETL can be a mechanism to bring data to end business platforms in a more “real-time” way.
Data teams can set up syncs in reverse ETL tools at higher cadences, allowing business users to have the data they need, faster. Obviously, there’s some cost-benefit analysis on how often you want to be running syncs via reverse ETL tools and hitting your data warehouse, but reverse ETL can help move data into external tools at a quicker cadence if deemed necessary.
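As a purely illustrative sketch (these keys and cron values are hypothetical and not any particular tool’s real configuration), the cadences stakeholders agree on often end up as staggered schedules, so each stage runs after the one before it has finished:

```yaml
# Hypothetical cadence plan: stagger each stage behind the previous one
ingestion_schedule: "0 * * * *"    # load raw data at the top of every hour
dbt_cloud_job: "15 * * * *"        # transform at quarter past, once raw data has landed
reverse_etl_sync: "30 * * * *"     # sync export models downstream at half past
```

Tightening any of these intervals is where the cost-benefit analysis above comes in: more frequent syncs mean fresher data, but also more warehouse queries.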
All this to say: move with caution in the realm of “real-time”, understand your stakeholders’ wants and decision-making process for real-time data, and work towards a solution that’s both practical and impactful.
Reverse ETL tools
Reverse ETL tools typically establish the connection between your data warehouse and end business tools, offer an interface to create additional transformations or audiences, and support automation of downstream syncs. Platforms such as Hightouch, Census, and RudderStack are examples of tools that support reverse ETL pipelines.
The dbt approach to reverse ETL
We tackle reverse ETL projects with the following step-based approach:
- Understand stakeholder expectations: What’s the problem we’re solving for here?
- Identify the data models and fields needed to be pushed to the tools
- Transformation time! Create export models to power reverse ETL pipelines
- QA it up, create syncs in our reverse ETL tool, and involve stakeholders for training
- Set up dbt exposures for governance and documentation
- Establish alerts for reverse ETL pipelines to provide transparency to end business users
Below, we’ll walk through these steps in greater detail using a user segmentation example with the jaffle shop, a sample dataset and dbt project.
Before we jump into that, a few prerequisites you may need to create and manage reverse ETL pipelines are:
- Data stored in a data warehouse
- A reverse ETL tool to perform syncs from your data warehouse to your end platforms
- An end platform that you’re looking to push data to, such as a sales CRM or ad platform
Step 1: Understand stakeholder expectations
We summarize stakeholder needs by determining how often data ingestion, dbt Cloud jobs, and reverse ETL syncs should occur. This ensures alignment between business goals and technical capabilities for timely, actionable insights. Our approach blends stakeholder input with expert analytics intuition to define realistic SLAs.
Step 2: Map out entities & fields
Now we’re getting into the fun stuff! At this stage, we need to understand the data and dbt models needed to build export models. We do two forms of planning: entity matching and field identification.
Entity matching
It’s important to understand how the grains differ between our existing dbt models and the end business tools’ system. For example, how does a Salesforce contact match or map to a dbt Cloud account? This is the type of question we solve in this stage of planning.
Field identification
At this stage, we need to determine whether all the fields are already modeled out or whether we need to create new fields for the project. Most of the data business users want exported is derived from our fct_ and dim_ models. Oftentimes, the fields stakeholders want in their downstream tool already exist in our data models, but may need to be cast, renamed, or grouped differently.
Step 3: Create export models
Just as we have marts, staging, and metrics directories for our dbt project, we create a separate models/exports directory for all of our data models we want to sync to business tools. For every existing data model we want to sync out to different destinations, we have one export model.
For example, if we have a customers model like in jaffle shop that we want to be synced down to multiple destinations, we would create a single export_customers.sql model.
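Under this convention, the relevant slice of the project tree might look something like this (exact model names will vary by project):

```
models/
├── marts/
│   ├── customers.sql
│   └── orders.sql
└── exports/
    ├── export_customers.sql
    └── exposure_export.yml
```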
The primary goal of these export models is to supplement the existing data and modeling in our fct_ and dim_ models with whatever stakeholders require, in the structure the destination tool expects. I just said a lot of words, so let’s break down what I really mean by this.
Our fct_ and dim_ tables stored in our models/marts directory form the backbone of analytics at dbt Labs. An immense amount of work goes into making our fct_ and dim_ tables what they are:
- They sit close to the end of the DAG
- They have undergone considerable transformation
- They have likely been through entity joins to reach their final shape
- They are true and proven: these models have undergone immense testing and peer review to ensure the data they hold is accurate and usable
Our ultimate belief: Export models shouldn’t be doing the heavy lifting that happens for fact and dimensional models.
Step 4: QA, set up reverse ETL sync, & train stakeholders
We conduct thorough quality checks on export models, then sync data to a sandbox environment and validate with stakeholders. This ensures data accuracy, usability, and stakeholder confidence before affecting production. dbt emphasizes stakeholder training to empower teams with self-service capabilities and clarity on data generation.
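As one sketch of what that QA can look like in dbt itself (the column names mirror the jaffle shop example used throughout this post; the specific tests chosen are illustrative), a properties file can assert basic expectations on the export model before any sync is pointed at it:

```yaml
version: 2

models:
  - name: export_customers
    columns:
      - name: customer_id
        tests:
          - unique     # one row per customer, matching the destination's expectations
          - not_null
      - name: buyer_type
        tests:
          - accepted_values:
              values: ['super_purchaser', 'regular_purchaser', 'non_purchaser']
```

Running these tests in the same dbt Cloud job that builds the export model means a failing check can halt the pipeline before bad data reaches the destination.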
Step 5: Create exposures
Just as we would for key dashboards in our BI tool, we create dbt exposures for our reverse ETL syncs. Exposures are a great way to provide ownership, identify dependencies, and context into reverse ETL projects. They also allow us to visualize all of our downstream dependencies in the DAG.
We create a separate exposure_export.yml file in the models/exports directory where we store exposure information specifically for all of our export models. Below is an example of how we would document the exposure for the export_customers model we created for the jaffle shop in Step 3.
version: 2

exposures:
  - name: email_crm_export
    type: application
    maturity: medium
    url: <reverse ETL tool sync URL>
    description: '{{ doc("email_crm_export") }}'
    depends_on:
      - ref('export_customers')
    owner:
      name: brandon t.
      email: b******@dbtlabs.com
Step 6: Set up alerts in reverse ETL tool
We establish alerts for sync failures caused by upstream changes, destination outages, or tool issues, notifying stakeholders promptly. This maintains transparency and ensures timely resolution of data pipeline disruptions. We integrate these alerts with Slack channels for effective communication and resolution tracking.
Example: mapping out entities in customer segmentation
As we mentioned earlier, a common use case for reverse ETL is to send customer information down to email platforms to support better customer segmentation. For this scenario, we can use the jaffle shop’s customer table to create a new export model. While we’re using the jaffle shop here as an example, it represents a similar flow to what we do internally for syncing to our email platform.
The table
The customers table here has seven columns: customer id, first name, last name, first order date, most recent order date, total number of orders, and customer lifetime value (CLV).
All of these fields could potentially be used by business users in an email platform. Maybe they want to pass a customer’s number of orders into the email subject line to make emails more personalized, or create different customer segments based on CLV for a loyalty program.
As a result, we’re going to make sure we pass down all possible fields. In addition, the marketing team requested a few features about this export model:
- The marketing team wants to create a new segment, buyer_type, based on the number of orders a customer has placed
- The team is interested in having the most recent order date and first order date extracted out to the month-level so that they can contact new, bounced, and longtime loyal customers with different messaging
The code
Use the following code to create an export_customers.sql model that lets your friends on the marketing team go wild and free with email segmentation.
with import_customers as (

    select * from {{ ref('customers') }}

),

segment_customers as (

    select
        customer_id,
        first_name,
        last_name,
        date_trunc('month', first_order_date) as first_order_month,
        date_trunc('month', most_recent_order_date) as most_recent_order_month,
        number_of_orders,
        customer_lifetime_value,
        case
            when number_of_orders >= 3 then 'super_purchaser'
            when number_of_orders >= 1 and number_of_orders < 3 then 'regular_purchaser'
            else 'non_purchaser'
        end as buyer_type
    from import_customers

)

select * from segment_customers
As you can see from this example, there’s no fancy or mind-boggling data modeling happening here: all we did was truncate some dates, alias a few fields, and add a simple case statement.
Further reading
If you’re interested in learning more about reverse ETL and the impact it could have on your team, check out the following:
Last modified on: Jan 17, 2025