Overriding Schema Generation

The first dbt Live: Expert Series session is a wrap 🎉!

What is dbt Live?

The dbt Live: Expert Series is a 60-minute interactive experience where dbt Labs Solution Architects share practical advice on how to tackle the sticky problems we see come up over and over again in the field, like how to safely split environments with CI/CD or how to implement blue/green deployments. Event agendas are shaped by audience requests, so whether you’re new to dbt or just want to sharpen your skills, we’d love to see you there! Register to take part in the next live session.

Session Recap: Overriding Schema Generation

The first dbt Live session was led by Randy Pitcher, Solutions Architect at dbt Labs. Watch the full replay here.

Randy kicked things off by live coding a solution to a common problem he’s seen at many organizations: How to split dev and prod environments across databases.

Let’s start with what organizations are trying to achieve through this setup. Many organizations want to follow an environment promotion process like the one illustrated below, to make sure developers do not overwrite each other’s work in development and to put changes through a staging or QA check before merging them with production.

Breakages in production can produce incorrect or unavailable data, leading to missed market opportunities or bad decisions. Yet many organizations struggle to override dbt’s default behaviors for schema and database generation to arrive at their intended environment promotion process.

There are different ways to set this up depending on whether you’re running dbt Core (self-hosted) or dbt Cloud. But the approach Randy showed makes for a more consistent process no matter how you deploy dbt: we can use macros to modify how dbt generates database and schema names instead of relying on profiles.yml files (in dbt Core) or the environment configuration page (in dbt Cloud).

Using a macro to define this configuration is useful because it can be version controlled to monitor any changes to the script over time. This helps teams align with their company’s analytics governance processes.

Step-by-step: Walking through Randy’s approach

To get started, Randy opened dbt Cloud (you can use your IDE of choice if you’re using dbt Core) and created a new file within the macros/config folder named generate_schema_name.sql.

Begin by defining the macro and the value it will return (custom_schema_name) when it is called for your models. Naming the macro generate_schema_name overrides dbt’s built-in version of this macro.


{%- macro generate_schema_name(custom_schema_name, node) -%}

{%- endmacro -%}

Log the custom schema name the macro outputs when it runs, to help with troubleshooting. The node argument is the node dbt is currently processing, which exposes properties such as node.database and node.schema.


{%- macro generate_schema_name(custom_schema_name, node) -%}
  {{ log(node ~ '\n custom schema name: ' ~ custom_schema_name, info=True) }}
{%- endmacro -%}

Define a branch for each of your target environments, which are configured on dbt Cloud’s environment pages or in your profiles.yml file in dbt Core. For Randy’s demo, he defined default, pull request, and production targets.


{%- macro generate_schema_name(custom_schema_name, node) -%}
  {{ log(node ~ '\n custom schema name: ' ~ custom_schema_name, info=True) }}
  {% if target.name == 'default' %}

  {% elif target.name == 'pr_test' %}

  {% elif 'production' in target.name %}

  {% endif %}
{%- endmacro -%}
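
For dbt Core users, here’s a minimal sketch of what a profiles.yml with these three targets could look like, assuming a Snowflake connection; the project name, account, and credentials are all illustrative placeholders:


# profiles.yml (illustrative; values are placeholders)
my_project:
  target: default
  outputs:
    default:       # local development
      type: snowflake
      account: my_account
      user: "{{ env_var('DBT_USER') }}"
      password: "{{ env_var('DBT_PASSWORD') }}"
      role: transformer
      database: dev_db              # dev work lands in the dev database
      schema: dbt_randy_myproject   # dbt_username_projectname pattern
      warehouse: transforming
      threads: 4
    pr_test:       # CI runs for pull requests
      type: snowflake
      account: my_account
      user: "{{ env_var('DBT_CI_USER') }}"
      password: "{{ env_var('DBT_CI_PASSWORD') }}"
      role: transformer
      database: dev_db
      schema: dbt_pr_test
      warehouse: transforming
      threads: 4
    production:    # scheduled production runs
      type: snowflake
      account: my_account
      user: "{{ env_var('DBT_PROD_USER') }}"
      password: "{{ env_var('DBT_PROD_PASSWORD') }}"
      role: transformer
      database: prod_db             # prod gets its own database
      schema: analytics
      warehouse: transforming
      threads: 4

In dbt Cloud, you’d instead set the target name in each environment’s settings.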

Then, within each target.name conditional, add the schema name you want dbt to generate.


{%- macro generate_schema_name(custom_schema_name, node) -%}
  {{ log(node ~ '\n custom schema name: ' ~ custom_schema_name, info=True) }}
  {% if target.name == 'default' %}
    {{ target.schema }}{{ '_' ~ custom_schema_name if custom_schema_name else '' }}
  {% elif target.name == 'pr_test' %}
    {{ target.schema }}{{ '_' ~ custom_schema_name if custom_schema_name else '' }}
  {% elif 'production' in target.name %}
    {{ custom_schema_name if custom_schema_name else target.schema }}
  {% endif %}
{%- endmacro -%}

Once you’ve written the macro, save the file and compile your project to check for compilation errors.
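
If you’re working from the command line with dbt Core, one way to exercise the macro is to compile the project, which parses every model and should emit the log() output added above:


dbt compile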

The code above appends the custom schema name to the target schema when the target name is default or pr_test. To keep individual developers from overwriting each other’s work in dev and pull requests, all while working within one database, Randy suggests using the best practice of naming development schemas with this pattern: dbt_username_projectname.
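
To make that concrete, here is a hypothetical model that sets a custom schema; the model and schema names are made up for illustration:


-- models/marketing/stg_campaigns.sql (illustrative)
{{ config(schema='marketing') }}

select 1 as campaign_id

With target.schema set to dbt_randy_myproject, the default target builds this model in dbt_randy_myproject_marketing (and pr_test in dbt_pr_test_marketing), while a production target builds it in marketing. A model with no custom schema falls back to target.schema in every environment.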

Now you’ve overridden the default generate_schema_name macro, and you can customize it further for your own processes. You can revisit these steps and hear Randy’s commentary by watching the replay of the session below.

If you followed along with the video and were wondering if a copy of Randy’s repo would become available, we’ll save you some clicks by sharing his generate_schema_name script with you:


{%- macro generate_schema_name(custom_schema_name, node) -%}
  {# log each node and its custom schema name to ease troubleshooting #}
  {{ log(node ~ '\n custom schema name: ' ~ custom_schema_name, info=True) }}
  {% if target.name == 'default' %}
    {# development: prefix the custom schema with the developer's target schema #}
    {{ target.schema }}{{ '_' ~ custom_schema_name if custom_schema_name else '' }}
  {% elif target.name == 'pr_test' %}
    {# CI runs for pull requests: same prefixing, isolated per target schema #}
    {{ target.schema }}{{ '_' ~ custom_schema_name if custom_schema_name else '' }}
  {% elif 'production' in target.name %}
    {# production: use the custom schema as-is, falling back to the target schema #}
    {{ custom_schema_name if custom_schema_name else target.schema }}
  {% endif %}
{%- endmacro -%}
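
Because the session’s goal was splitting environments across databases, it’s worth noting that dbt exposes a parallel override, generate_database_name(custom_database_name, node), which you can customize the same way. Here is a minimal sketch following the same pattern; the database names are illustrative:


{%- macro generate_database_name(custom_database_name, node) -%}
  {% if target.name == 'default' or target.name == 'pr_test' %}
    {# illustrative: dev and PR runs share a development database #}
    dev_db
  {% elif 'production' in target.name %}
    {# illustrative: production runs land in their own database #}
    prod_db
  {% else %}
    {# fall back to the custom database if set, else the target's database #}
    {{ custom_database_name if custom_database_name else target.database }}
  {% endif %}
{%- endmacro -%}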

If you’re not familiar with the target variable, config, or custom schemas, we suggest checking out the dbt docs for more info on how they work and more use cases.

Participant Questions

After the live coding session, Randy answered Community member questions live and in the Slack channel. 

Here are a couple of the questions:

Daisy asked how to change the GitHub repository connected to a dbt Cloud project.

Great question, Daisy! To resolve this:

  • Go into dbt Cloud and, if you are an account admin, open Account Settings.
  • Select the project whose GitHub repo you want to change.
  • Click the repository and disconnect the repo. This will not delete anything in the repo itself. Try to do this during off-hours so you don’t disrupt your developers.
  • Go to your Profile and select Integrations.
  • Click Link your GitHub profile and follow the instructions in the dbt Cloud docs to connect it to your organization’s GitHub account.
  • Go back to the Account Settings page and select the project where you removed the GitHub repo earlier.
  • Click Configure a repository, select GitHub, and then search for the repo to import for your project. This becomes the repo used for your branching process, IDE development, and running jobs in dbt Cloud for this project.

Thank you, James! This question, about whether to keep all of your dbt code in a single repository or split it across multiple projects, comes up frequently within the dbt Community in the #analytics-craft and #advice-data-modeling channels.

In Randy’s experience, most customers work with a mono-repo. The major reason is having all of your code in one place, which helps with testing. When teams do break up a mono-repo into smaller projects, it can result in duplicated models and work.

However, Randy said there are exceptions. Some customers with thousands of models do end up splitting up their projects, and in those cases Randy advises:

  • Split up projects by domain (for example: marketing, manufacturing, finance), not by maturity
  • Ensure there is one set of imports within a single foundational project
  • Define the outputs of the single foundational project as sources in your domain projects
  • Import the foundational project in your sub-project (domain project), as sketched below
  • Follow the same pattern as you develop other sub-projects
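
To illustrate that pattern, here is a hypothetical slice of a domain project that imports a foundational project as a package and declares its outputs as sources; the repo URL, database, schema, and table names are all made-up placeholders:


# packages.yml in the domain project (illustrative)
packages:
  - git: "https://github.com/your-org/foundational-project.git"
    revision: main

# models/sources.yml -- foundational outputs declared as sources
version: 2

sources:
  - name: foundation
    database: prod_db
    schema: core
    tables:
      - name: dim_customers
      - name: fct_orders

A domain model can then select from {{ source('foundation', 'dim_customers') }} without re-modeling the upstream logic.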

Join the dbt Community Slack and the #events-dbt-live-expert-series channel to see more question responses and to ask your questions for upcoming sessions. 

Want to see more of these sessions? You’re in luck: we have more in store. Register for future sessions with more members of the dbt Labs Solution Architects team.

Until next time, keep on sharing your questions and thoughts on this session in the dbt Community Slack!
