Dec 6 - 10, 2021

Scaling Knowledge > Scaling Bodies: Why dbt Labs is making the bet on a data literate organization - Erica Louie
Keynote: How big is this wave? - Martin Casado & Tristan Handy
dbt 101: Stories from real-life data practitioners + a live look at dbt - Natty (Jon Natkins) & Alexis Wong Baird
How to build a mature dbt project from scratch - Dave Connors
Analytics Engineering for storytellers - Winnie Winship
The modern data experience - Benn Stancil
Identity Crisis: Navigating the Modern Data Organization - Jillian Corkin, David Jayatillake, Caitlin Moorman, Barr Moses & Stefania Olafsdottir
Git for the rest of us - Claire Carroll
You don’t need another database: A conversation with Reynold Xin (Databricks) and Drew Banin (dbt Labs) - Drew Banin & Reynold Xin
Share. Empower. Repeat. Come learn about how to become a Meetup Organizer! - Rosie Cardoso
The Operational Data Warehouse: Reverse ETL, CDPs, and the future of data activation - Tejas Manohar
Refactor your hiring process: a framework (Workshop Sponsor) - Ilse Ackerman, Ezinne Chimah & Rocío Garza Tisdell
Tailoring dbt's incremental_strategy to Artsy's data needs - Abhiti Prabahar
Optimizing query run time with materialization schedules - Ola Canty
How dbt Enables Systems Engineering in Analytics - Jorge Cruz Serralles
When to ask for help: Modern advice for working with consultants in data and analytics - Jacob Frackson
Smaller Black Boxes: Towards Modular Data Products - Stephen Bailey
The Modern Data Stack: How Fivetran Operationalizes Data Transformations - Nick Acosta
Analytics Engineering Everywhere: Why in Five Years Every Organization Will Adopt Analytics Engineering - Jason Ganz
Down with "data science" - Emilie Schario
So You Think You Can DAG: Supporting data scientists with dbt packages - Emma Peterson
Operationalizing Column-Name Contracts with dbtplyr - Emily Riederer
Data Paradox of the Growth-Stage Startup - Emily Ekdahl
Batch to Streaming in One Easy Step - Emily Hawkins & Arjun Narayan
The Call is Coming from Inside the Warehouse: Surviving Schema Changes with Automation - Lewis Davies & Erika Pullum
Beyond the Box: Stop relying on your Black co-worker to help you build a diverse team. - Akia Obas
Observability Within dbt - Kevin Chan & Jonathan Talmi
Inclusive Design and dbt - Evelyn Stamey
Built It Once & Build It Right: Prototyping for Data Teams - Alex Viana
Coalesce After Party with Catalog & Cocktails - Tim Gasper & Juan Sequeda
How to Prepare Data for a Product Analytics Platform (Workshop Sponsor) - Esmeralda Martinez
Toward a Polyglot Environment for Analytics - Caitlin Colgrove
Automating Ambiguity: Managing dynamic source data using dbt macros - Eric Nelson
The Endpoints are the Beginning: Using the dbt Cloud API to build a culture of data awareness - Kevin Hu
Data as Engineering - Raazia Ali
Building On Top of dbt: Managing External Dependencies - Teghan Nightengale
Data Analytics in a Snowflake world: A conversation with Christian Kleinerman and Tristan Handy - Tristan Handy & Christian Kleinerman
Keynote: Building a Force of Gravity - Drew Banin
dbt Core v1.0 Reveal ✨ - Jeremy Cohen
Firebolt Deep Dive - Next generation performance with dbt - Kevin Marr & Cody Schwarz
dbt, Notebooks and the modern data experience - Allan Campopiano & Elizabeth Dlha
No silver bullets: Building the analytics flywheel - Kelly Burdine, Lewis Davies & Erika Pullum
Don't hire a data engineer...yet - Stefania Olafsdottir
dbt for Financial Services: How to boost returns on your SQL pipelines using dbt, Databricks, and Delta Lake - Ricardo Portilla
The Future of Data Analytics - Sarah Catanzaro, Jennifer Li, Astasia Myers & Julia Schottenstein
Implementing and scaling dbt Core without engineers - Elliot Wargo
Building an Open Source Data Stack - Katie Hindson
This is just the beginning - Alan Cruickshank
dbt in a data mesh world - José Cabeda
Introducing the activity schema: data modeling with a single table - Ahmed Elsamadisi
From Diverse "Humans of Data" to Data Dream "Teams" - Prukalpa Sankar
From 100 spreadsheets to 100 data analysts: the story of dbt at Slido - Daniela Barokova, Michal Kolacek & Andrej Svec
To All The Data Managers We've Loved Before - Paige Berry & Adam Stone
Stay Calm and Query on: Root Cause Analysis for Your Data Pipelines (Workshop Sponsor) - Francisco Alberini
Upskilling from an Insights Analyst to an Analytics Engineer - Brittany Krauth
Modeling event data at scale (Workshop Sponsor) - Will Warner
Building a metadata ecosystem with dbt - Darren Haken
New Data Role on the Block: Revenue Analytics - Celina Wong
Using dbt to understand open-source communities - Srini Kadamati
Getting Meta about Metadata: Building Trustworthy Data Products Backed by dbt (Workshop Sponsor) - Angie Brown & Kelechi Erondu
🍪 Eat the data you have: tracking core events in a cookieless world - Jeff Sloan
Trials and Tribulations of Incremental Models - Vincey Au
Sharing the knowledge - joining dbt and "the Business" using Tāngata - Chris Jenkins
SQL Draw Artworks Review Panel - James Weakley

Answering community questions

We encourage dbt users to ask troubleshooting questions on Stack Overflow, and welcome in-depth discussion in our Slack workspace — as a rule of thumb, if there is an objectively “correct” answer to a question, it belongs on Stack Overflow. If the question requires opinions or discussion, it belongs in Slack.

As our community grows, we appreciate your help in answering questions in both forums! That said, we want to continue building goodwill in the community, so we encourage a certain level of quality in answers.

In general, if a user has put effort into asking their question, pay it forward by putting effort into providing a good answer. In our opinion, when writing an answer you should:

  • Demonstrate empathy for a wide range of experiences
  • Solve the actual problem, which may not be the same as the problem a user is asking about
  • Explain the why
  • Take the opportunity to teach

Demonstrate empathy #

Many new dbt users come from backgrounds where software engineering practices are unfamiliar. Some of our users come from a data engineering background, and may be more familiar with an ETL paradigm (as opposed to ELT).

When we answer questions on Slack, we should consider the experience of the user, and adjust our answers accordingly (see: curse of knowledge).

When answering questions on Stack Overflow, consider that a number of people with different experience levels might read this question in the future, so it’s worth taking the time to put some additional context into your answer.

Help solve the actual problem #

Sometimes users ask for help because they are trying to work against dbt. While there’s normally a reasonable workaround, it may just be a band-aid for a bigger problem. If something feels off with the way someone is using dbt, ask what they are trying to achieve. On Stack Overflow, use a comment to request clarification.

Along these same lines, if you’re not sure of the answer, it might be best to let someone else provide an answer rather than leading someone down the wrong path! Of course, some problems end up being thornier than they appear on the surface, so it’s ok if you find yourself in too deep. Just let the user know that you’re not the best person to answer their question.

Explain the why #

Sometimes users ask how to do something with dbt that is simply unsupported. If that’s the case, try to add context to your answer rather than just saying “you can’t do that”.

If we’ve chosen not to design dbt to do that, explain why this design decision was made. If it’s something we want to change in the future, share a link to a GitHub issue or if one doesn’t exist, ask the user to create a new GitHub issue. Alternatively, provide a workaround if one exists!

Take the opportunity to teach #

If someone is asking a simple troubleshooting question, still take the time to explain why that’s the solution along with the snippet of code — hopefully in the future they will be able to avoid similar problems, or someone else experiencing a similar, but different, problem will be able to solve their problem too!

Where possible, link to documentation so that users can understand where to find answers to these questions independently in the future.

Examples of great answers #

How do I pass a string to a macro? #

The question

I’m using a macro like so:

{% macro cents_to_dollars(column_name, precision=2) %}
    ({{ column_name }} / 100)::numeric(16, {{ precision }})
{% endmacro %}

But when I call it in my model:

select
  id as payment_id,
  {{ cents_to_dollars(amount) }} as amount_usd,
  ...
from app_data.payments

I get back this error:

Completed with 1 error and 0 warnings:

Database Error in model stg_payments (models/stg_payments.sql)
  001003 (42000): SQL compilation error:
  syntax error line 10 at position 6 unexpected '/'.
  syntax error line 10 at position 12 unexpected '::'.
  compiled SQL at target/run/jaffle_shop/models/stg_payments.sql

Done. PASS=0 WARN=0 ERROR=1 SKIP=0 TOTAL=1

An unhelpful answer

For an experienced dbt user, it’s tempting to answer:

Add quotes! {{ cents_to_dollars('amount') }}

And leave it at that.

A good answer

This answer instead:

  • Provides the solution
  • Explains why they are getting back a Database Error, and how to dig deeper into it (empowering the user to debug future problems)
  • Explains why you need to quote the string

To fix this, you need to quote your argument, like so:

{{ cents_to_dollars('amount') }} as amount_usd,

The quotes tell the Jinja parser to use amount as a string in the macro. Without the quotes, the Jinja parser is looking for a variable named amount to use instead. Since no variable named amount exists, your macro is getting compiled to the following:

select
  id as payment_id,
  ( / 100)::numeric(16, 2) as amount_usd,

You’ll be able to see this SQL if you open the file listed in the error message (target/run/jaffle_shop/models/stg_payments.sql).

Since the above is invalid SQL, when dbt tried to run the query, Snowflake gave back an error that dbt passes on. This is why we see a Database Error (docs) even though it’s a Jinja problem.
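The parser behavior described above is easy to reproduce outside of dbt. Here is a minimal sketch using the jinja2 Python package (which dbt builds on); the inline template string is purely illustrative:

```python
from jinja2 import Template

# Recreate the macro from the question as an inline template string.
# The whitespace-control markers ({%- and -%}) just keep the rendered
# output on one line for readability.
source = (
    "{% macro cents_to_dollars(column_name, precision=2) -%}"
    "({{ column_name }} / 100)::numeric(16, {{ precision }})"
    "{%- endmacro %}"
    "{{ cents_to_dollars(amount) }} | {{ cents_to_dollars('amount') }}"
)

print(Template(source).render())
# Unquoted amount is an undefined Jinja variable and renders as nothing,
# while the quoted 'amount' passes through as a string:
#   ( / 100)::numeric(16, 2) | (amount / 100)::numeric(16, 2)
```

Note that Jinja renders the undefined variable silently as an empty string rather than raising an error, which is why the problem only surfaces later as a Database Error.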

How do I create temporary tables with dbt? #

The question

I can’t find a way to create temporary tables in dbt. Using an ephemeral model that gets injected as a CTE slows down my run significantly compared to using a temporary table, so I worked around this by creating a temporary table using a pre-hook, but then I don’t see it in the lineage graph.

A good answer

The following answer:

  • Demonstrates empathy for a wide range of experience by taking the time to explain what temporary tables are
  • Explains why temporary tables aren’t supported in dbt
  • Attempts to understand the actual problem, i.e. that a table is more performant than using a CTE
  • Empowers the user to make a decision for their use case by listing the pros and cons, and links the docs so that they can answer any additional questions.

You’re right in thinking that dbt does not support temporary tables. That’s because temporary tables only persist in a single session, and dbt opens one connection/session per thread. Therefore any temporary tables created on one thread would not be visible to a model running on a different thread.
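This session scoping can be demonstrated without a warehouse. Here is a minimal sketch using Python's sqlite3 module as a stand-in, with two connections playing the role of two dbt threads (the shared-cache in-memory URI exists only for this demo):

```python
import sqlite3

# Two connections to the same shared in-memory database, standing in
# for two dbt threads, each holding its own warehouse session.
uri = "file:demo_db?mode=memory&cache=shared"
conn_a = sqlite3.connect(uri, uri=True)
conn_b = sqlite3.connect(uri, uri=True)

conn_a.execute("create table real_tbl (id int)")      # ordinary table
conn_a.execute("create temp table tmp_tbl (id int)")  # session-scoped
conn_a.commit()

conn_b.execute("select * from real_tbl")  # fine: ordinary tables are shared
try:
    conn_b.execute("select * from tmp_tbl")
except sqlite3.OperationalError as err:
    print(err)  # the temp table is invisible outside its own session
```

The same thing happens in a warehouse: a temporary table created on one dbt thread simply does not exist for a model running on another.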

It sounds like CTEs are a performance drag for you though — out of interest, which warehouse are you using?

You’ve identified two workarounds, and there’s another one worth discussing:

Option 1: Materialize your model as CTEs using the ephemeral materialization (docs)

Pros:

  • The models show up in the lineage graph
  • You can re-use these transformations in multiple downstream models by ref-ing them
  • You can test and document these models

Cons:

  • At some point there is a performance degradation with too many stacked CTEs (especially on older versions of postgres, where CTEs are an optimization fence)
  • Compiled SQL can be harder to debug

Option 2: Use pre-hooks to create temp tables

I would generally recommend against this — you can’t test or document your models, and they won’t be in the lineage graph (as you’ve noted).

Option 3: Materialize these models as tables in a separate schema, and drop the schema at the end of a run

  • Use the schema config to materialize a model in a separate schema
{{ config(
  materialized='table',
  schema='my_temporary_schema'
) }}
  • Then, at the end of a run, use an on-run-end hook (docs) to drop that schema — in your dbt_project.yml:
on-run-end: "drop schema my_temporary_schema cascade"

Pros:

  • All the benefits of Option 1 (lineage, testing, reusable)
  • Sounds like it might be more performant than using CTEs

Cons:

  • Make sure you don’t have any dependent views that select from that schema: they might get dropped by the drop cascade command, which introduces fragility into your project!