Blog Refactoring legacy SQL to dbt

Refactoring legacy SQL to dbt

As an analytics engineer on the professional services team at dbt Labs, I've helped a dozen or so data teams refactor from a legacy SQL transformation workflow to modular data modeling in dbt.

I've noticed a clear pattern: rare is the team that starts with a completely clean slate. More often than not, teams have accumulated years of SQL code powering business-critical, often complex data transformations.

These SQL transformations may run in a stored procedure, in an Airflow SQL operator, or some other scripting environment---but regardless of where it runs, that legacy SQL code must be migrated!

How can that migration be done with a) minimal headache and b) maximum quality?

Announcing a new (free!) on-demand course

Over the last 18 months, the dbt Labs professional services team has evolved a process to help teams migrate legacy SQL queries to dbt SQL.

And as of today, you can now learn + implement that same refactoring process, with a new on-demand course.

If you're completely new to dbt, you may want to start with the dbt Fundamentals course before diving into this one.

Everyone learns a bit differently, so you can pick up the course materials in a few different formats:

Our SQL refactoring process, in short

For a bit of background, the migration workflow that we follow internally is:

  1. Migrate your legacy SQL code, unchanged---make sure it runs and produces an identical output.
  2. Implement dbt sources (as opposed to raw table references).
  3. Choose a refactoring strategy.
  4. Implement clean CTEs (we ❤️ CTEs).
  5. Separate transformation logic into standardized data model layers (often the longest step).
  6. Audit the output (often using the audit_helper dbt package).

There is, of course, no one true way to refactor legacy SQL into dbt models!

I'd love to hear from you (on Twitter or in #learn-on-demand in dbt Community Slack) how your approach varies from the one we lay out in the course.

Last modified on: Jun 11, 2024

Accelerate speed to insight
Democratize data responsibly
Build trust in data across business

Achieve a 194% ROI with dbt Cloud. Access the Total Economic Impact™️ study to learn how. Download now ›

Recent Posts