class: title, center, middle # Refactoring ??? Start the lesson by connecting to previous knowledge that was either just taught or we can safely assume learners know. Prerequisite knowledge: - models, ref function, materializations - sourses - dbt project design ---
Refactoring | Focus
Build a mental model for refactoring existing transformation code into dbt
Apply refactoring principles to refactor transformation code
??? Set the focus for the lesson with the slides here --- # Refactoring transformation code into dbt 1. Migrating legacy code 2. Translate hard coded table references 3. Choose a refactoring strategy 4. CTE Grouping and Cosmetic Cleanups 5. Centralizing Transofrmations & Splitting up models a. Staging models b. CTEs or intermediate models c. Final model 6. Auditing ??? High level overview - provide a quick overview of the framework for each step in the refactoring process. We will go deep on each of these shortly. --- # 1. Migrating Legacy code - Find the SQL code that you want to refactor from your legacy system - Add a folder to your models directory titled `legacy` - Copy over the snippet of SQL to a new SQL file - Make light adjustments to the query to make sure that you can successfully build the model with `dbt run` --- class: center, middle # Let's see it in action ??? Quick demo: Move the legacy query to the the legacy folder in the model directory. Ensure that dbt run executes and we can actually build this particular model. --- # 2. Translate hard coded table references - For all source tables, we want to leverage the source function in dbt - Build a sources.yml file to configure a source table for each raw table - Update raw table references using the source function .left-column[ **Original SQL:** ```sql select * from raw.stripe.payment``` ] .right-column[ **Updated SQL:** ```sql select * {{ source('stripe','payment') }}``` ] --- class: center, middle # Let's see it in action ??? 1. Create a sources.yml file for configuring the raw tables 2. Update the raw table references. --- # 3. Choose a refactoring strategy Two options: - In place refactoring: Work directly on the original query - Pro: No old models to clean up at the end - Cons: Pressure to get it right the first time, harder to audit, must leave a good git trail to review progess - Side-by-side refactoring: - Pros: less impact on end users, less pressure to get it right the first time, easier to audit - Con: Old files need to be cleaned up at the end. - We generally recommend the latter, this really becomes clear in step 6: auditing **Note: for this lesson, we are going to use side-by-side refactoring** --- # 4. CTE grouping and cosmetic cleanups - Reorganize distinct concepts into CTEs - Rewrite subqueries as CTEs - Ensure that the cleanup align with your style guide --- class: center, middle # Let's see it in action ??? - Separate each of the concepts brought together in the query into separate CTEs and then start to refactor the code. - The first few CTEs should represent only the raw tables that are brought in and then they should start to be joined. --- # 5. Centralizing transformation & splitting up models After refactoring your code into CTEs, it becomes much clearer for how to break up the model into **staging models, intermediate models, and final models.** - **Staging models:** 1:1 with the source, light clean up of field names and casting data types. - **Intermediate models:** Best of pushing code upstream that is NOT 1:1 with source but may be resuable in other future models. - **Final models:** Remaining logic that you want to expose to end users. --- class: center, middle # Let's see it in action ??? Leverage the CTEs from the last step to migrate code into staging and intermediate models. --- # 6. Auditing - We will leverage the audit helper package to be able to compare the original query to the refactored final model. - [Audit helper package](https://hub.getdbt.com/dbt-labs/audit_helper/latest/) --- class: center, middle # Let's see it in action ??? Install the audit_helper package Ensure that the new model and legacy model match up. ---