Detecting Data Anomalies via an Inspection Layer
Let's face it: we can't get enough data these days, and we often ingest it from many sources, such as vendors and IoT devices. Unfortunately, you've likely encountered times when the data just isn't what you expected. It may contain nulls or duplicates, or be arranged differently than the schema specification, and problems like these are a weak point for many data pipelines. We'll showcase a way to handle this with dbt-native methods: an inspection layer that flags and quarantines erroneous data sets while the rest load uninterrupted.
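To make the idea concrete before getting into dbt, here is a minimal, language-agnostic sketch of what an inspection layer does, written in plain Python rather than dbt. Everything in it is an assumption made for illustration: the column names, the `id` key, and the `inspect` and `route` helpers are hypothetical and are not part of dbt or of the approach shown later.

```python
from collections import Counter

# Hypothetical schema and key column, chosen only for this illustration.
EXPECTED_COLUMNS = ["id", "device", "reading"]
KEY_COLUMN = "id"


def inspect(rows):
    """Return a list of problems found in one data set (a list of dicts)."""
    problems = []
    for i, row in enumerate(rows):
        # Columns must match the expected schema, including their order.
        if list(row.keys()) != EXPECTED_COLUMNS:
            problems.append(f"row {i}: columns differ from schema")
        elif any(value is None for value in row.values()):
            problems.append(f"row {i}: null value")
    # A key that appears more than once indicates duplicate records.
    key_counts = Counter(row.get(KEY_COLUMN) for row in rows)
    for key, count in key_counts.items():
        if key is not None and count > 1:
            problems.append(f"duplicate {KEY_COLUMN}: {key}")
    return problems


def route(datasets):
    """Split data sets into those safe to load and those to quarantine."""
    loaded, quarantined = {}, {}
    for name, rows in datasets.items():
        problems = inspect(rows)
        if problems:
            quarantined[name] = problems  # flagged, kept out of the load
        else:
            loaded[name] = rows  # clean data sets load uninterrupted
    return loaded, quarantined
```

The point of the design is that one bad data set never blocks the others: each is inspected on its own, and only the failing ones are set aside along with the reasons they failed.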