Using dbt docs
The newest release of dbt, v0.11, ships with a built-in documentation website for your dbt project. You can check out an interactive example of this documentation here. While we have high-level opinions about why these docs are game-changing, I want to use this post to dig into the low-level features of the site itself.
If video is your preferred medium, you can additionally check out this video of me demoing docs to Emilie and Taylor from GitLab:
Live coding docs with Emilie and Taylor from GitLab
Building the docs site
If you haven’t already, be sure to upgrade to dbt v0.11.0. A full list of the new docs-related commands (and their explanations) can be found here. Once you’re all set up, the docs can be built by running:
$ dbt docs generate
This command will generate the core data files that are used to power the docs site. All that’s left to do is to start a webserver and navigate to http://localhost:8080 in your browser. You can do this easily with the following command:
$ dbt docs serve
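If you find yourself running these two commands back to back a lot, here is a minimal sketch of a wrapper. The function name `build_docs` and the `DBT` override variable are my own hypothetical conveniences, not part of dbt; the only dbt commands involved are the two shown above.

```shell
#!/bin/sh
# Hypothetical wrapper: generate the docs data files, then serve the site.
# DBT defaults to the real dbt executable; set DBT=echo to preview the
# commands without actually running them.
DBT="${DBT:-dbt}"

build_docs() {
    # Stop early if generation fails, so we never serve stale docs.
    "$DBT" docs generate || return 1
    "$DBT" docs serve
}
```

Calling `build_docs` from the root of your dbt project should rebuild the data files and then start the webserver, assuming `dbt` is on your `PATH`.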
If you’re interested in following along, you can use this example project.
Exploring the documentation
The first thing you’ll see in the docs site is the “overview” section.
This overview page is useful for understanding how to navigate the docs site. If you’d instead prefer to put organization-specific information here, you can overwrite this overview page with markdown of your own. You can find instructions on how to do that here.
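As a rough illustration, dbt’s documentation describes replacing the overview with a docs block named `__overview__` placed in a Markdown file in your project. The sketch below writes one such file. The helper name `write_overview`, the file name `overview.md`, and the page contents are all placeholders I made up, so check the linked instructions for the details that apply to your version.

```shell
#!/bin/sh
# Hypothetical helper: write a custom overview page into a dbt project.
# The file name and the page text are placeholders; the docs block name
# __overview__ is what tells dbt to use it in place of the default page.
write_overview() {
    project_dir="$1"
    mkdir -p "$project_dir/models"
    cat > "$project_dir/models/overview.md" <<'EOF'
{% docs __overview__ %}
# Analytics documentation

Welcome to the docs for our (hypothetical) analytics project.
Questions? Ask in the #analytics channel.
{% enddocs %}
EOF
}
```

After running `write_overview .` in your project root, regenerate the docs to see the new overview.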
Moving on from the overview, you’ll notice a Project/Database toggle at the top of the left-hand sidebar. This toggle lets you switch between your project’s folder hierarchy and a database-focused collection of tables and views. While the Project view makes it really easy to find and organize models, the Database view gives you an idea of what your data consumers will see in the warehouse.
You can use this sidebar, or the search bar at the top of the window, to select a model to explore.
After navigating to a model page (like this one), dbt will show you just about everything it knows about that model. On this page, you can see high-level details about the model, including its materialization, database owner, and relevant database statistics if applicable.
Scrolling down, you’ll see a Markdown description for the model if one is provided. You can find more information about adding these descriptions to your own models here.
In the bottom half of the page, you’ll find a list of columns for the selected model. These columns are sourced from two places:
- Your schema.yml files
- The database catalog tables (usually information_schema.columns)
By combining these two data sources, dbt is able to visualize the names of the columns, their data types and descriptions, and any applicable schema tests. These column descriptions also support Markdown, so you can easily create rich documentation, like a table of possible values for a column, or include links to other resources.
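To make that concrete, here is a minimal sketch of a schema.yml that would feed the column list. The helper name `write_schema`, the `customers` model, its columns, and the example.com link are all hypothetical; the `version: 2` layout and the `unique`, `not_null`, and `accepted_values` tests are the parts that come from dbt.

```shell
#!/bin/sh
# Hypothetical helper: write a schema.yml that documents one model.
write_schema() {
    project_dir="$1"
    mkdir -p "$project_dir/models"
    cat > "$project_dir/models/schema.yml" <<'EOF'
version: 2

models:
  - name: customers            # hypothetical model name
    description: One record per customer.
    columns:
      - name: customer_id      # hypothetical column
        description: Primary key for this table.
        tests:
          - unique
          - not_null
      - name: status
        description: |
          Current account status. See the [status guide](https://example.com/status).

          | value   | meaning              |
          |---------|----------------------|
          | active  | has an open account  |
          | churned | closed their account |
        tests:
          - accepted_values:
              values: ['active', 'churned']
EOF
}
```

The `status` description uses Markdown for both a link and a table of possible values, which is the kind of rich column documentation described above.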
Finally, the last pane on the page shows both the source code and the compiled SQL for your model. The compiled SQL view makes it really easy to audit and workshop changes to existing models.
dbt makes it really easy to build complex DAGs of models, and the documentation site makes it really easy to explore these DAGs. By clicking the little green icon in the bottom-right corner of the screen, you can pop open a graph visualization for the immediate neighborhood of a particular model. This graph viz shows the immediate parents and children of the model in question, and looks something like this:
This pop-out viz is useful for understanding the flow of data through your DAG at a glance. An in-depth analysis of your DAG requires the full lineage of your models, as well as some more screen real estate. To expand this graph, click the “View Fullscreen” button in the top-right of the pop-out. In fullscreen mode, you can see the full lineage for your model (highlighted in purple). From this interface, you can zoom and pan around. Further, clicking different models will highlight their lineage.
Besides looking really slick, this graph viz is packed with features. From here, you can right-click on models to prune and refocus your graph, or jump to the documentation for a particular model.
Crucially, this graph viz is powered by dbt’s model selection syntax. When you enter values into the --models and --exclude input fields, you’ll see a live preview of the changes to your selected subgraph. Further, node selection updates the page URL, so it’s super easy to share your graph with your colleagues. A gif won’t do it justice, so just go ahead and take it for a spin :)
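If you haven’t used the selection syntax on the command line before, here is a small sketch of the same --models and --exclude arguments passed to `dbt run`. The function names, the `DBT` override variable, and the model names in the usage note are hypothetical; the `+` prefix and suffix are dbt’s graph operators for selecting a model’s parents and children.

```shell
#!/bin/sh
# Set DBT=echo to preview the commands without running them.
DBT="${DBT:-dbt}"

# Run a model together with everything upstream of it (its parents).
run_with_parents() {
    "$DBT" run --models "+$1"
}

# Run a model together with everything downstream of it (its children),
# leaving out one model by name.
run_children_except() {
    "$DBT" run --models "$1+" --exclude "$2"
}
```

For example, `run_with_parents customers` selects `+customers`, and typing that same `+customers` value into the --models field of the graph viz should highlight the matching subgraph.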
Try it out!
If you like what you see here, you can build docs for your own project by installing dbt v0.11.0. This whole docs site is open source, so be sure to send bug reports or feature requests our way in the dbt-docs repo. If you’re interested in sharing docs with your whole team, but don’t want to host them yourself, you can do that with Sinter (now dbt Cloud). Got something else to say? I want to hear it on Slack.
Last modified on: Sep 22, 2023