All session recordings from Coalesce 2021 are now up on our website and the dbt YouTube channel. If you’re hoping to catch up on what you missed, but think 67 hours of content sounds like a bit much for one weekend… I recommend sliding into one of the following two tracks:
- For the emerging analytics engineer (7 hrs): If you’re just beginning your analytics engineering journey, the previous blog is a great place to start. There you’ll find modules covering core definitions, team dynamics, and introductions to essential tooling in the modern data stack.
- For the team preparing to scale (7.5 hrs): If you’re looking for new ways to structure your team in preparation for rapid growth, or if you’re just interested in staying ahead of the latest tools and frameworks… settle in, this is the track for you.
For the Team Preparing to Scale
Module 1: Adjusting for rapid growth (1 hr 33 mins)
As your organization grows, so too will the volume and complexity of your data. If that statement hits a little too close to home, it might be time to refactor how you work. These talks examine how four data teams dealt with rapid change to speed development, cut costs, and manage high-velocity data. Bonus: Each also shows how they measure impact and communicate value back to business stakeholders.
- Surviving Schema Changes with Automation (23 mins)
- Optimizing query run time with materialization schedules (19 mins)
- Automating Ambiguity: Managing dynamic source data with macros (25 mins)
- Trials and tribulations of incremental models (26 mins)
Looking for additional resources? Check out this article:
- Analytics Engineering at Spotify, by Peter Gilks
Module 2: Paradigms to reduce risk in dynamic environments (1 hr 55 mins)
Once you’ve established a strong data development workflow, you might want to consider things like how to programmatically spot inefficiencies, eliminate unnecessary human errors, and reduce risk. These sessions can help you build repeatable workflows that work even when the rest of the business is in flux.
- Root cause analysis for your data pipelines (29 mins)
- Build it once, build it right: Prototyping for data teams (28 mins)
- Smaller Black Boxes: Towards Modular Data Products (32 mins)
- Building On Top of dbt: Managing External Dependencies (26 mins)
Module 3: Scaling data teams (1 hr 58 mins)
As you look to grow your team, you might consider revisiting team structure, as well as individual roles and responsibilities. These talks challenge conventional data team logic (if there is such a thing anymore) by working from the bottom up to propose new frameworks that focus more on the humans than the groups they represent.
- From Diverse “Humans of Data” to “Data Dream Teams” (27 mins)
- dbt in a data mesh world (26 mins)
- Refactor your hiring process: a framework (38 mins)
- To all the data managers we’ve loved before (27 mins)
Looking for additional resources? Check out these articles:
- What to look for when scaling your data team, by Sheel Choksi
- How to thrive in the face of disruption: Tips from Shopify’s data team, by Marc-Olivier Arsenault
Module 4: Tools that bridge the gap (1 hr 48 mins)
These presentations share a common theme: how to bridge the gaps between and within data teams. The Tāngata and Metaplane presentations focus on improving org-wide data culture through increased observability and documentation, while the Deepnote and Databricks presentations showcase collaborative tools that are equally accessible to every member of the data team.
- Tāngata: Sharing the knowledge: Joining dbt and the Business using Tāngata (30 mins)
- Metaplane: The Endpoints are the Beginning: Using dbt Cloud API to build data awareness (24 mins)
- Deepnote: dbt, Notebooks and the modern data experience (26 mins)
- Databricks: Boost returns on your SQL pipelines using dbt, Databricks + Delta Lake (28 mins)
The above provides a practical guide for someone looking to grow and mature their data practice, but there’s much to be said for pure hype. If you’re looking for something to remind you why now is the most exciting time to be in this space, look no further than these popular sessions:
- The Metric System: The dbt Labs product keynote by co-founder Drew Banin generated more lasting conversation across social media than any other session. It also garnered 888 comments in the dbt community Slack.
- How Big is this Wave?: In the session that drew the most live attendees, Martin Casado (a16z) joined Tristan (dbt Labs CEO and co-founder) to chat about what makes this wave of data tech different from any that came before.
- The Modern Data Experience: Famed Substack author (and founder of Mode Analytics) Benn Stancil focused on the gaps between solutions in the modern data stack, and how the way we fill them will make or break the future of this market.
- The Future of Data Analytics: This power panel of VCs discusses the latest data trends, from data quality to data ops, and why the seemingly absurd valuations we’re seeing lately might not be entirely unfounded.
The talks highlighted in this blog represent just a fraction of the outstanding content presented by members of the data community. If you do have the time and space to peruse other sessions, I can’t recommend it enough!
Last modified on: Apr 21, 2022