The path to a mature Analytics Development Lifecycle
Jul 21, 2024
Opinion
This post first appeared in the Analytics Engineering Roundup.
It’s been a fun, albeit busy, month since I got home from several weeks spent in the bowels of Moscone for Summit Season. Honestly, I’m having more fun than I’ve had in years. I wonder if the same is true of you?
After four years in data that were defined by massive shifts in the macroeconomic landscape (first up, then down), it feels like the macro has settled. Private and public markets have rationalized, which means that we can mostly stop thinking (and talking!) about them. I ran a business from 2016-2019 and thought about macroeconomics basically never; I’m hopeful that we’re moving back to that world. It’s much more fun to think about data than about interest rates.
Of course, it helps that good things are going on in the data ecosystem. I’m particularly excited by the embrace of open table formats and OSS catalogs. I wrote about this a few weeks ago. I think this trend is going to reshape quite a lot about our world in the coming two to four years. It will take some time, but I think we’re at a point where a multi-data-cloud world is a foregone conclusion for companies of sufficient scale and complexity.
(Expect me to talk about this at my Coalesce keynote!)
Finally: we’re starting to come out of the most extreme part of the hype cycle with AI. I’m a believer in AI, but I’m also a realist. It takes time to figure out how to deploy new technology primitives against ROI-positive use cases, and we’re still early in that process. For the last 12-18 months, the level of attention AI was getting in conversations with senior data leaders was out of sync with its level of real business value. That gap is closing, which is good for my personal sanity.
I sit writing this while my kids are at the pool, in the depths of summer on the east coast of the US, with some lofi jazz in my ears. So maybe I’m just in a good headspace personally! I think, though, it might be a great moment to be in data. Good things on the horizon, a lot of important work to do, and the space to do it.
Can’t ask for more.
The Analytics Development Lifecycle (ADLC)
I’m working on one of my most important pieces of writing in several years. I’m still in the early phases, but it’s starting to take some shape. Over the coming months I’ll likely share bits and pieces of it here as a feedback mechanism. I would LOVE to hear your thoughts.
For now, I want to share two specific sections. The first one explains what the paper is about and is from the intro:
In 2016, I authored a blog post entitled “Building a Mature Analytics Workflow.” That post helped launch a community and a product, and many of the assertions from that original post have been realized in the industry. However, eight years in, the original post is in need of an update.
In this white paper, I propose a single, end-to-end model that I call the Analytics Development Lifecycle (ADLC). The ADLC is, I propose, the best path to building a mature analytics capability within an organization of any scale.
Most of the paper will be about defining the ADLC. It’s not about technology; the ADLC is a workflow, one that leading data practitioners have continued to refine over the past decade, since the advent of cloud-based data technology and devops-style tooling.
One of the sections that I’ve had the most fun writing so far is the one about what expectations users should have of a mature analytical system. “User” here means someone exploring data, consuming dashboards, etc., not someone building or maintaining the system itself. Here’s my current draft of this section:
The ADLC does not express an opinion on exactly how analytical systems are used to generate business value. For example, it does not hold that there is a ‘right’ or a ‘wrong’ way to conduct exploratory data analysis. Rather, it specifies a set of expectations that all users of analytical systems should have:
- Users should be able to discover and directly interact with the artifacts from a mature analytical system without having to go through any intermediary humans.
- Users should be able to trust the correctness and timeliness of data from a mature analytical system.
- Users should be able to delegate their own access to a mature analytical system to their chosen tools and agents.
- Users should be able to straightforwardly investigate the provenance of any data element in a mature analytical system.
- Users should be able to view a history of all state changes to a mature analytical system.
- Users should be able to leave feedback on any element of a mature analytical system.
- Users should be able to ignore the implementation details of a mature analytical system.
- Users should be responsible for the costs associated with their usage of a mature analytical system.
- Users should be able to use as many resources as they are willing to pay for from a mature analytical system.
- Users should be able to choose the environment of a mature analytical system they interact with.
There might be a couple of items on this list that feel controversial, but I honestly don’t think these are groundbreaking statements. What is shocking to me, though, is just how infrequently the analytical systems deployed inside each of our companies live up to these requirements.
Sometimes I think folks who have been in the data game for a few years feel like the ecosystem is now ‘mature’ and that most of the important things have been built.
I think that is wrong. Yes, we’ve come a long way. But we have a long way yet to go.
When I stare at the above list it energizes me. There’s a lot of work to do.
Hope you’re well. I’d love to hear from you, and would especially love to see you in person at Coalesce in October!
Last modified on: Jul 22, 2024