
Standardizing product lifecycle with dbt macros from Coalesce 2023

George Apps, analytics engineer at Travelperk, discusses the standardization of the product lifecycle using dbt macros.

"Hopefully it will show how analytics engineers can add value to a business by thinking of their outputs like a product."

In the talk, George explains how analytics engineers can add value to a business by treating their outputs like a product and simplifying complex analytics. He draws on his experience at Dojo, a company that offers a variety of products, to illustrate the need for a standardized approach to handling data.

The importance of treating analytics outputs like a product

George emphasizes that analytics engineers should treat their outputs like a product. Doing so lets them add substantial value to a business by simplifying complex analytics. He states, "Hopefully it will show how analytics engineers can add value to a business by thinking of their outputs like a product to simplify complex analytics."

George explains that the first step in this process is to identify the common ground across all products and to standardize this data. He states, "What common ground is there across all of our different products? This was the starting point for this piece of work. In all products, there's an element of using the product, so we want to know who is using the product and who was using the product but is not right now."

George also notes that the approach proved its value by making simple questions about product usage and performance answerable. Before the work, such questions were a challenge; with a standardized approach, they became much simpler. He recalls, "...simple questions became really, really difficult to answer. So we get stakeholders asking us questions like, ‘How many users are active on product x?’ or ‘What is that as a percentage of those that are eligible, and how does that compare to last month?’ And if you ask two different people in the data team, they'd probably give you two different answers. So it was probably about time that we found a solution to this."

The process of standardizing product lifecycle data

George details the process of standardizing product lifecycle data using dbt macros. He explains that the first step involved aggregating data from all stages of the product lifecycle. He says, "All we do is just cross join that to all of our dates, and we have the first part of our table... And this can become really, really flexible by being a daily, weekly, monthly level."
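The cross join George describes can be sketched roughly as follows. This is a minimal illustration and not the code from the talk: it assumes the rows being cross joined are customers, and the table names (customers, dates), column names, and sample values are all hypothetical. It uses Python's built-in sqlite3 module so the SQL can be run without a warehouse.

```python
import sqlite3

# Hypothetical inputs: three customers and three weekly reporting dates.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id TEXT);
    CREATE TABLE dates (report_date TEXT);
    INSERT INTO customers VALUES ('c1'), ('c2'), ('c3');
    INSERT INTO dates VALUES ('2023-10-02'), ('2023-10-09'), ('2023-10-16');
""")

# One row per customer per date: a possible "first part" of the lifecycle table.
spine = conn.execute("""
    SELECT c.customer_id, d.report_date
    FROM customers AS c
    CROSS JOIN dates AS d
    ORDER BY c.customer_id, d.report_date
""").fetchall()

for row in spine:
    print(row)
```

Swapping the contents of the hypothetical dates table between daily, weekly, or monthly dates would change the grain of the result without touching the join itself, which is one way to read the flexibility George mentions.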

Next, he explains how to determine the stages relevant to a product and then how to iterate over these stages to calculate aggregated fields. He notes, "We ended up coming up with four commonalities in the lifecycle stage: eligible for the product, aware of the product, onboarding on the product, and enabled on the product. And that final ‘enabled’ step, we divided into three subsections which were ‘active,’ ‘inactive,’ and ‘dormant.’"
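Iterating over stages to build aggregated fields might look something like the sketch below. The four stage names come from the talk; everything else is an assumption made for illustration, including the product_lifecycle table, its columns, the simplification that each customer sits in exactly one stage per product, and the use of a Python loop in place of the Jinja for-loop a dbt macro would use.

```python
import sqlite3

# Stage names from the talk; table and column names are hypothetical.
STAGES = ["eligible", "aware", "onboarding", "enabled"]

def stage_count_columns(stages):
    """Render one counting column per stage, much as a Jinja for-loop would."""
    return ",\n    ".join(
        f"SUM(CASE WHEN lifecycle_stage = '{s}' THEN 1 ELSE 0 END) AS {s}_customers"
        for s in stages
    )

query = f"""
SELECT
    product,
    {stage_count_columns(STAGES)}
FROM product_lifecycle
GROUP BY product
ORDER BY product
"""

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE product_lifecycle "
    "(product TEXT, customer_id TEXT, lifecycle_stage TEXT)"
)
# Made-up sample rows, assuming one stage per customer per product.
conn.executemany(
    "INSERT INTO product_lifecycle VALUES (?, ?, ?)",
    [
        ("abc", "c1", "eligible"),
        ("abc", "c2", "enabled"),
        ("abc", "c3", "enabled"),
        ("xyz", "c1", "aware"),
        ("xyz", "c2", "onboarding"),
    ],
)

rows = conn.execute(query).fetchall()
print(query)
print(rows)
```

Because the column list is generated from STAGES, a product that skips a stage could pass a shorter list and still produce a table with the same shape of logic.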

Finally, he describes how this standardized data could be used in a variety of ways, including for reporting, training models, and marketing campaigns. He illustrates, "So, within just three clicks, we could create this very pretty graph of product ABC and product XYZ all on the same graph showing a weekly active user rate." He also adds, "...we also used it as a feature store for data science, so it became a really key component in a cross-sell project that they were working on… to be able to see all of this product information and help train their models."
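As a rough illustration of the metric behind that graph, the sketch below computes a weekly active user rate per product. The talk does not define the formula, so this assumes the rate is active customers divided by eligible customers, echoing the stakeholder question about activity "as a percentage of those that are eligible." The product names follow the quote, and the counts are made up.

```python
# Hypothetical weekly rows: (week_start, product, eligible_customers, active_customers)
weekly_counts = [
    ("2023-10-02", "ABC", 200, 50),
    ("2023-10-02", "XYZ", 80, 20),
    ("2023-10-09", "ABC", 200, 60),
    ("2023-10-09", "XYZ", 0, 0),
]

def active_rate(active, eligible):
    """Active customers as a share of eligible ones; None when nobody is eligible."""
    return active / eligible if eligible else None

# Assumed definition: rate = active / eligible, keyed by week and product.
rates = {
    (week, product): active_rate(active, eligible)
    for week, product, eligible, active in weekly_counts
}

for key, rate in sorted(rates.items()):
    print(key, rate)
```

Returning None for a week with no eligible customers avoids a division by zero and leaves it to the reporting layer to decide how to display the gap.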

The importance of balance, collaboration, and usability during the process

George highlights the importance of finding the right balance between flexibility and standardization, collaborating with the right people, and considering the usability of the product. On the topic of flexibility and standardization, he states, "These two things are opposing in premise. You can't really have one without the other… products can differ in their nature, and ultimately, if you're making a data product for consumers to use, it needs to be useful to everybody. It can't just be useful for one subset."

Concerning collaboration, he emphasizes the importance of gathering the right information from the right people, stating, "It's about communicating with them why we're doing this process, and what it's going to improve, and getting their opinions on them about how to do that, but not involving too many people because you don't want too many opinions to muddy the water."

Lastly, on the topic of usability, he advises thinking about how the end users would be using the product and working backward from there. He explains, "Think about these things as early as possible. When I started this process, it was very much start-to-finish: ‘How do we start out writing this?’ It's the wrong approach. Think of how you want your end users to be using this and then work backwards from there."

George’s key insights

  • George emphasizes the importance of treating the outputs of analytics engineers like a product, which can add value to a business
  • A standardized approach is critical when dealing with data, especially for companies that offer a diverse range of products to various customers
  • George shares the solution they implemented at Dojo, which involved creating a standardized schema for all stages of the product lifecycle
  • George highlights the importance of collaboration and getting the right information from the right people when implementing solutions
  • When creating a data product, it is crucial to consider the end users and how they will use the product