How to Prepare Data for a Product Analytics Platform
Companies today need a product analytics platform to enable product insights that are quick to access and easy to understand and act upon. How can you prepare your data for a quick, smooth product analytics implementation?
Browse this talk’s Slack archives #
The day-of-talk conversation is archived here in dbt Community Slack.
Not a member of the dbt Community yet? You can join here to view the Coalesce chat archives.
Full transcript #
Carly Kaufman: [00:00:00] Hi, thank you for joining us at Coalesce today for Day 3. My name is Carly Kaufman and I lead the solutions architecture team at dbt Labs. I am super excited to be your host today. The title of this morning’s session is How to Prepare Data for a Product Analytics Platform.
And we’ll be joined by Esmeralda Martinez. Esmeralda is the director of product and sales engineering at Indicative and has been at the company for five years. She’s always loved math and puzzles, but would you believe that she used to do math workbooks as a kid for fun? She also loves puzzle games like Sudoku, and even took math classes as electives in college.
How many of you can relate to these hobbies? Please comment in the chat. This is why she loves working with data and analytics, and I’m sure many of you feel the same way. Today, Esmeralda is going to lead a workshop demonstrating how to prepare data for a product analytics tool. Companies today need a product analytics [00:01:00] platform to enable product insights that are quick to access and easy to understand and act upon.
How can you prepare your data for a smooth product analytics implementation? I’m excited to find out, but first, a few housekeeping items. For those who are new today, all chat conversation is taking place in the coalesce-product-analytics channel of dbt Slack. If you’re not a part of the chat, you have time to join, but do it right now: visit community.getdbt.com and search for coalesce-product-analytics. When you enter the space, we encourage you to ask other attendees questions, make comments, or react at any point in the channel. Memes, GIFs, and jokes are always appreciated. Who knows, you might even be featured in Lauren’s Coalesce update tonight. After the session, Esmeralda will be available in the Slack channel to answer all of your questions.
However, we encourage you to ask questions at any point during the session. Let’s get started. Over to you, Esmeralda. [00:02:00]
Esmeralda Martinez: All right. Great. Thanks everyone. I’m very excited to be here. Before we get started, let me just tell you a bit about myself and my journey to Indicative and product. So I’m originally from Southern California and I moved to New York for college, where I went to NYU Stern and studied business.
I had always loved technology. And like Carly mentioned, I was a math nerd all through school, taking math electives for fun in college, which, I hear, not everyone enjoys as much as I did. And fast forwarding a few years, I started at Indicative as an SDR. I’ve had the opportunity to grow and learn a ton over the past five years.
And now I head up the product and sales engineering team here at Indicative. And I’m very excited to share with you all today how to prepare your data for a product analytics platform like Indicative.
[00:02:55] About Indicative #
Esmeralda Martinez: So a bit about Indicative. Indicative is a product [00:03:00] analytics platform that helps you build better products through analytics. And Indicative was designed specifically for the business user.
So we typically work with product, marketing, and analyst teams to help them understand how their users are interacting with their business. And we are the only product analytics platform that connects directly to your data warehouse. So once the data is in Indicative, users don’t need to use any SQL or have any coding experience in order to actually derive insights from the data.
But someone on the data team does need to set it up and connect it to the data warehouse or data lake. So my experience coming from the sales engineering and customer success side, I have worked with many customers to help get their data set up in Indicative. And what I’ll be sharing with you today are the insights and the best practices that we’ve learned through helping customers prepare their data for product [00:04:00] analytics in Indicative.
And properly thinking through the data strategy with product analytics in mind will unlock many use cases. And since we have a data audience, I’m going to dive into a few of these because I think you’ll find them pretty good. For example, Indochino and Haven Life are two of our customers on the left, and they both needed a platform that would allow them to join data across different platforms or across different user journeys.
For example, Indochino sells suits online and they also have brick and mortar stores, so they use product analytics to analyze the online shopping experience through to actually purchasing in person at the brick and mortar store. And Haven Life, for example, is a life insurance company. There are many actions that happen offline, for example, submitting medical information, or signing the contract through DocuSign or through email. And so they use Indicative to unite the [00:05:00] front end and back end data into a cohesive customer journey. And on the right, we have Prezi and Liv Up. So they were running into an issue that I’m sure many of you are familiar with, where the data team is bogged down by requests from the business users that are fairly simple:
How many people visited the site? How many people engaged with the new feature? As well as complex ones like journey analysis or funnel analysis, which are cumbersome, to say the least, to do in SQL. And so with Indicative the business teams can then ask and answer their own questions. And the data analysts can focus on more high value and complex analysis like machine learning and predictive modeling.
[00:05:47] Indicative and the modern data stack #
Esmeralda Martinez: So in the modern data stack, the data warehouse or data lake is at the center of it all, receiving data from, on the left, for example, event [00:06:00] collection platforms and ETL tools, and then outputting it into analytics platforms like Indicative or BI platforms. And with tools like dbt and the other tools that you see at the bottom here,
it becomes easier to manage, monitor, and transform your data. And that leads us to our main topic today, which is how can you prepare your data for product analytics? And so what problem are we going to tackle? So let’s say that you are on the data team at a fast growing e-commerce company.
For the purposes of this presentation, we’re going to say it’s a fictitious company called Pet Box. You sell pet products, have an app, website, blog, and some products in a subscription. And so the data team does the analysis for the business teams: for product, [00:07:00] marketing, customer support and so on. And they depend on your team for insights. Your team needs SQL to understand user behavior, but now there are new tools that business users can use to ask and answer their own questions and that are designed specifically for product analytics.
So you want to be able to quickly understand the customer journey, which can be difficult, and more often than not it’s more trouble than it’s worth trying to use SQL to create a Sankey diagram of common paths or a multipath funnel analysis. And you want to be able to focus again on that complex analysis, like predictive modeling and machine learning, instead of answering simple questions like how users are interacting with a new feature.
The other thing is that the product team doesn’t know the data. You don’t want them to misinterpret results. The naming is cryptic and inconsistent across the data. You don’t want to hand over the keys to the data and end [00:08:00] up with a team coming to the wrong conclusions because, again, they don’t know the data.
So you sign up for a product analytics platform like Indicative to solve the need for that self-serve product analytics tool. But first you have to get the data ready. And so I’ve personally heard this story over and over from companies that come to Indicative, and those that are most successful have really put thought and care into their data strategy and how they model.
So that brings us to the problem. So without a solid plan, you run the risk of not achieving your goals of data democratization, either due to messy or noisy data, poor documentation, which would make it difficult for everyone to use among other issues. So here’s a quote from an Indicative customer whose team was set up with bad data.
They said it was too complex and confusing for a non [00:09:00] data scientist. They can’t understand what events are being tracked and what they mean. And they acknowledged that this is likely a problem with how they are labeling the events and with how they are using the events. So this can lead to your analytics team not providing the value that you originally intended it to provide or, worse, ending up making decisions based on bad data and misinterpreted insights. Just to show you some stats that we have: the adoption of a product analytics platform by companies that have bad data is significantly lower than by those that have good data.
So we looked at our own retention, using a proxy for good and bad data, and found that companies have a two times higher retention rate by day 30 if they really took the time to have a good data strategy and have what we consider good data, compared to those companies that had bad data. And by three months [00:10:00] in, the retention is actually three times higher.
So it’s really worth dedicating the time upfront to have a proper data strategy and, again, good data, so that you can really get the most value out of a product analytics platform. And I really think this number shows the missed opportunity if the data hasn’t been properly thought through. We get it, right? Got it. So how do we actually solve this problem? There are a few steps to preparing your data for product analytics. So first we want to determine your goals and use cases, and then we’re going to turn those goals and use cases into data requirements. So they all really stack onto each other and flow into each other.
So from the data requirements, we will then figure out how to organize your data. Maybe there’s data that you aren’t collecting and you need to actually begin collecting it. Next, we want to make sure that we have the [00:11:00] documentation to ensure that end users understand what the data represents and that they know who to go to with questions about the data.
And then lastly, after integration, you’re then able to analyze your data and begin getting insights from the data. So we’re actually going to go through each of these steps and talk through how you would be able to do that.
[00:11:26] Step 1: Determine goals and use cases #
Esmeralda Martinez: The first step is determining the goals and use cases. So we don’t need to solve for every use case,
every question under the sun, in the first phase. That, I would say, is a common mistake people make. Focus first on what you care about the most: what are the critical questions and the critical goals that you want to answer in the short to medium term? And then you can always expand on that in subsequent phases of data tracking and transformations.
So for example, a company may want [00:12:00] to increase new sign-ups. They may want to boost upgrades, lower churn, maximize retention. The goal really dictates what your use cases are and what data you’re going to need. So let’s say for our purposes our goal is to increase new subscriptions. So Pet Box has a product that gets delivered every month that has things like pet toys and food.
And so we want to increase subscriptions to this pet box. And with that goal in mind, there are certain questions that we want to answer. For example, where are users dropping off in the sign up flow? Where is there friction? So this is telling me that we would want to identify the steps in the sign up flow
so you can properly answer and analyze this question. Other examples are: what touch points are leading to conversion? Maybe we want to include some marketing data: which marketing campaigns and channels are driving the highest conversion? What do users do once [00:13:00] they’ve converted? So what are the most common journeys after?
And what are they doing? And then, are there specific user segments that are converting the most? What are the characteristics of the users that convert versus those that are not converting? And now that we’ve determined our goals and use cases, the next step is going to be to identify your data requirements.
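The drop-off question above ("where are users dropping off in the sign up flow?") can be illustrated with a small ordered-funnel count. This is only a sketch under stated assumptions: the step names, the `(user_id, event_name, timestamp)` tuple layout, and the `funnel_counts` helper are hypothetical and are not taken from the talk or from Indicative.

```python
from collections import defaultdict

# Hypothetical sign-up flow steps for the fictitious Pet Box example.
FUNNEL_STEPS = ["view_plans", "create_profile", "enter_payment", "subscribe"]


def funnel_counts(events, steps=FUNNEL_STEPS):
    """Count users who reached each step, requiring the steps to occur in order.

    `events` is an iterable of (user_id, event_name, timestamp) tuples.
    """
    by_user = defaultdict(list)
    for user_id, name, ts in events:
        by_user[user_id].append((ts, name))

    counts = [0] * len(steps)
    for user_events in by_user.values():
        next_step = 0
        # Walk each user's events in time order and advance through the funnel.
        for _, name in sorted(user_events):
            if next_step < len(steps) and name == steps[next_step]:
                counts[next_step] += 1
                next_step += 1
    return dict(zip(steps, counts))


events = [
    ("u1", "view_plans", 1), ("u1", "create_profile", 2),
    ("u1", "enter_payment", 3), ("u1", "subscribe", 4),
    ("u2", "view_plans", 1), ("u2", "create_profile", 5),
    ("u3", "view_plans", 2),
]
counts = funnel_counts(events)
```

In this toy data, three users view plans, two create a profile, and one subscribes, so the largest drop-offs sit after the first two steps. The point of the sketch is that the question can only be answered if each step in the flow exists as its own tracked event.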
[00:13:28] Step 2: Identify your data requirements #
Esmeralda Martinez: And so we’re going to want to take the use cases from the previous slide and identify what data we need to answer those questions. And so we have a template that you can use to fill this out for your own company. I believe there should be a link to this; it’s going to be posted in the Slack channel. So again, go to #coalesce-product-analytics to get to this sheet.
And [00:14:00] so we’re going to go through this template, and you’re definitely welcome to make a copy and then use it as you build out your own data strategy. And so here at the top, first we have: what is our goal? Down below we have: what are our objectives? What are our use cases? Basically, what questions are we trying to answer?
And then we have, what are our KPIs? So what are the actual metrics that we’re going to be using to measure the success, in this particular case new subscription and the new subscription flow. Further down under number four, we have the required data. So what data sets are required in order for us to be able to answer these questions.
So here we have web, mobile, CRM, marketing; there are different data sets that we will need. And so we want to outline what those are. And then, [00:15:00] next is data governance. Are there any data privacy considerations? This really depends on your location, of course, and then on what your company’s structure and privacy and security requirements are. And then lastly, what are our technology challenges and requirements? For example, are we tracking everything that we need to be tracking? If not, we have to figure out how we’re going to solve for that problem.
Where does that data live? How can we get it into a single repository, which we’ll talk about in a bit. And so this is a template that you can use again to answer those questions and really build out and organize your data to prepare for a product analytics platform. And actually one point I’ll mention on the type of required data: you need event data, right? So all of the data that we’re really going to be talking about is either event data or user data. And so event [00:16:00] data describes a user action, for example, clicking on a link, viewing a page, or subscribing, and then user data describes the user. And so we’re really gonna want both of these types of data in order to understand product analytics and user behavior within the product.
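To make the distinction between event data and user data concrete, here is a minimal sketch of the two record shapes. The class names, field names, and sample values are assumptions chosen for illustration; they are not a schema from the talk or from any particular platform.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Event:
    """Event data: one user action at a point in time."""
    name: str
    user_id: str
    timestamp: datetime
    properties: dict = field(default_factory=dict)


@dataclass
class UserProfile:
    """User data: attributes that describe the user rather than an action."""
    user_id: str
    traits: dict = field(default_factory=dict)


# Hypothetical sample records that share the same user_id.
event = Event(
    name="page_viewed",
    user_id="u1",
    timestamp=datetime(2021, 12, 8, 15, 0, tzinfo=timezone.utc),
    properties={"page": "/pricing"},
)
user = UserProfile(user_id="u1", traits={"plan": "monthly"})
```

The shared `user_id` is what lets an analysis combine the two, for example counting page views only for users on a monthly plan.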
So let’s go back to here. And so again, the outcome may be that you have all the data you need and now you need to prepare it. Or the outcome may be that you don’t have everything.
If that’s the case, you’re going to need to come up with a tracking plan. So if you don’t have the data you need your engineering team is going to need to do a bit of work, to track and collect the data. There also are many platforms that exist that help make this as easy as possible. So for example, platforms that track events data.
Esmeralda Martinez: And in addition to that, there’s also ETL platforms like Stitch data and Fivetran which [00:17:00] can export data from third-party platforms. Maybe you have data in HubSpot or Salesforce or Marketo that you want to include in this product analytics dataset.
[00:17:11] Step 3: Organize your data strategy #
Esmeralda Martinez: So you can use ETL platforms or build something custom to put that data into your data warehouse. And then from the data warehouse, you can then put it into a product analytics platform like Indicative. So the next step, again if you are not yet tracking the data, is to organize and really think through your data strategy.
So for that, you would need a tracking plan. And we’ll go through an example of a tracking plan as well in a moment, but first, some key considerations to keep in mind as you build out a tracking plan are consistency, usability, and complexity. Or simplicity, really. So consistency with the data is key. A strong tracking plan ensures [00:18:00] that there is consistency in the events and the properties that you track, which means that more people in your org can use it.
So consistency applies both to the event name and to the property names. For example, if one event is camel case, all events should be camel case, versus snake case versus title case. Also, the name should be consistent across data sets; for example, device type should be named the same way across all data sets. It shouldn’t be device type and then device TPE; it really needs to be consistent. The next part is usability. So difficult or confusing data drives down adoption. If bringing data into the day-to-day takes too long, people aren’t going to do it. Also, you don’t need to track everything. That’s another common mistake: thinking that you need to track literally every single thing in the product.
That’s why we are focusing everything around a goal. And if you’re tracking too much, or tracking in a [00:19:00] disorganized way, which is worse, people aren’t going to use it. I have seen customers that have thousands of events, and that really leads to it being unusable for a business. And then third, we have complexity.
So when you have a dataset that’s well-designed and complete to begin with, it becomes a lot less complicated to scale and evolve it. So putting the time upfront makes it easier and simpler to iterate on your tracking plan to unlock deeper insights and more advanced analysis over time. So moving on to the next slide.
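The consistency point above lends itself to an automated check. The sketch below assumes a team has picked snake case as its convention (the talk does not prescribe one); the regular expression and both helper functions are hypothetical.

```python
import re

# Assumed convention: lowercase words separated by single underscores.
SNAKE_CASE = re.compile(r"^[a-z][a-z0-9]*(_[a-z0-9]+)*$")


def non_snake_case(names):
    """Return the names that break the assumed snake_case convention."""
    return [name for name in names if not SNAKE_CASE.match(name)]


def to_snake_case(name):
    """Convert camelCase, Title Case, or hyphenated names to snake_case."""
    # Put an underscore between a lowercase letter or digit and an uppercase letter.
    with_breaks = re.sub(r"([a-z0-9])([A-Z])", r"\1_\2", name.strip())
    # Collapse runs of spaces or hyphens into a single underscore.
    return re.sub(r"[\s\-]+", "_", with_breaks).lower()


names = ["device_type", "deviceType", "Device Type"]
offenders = non_snake_case(names)
normalized = sorted({to_snake_case(name) for name in names})
```

Here all three spellings normalize to the single name `device_type`, which is the outcome the talk is asking for: one name per concept across every data set.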
[00:19:36] The 2 types of event tracking #
Esmeralda Martinez: You may have heard about explicit versus implicit tracking. If not: explicit tracking means that you decide explicitly in the code what data you want to track and define. Implicit tracking means that the data is automatically captured with codeless tracking, and then there’s typically a UI where you then define the [00:20:00] events. So a common misconception is that if you are implicitly tracking, you don’t need a data plan. That is definitely wrong. No matter which way you use, you need to have a data plan. I personally, and Indicative, are advocates of explicit event tracking for many reasons. A few benefits of explicit tracking are that you only track what you need,
you will have clearer and more reliable data, and you’ll have more control over how to govern your data. So one common issue companies run into with implicit tracking is that they end up having to block a ton of PII manually. And so that just ends up creating unforeseen complications. And so in general, explicit tracking, I believe, really is the way to go.
But again, regardless of which option you choose, you absolutely need a tracking plan. [00:21:00] Otherwise you end up with messy data. So if you have three different people using the implicit tracking platform to create events, I’ve definitely seen companies that have the same event twice, but derived slightly differently.
And again, that leads to inconsistency in the data and issues further downstream. So the next step, after you decide how you are going to be tracking and what you are going to use to track, is to identify the steps in the user journey.
[00:21:33] Identify steps in the user journey #
Esmeralda Martinez: So you’re going to want to begin thinking about what are the different steps and how may a customer go through them.
And so mapping out this process is important because you are then able to visualize what your customer has to do to move from the first touch point through to connecting the pet cam in this case. And once the data is tracked, then using the analytics tool, you can identify points of friction. For example, if all new [00:22:00] users open the app, but then drop off immediately, there may be problems with the onboarding process. And so here we have some example steps that we may want to analyze. For example, we can see here that we want some email data. So delivering the email, opening it, clicking on the email: each of those actions would be a separate event that we want to track.
Then we have some app engagement, so downloading and opening the app, and then we have some in-app engagement, like creating a profile, subscribing, and then opening and connecting the pet cam, which is a product to watch your pet while you’re away. And so identifying the key steps in your journey is a critical step in building out that data.
So the next step here is you then need to contextualize the events. What information about each action do you care about? [00:23:00] And as you go through this, you’ll also want to assess how much of this, if any, do you already have. And just some kind of baseline information that event data needs to contain is an event name.
So the event name in this case is subscribe. Typically, we’re doing pretty high level events here, but you’d probably have something like subscription success or subscription failed. So you have those different stages. You also need a timestamp to identify when did the event occur?
And then you need a user ID. So who is the user that completed the action? So this typically can be either an unauthenticated or an authenticated ID or both. And then typically the product analytics platform handles identity stitching; Indicative does. So as long as you have one of those, we create a mapping of unauthenticated to authenticated IDs by [00:24:00] identifying an event that has both. We then associate those two IDs to provide a single profile of the customer pre to post authentication. But you still need an ID at some point. And then lastly we need the additional metadata. So this would be information like subscription information.
For example, are they a non-subscriber, or are they subscribed to our monthly, quarterly, or annual plan? Another example is device type. So what device are they on when they subscribed? Or when they viewed a page? There’s a lot of metadata that you likely will care about tracking. So it’s really important to outline all of the different ways that you’re going to want to filter and slice and dice your data so that once it’s passed on to a developer to implement, they have everything that they need
in order to build and track it. Some other general advice: you want to keep your list of events short but create [00:25:00] a robust set of event properties. So for example, rather than creating an event for every single button that is clicked, for example in the navigation, you might want to consider creating an event that’s just called button clicked, and then adding an event property that specifies which button was clicked.
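The "one event, many property values" advice above can be sketched as follows. The `button_clicked` helper, the `button_name` property, and the sample button names are hypothetical and only illustrate the pattern.

```python
from collections import Counter


def button_clicked(user_id, button_name, timestamp):
    """Build one generic click event; the specific button lives in a property."""
    return {
        "event_name": "button_clicked",
        "user_id": user_id,
        "timestamp": timestamp,
        "properties": {"button_name": button_name},
    }


clicks = [
    button_clicked("u1", "nav_home", "2021-12-08T15:00:00Z"),
    button_clicked("u1", "nav_shop", "2021-12-08T15:01:00Z"),
    button_clicked("u2", "nav_home", "2021-12-08T15:02:00Z"),
]

# One event name in the catalog, however many buttons exist in the product.
event_names = {click["event_name"] for click in clicks}
clicks_per_button = Counter(click["properties"]["button_name"] for click in clicks)
```

Adding a new navigation button then only adds a new property value, so the list of events that a business user has to browse stays short.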
The level of detail that each event represents will really depend on the complexity of the product. So we recommend having between 50 and 300 events, ideally, so that it’s detailed enough that an end user can easily identify what they’re looking for, but not so detailed that it’s really confusing, right?
If you have five events and you have to add a filter every single time you want to do an analysis, that’s cumbersome and annoying. If you have thousands of events that you have to dig through and try to find what you’re looking for, that is also very frustrating. And so again, 50 to 300 is what we recommend.
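The identity stitching mentioned earlier in this step (mapping unauthenticated to authenticated IDs by finding an event that carries both) can be sketched in a few lines. This is an illustration of the general idea only, not Indicative's implementation; the `anonymous_id`, `user_id`, and `resolved_id` field names are assumptions.

```python
def build_identity_map(events):
    """Map anonymous IDs to authenticated IDs using events that carry both."""
    mapping = {}
    for event in events:
        anonymous_id = event.get("anonymous_id")
        user_id = event.get("user_id")
        if anonymous_id and user_id:
            mapping[anonymous_id] = user_id
    return mapping


def stitch(events):
    """Attach a resolved_id so pre- and post-login events share one profile."""
    mapping = build_identity_map(events)
    stitched = []
    for event in events:
        anonymous_id = event.get("anonymous_id")
        # Prefer the authenticated ID, then a mapped one, then the anonymous ID.
        resolved = event.get("user_id") or mapping.get(anonymous_id, anonymous_id)
        stitched.append({**event, "resolved_id": resolved})
    return stitched


events = [
    {"event_name": "page_viewed", "anonymous_id": "a1"},
    {"event_name": "signed_up", "anonymous_id": "a1", "user_id": "u1"},
    {"event_name": "subscribed", "user_id": "u1"},
    {"event_name": "page_viewed", "anonymous_id": "a2"},
]
resolved_ids = [event["resolved_id"] for event in stitch(events)]
```

The first three events resolve to the same profile because the sign-up event carries both IDs, while the visitor who never authenticates keeps the anonymous ID. This is why the talk stresses that every event still needs at least one ID.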
So if you go [00:26:00] over to the worksheet again: at the bottom, you can see that there are different tabs. So if you go over to the events tab, you can see an example tracking plan specifically for events. So here you’re defining what the event is, exactly how it will be named, and the definition of the event.
Definitions are key. Please do not skip adding definitions that describe what the event represents. This will become incredibly important when the business user then has to go use this data. And then there’s what event properties you want to track, what the source of the data is, and some additional information to make it, again, easy for the developer to track.
And then we have the properties tab. So, some more information about the properties: you want to categorize the data and then define the data. Sorry, I just saw in the chat: 10,000 events. That’s too many. Okay, so going back [00:27:00] to the slideshow.
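A tracking plan like the one in the worksheet can also be checked mechanically, for example to catch entries with a missing definition. The sketch below assumes each plan entry is a dictionary with `name`, `definition`, `properties`, and `source` keys; that layout and the `validate_tracking_plan` helper are hypothetical and do not reproduce the actual template.

```python
REQUIRED_FIELDS = ("name", "definition", "properties", "source")


def validate_tracking_plan(plan):
    """Return a list of human-readable problems; an empty list means the plan passes."""
    problems = []
    for index, entry in enumerate(plan):
        label = entry.get("name") or f"entry {index}"
        for field_name in REQUIRED_FIELDS:
            # A field counts as missing when it is absent, None, or an empty string.
            if entry.get(field_name) in (None, ""):
                problems.append(f"{label}: missing {field_name}")
    return problems


plan = [
    {
        "name": "subscription_success",
        "definition": "User completed checkout for a subscription.",
        "properties": ["plan", "device_type"],
        "source": "web",
    },
    {
        "name": "subscription_failed",
        "definition": "",
        "properties": ["plan", "device_type"],
        "source": "web",
    },
]
problems = validate_tracking_plan(plan)
```

A check like this could run before the plan is handed to a developer, so that an event without a definition is caught before it reaches business users.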
[00:27:01] Form a central source of truth #
Esmeralda Martinez: So next, after you have all of the data that you want to track, you need to form a central source of truth. And so you want to create a unified data set. And so this is really where we’re going to spend some time talking through the best practices for creating this unified data set. And so we’re going to want to clean and organize it in a way that lends itself well to product analytics.
And again, always keep the business user in mind. And so the first point is you want to establish a unified identification system. And so when I’m talking about identification, I mean user ID, so user identification. So BI and product analytics platforms need to deliver a unified view of your customers.
That ID doesn’t have to be the same ID across sources, as long as there is a way to map the user across [00:28:00] sources. So for example, if you have Zendesk and Salesforce data, and you have a Zendesk ID and a Salesforce ID, there has to be some way to map the ID to ID. Maybe it’s email address, or maybe there’s a customer ID; that would be ideal if you have a customer ID. But regardless, there has to be, again, some way to track a user across the different sources that you care about. You can then, for example, map online and offline user journey behavior, or front end and back end behavior. So the next step is you need to synthesize different sources.
So again, you need to be able to map, for example, the server side to the client side data to third-party data to really contextualize that user behavior at a higher level. And ideally these various sources would be unified under one schema. So for example, what we commonly see, if someone’s using BigQuery, for example, is they’ll have different tables that represent the different sources.
And then there will be a [00:29:00] single unified table, as opposed to a table per event, for example. And that makes it just way easier to then integrate with a product analytics platform, because you have this singular view of the data that you are integrating with, or at the very least a few different tables.
The caveat to this is that I commonly see an events table as well as a user table. So it’s very common for customers to also have a user table that includes profile information, for example subscription status, demographic, or location information. And so these are really the two key tables.
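The idea of unifying several source tables under one schema can be sketched as a column-mapping step. In practice this would more likely be a SQL model in the warehouse (for example in dbt); plain Python is used here only to keep the illustration self-contained. The source names, column names, and the `unify` helper are all hypothetical.

```python
# Hypothetical per-source column mappings onto one shared events schema.
COLUMN_MAPS = {
    "web": {"event": "event_name", "uid": "user_id", "ts": "timestamp", "deviceType": "device_type"},
    "mobile": {"name": "event_name", "user": "user_id", "time": "timestamp", "device_type": "device_type"},
}


def unify(source, rows):
    """Rename each source's columns to the shared schema and tag the source."""
    mapping = COLUMN_MAPS[source]
    unified = []
    for row in rows:
        out = {target: row[original] for original, target in mapping.items()}
        out["source"] = source
        unified.append(out)
    return unified


web_rows = [
    {"event": "page_viewed", "uid": "u1", "ts": "2021-12-08T15:00:00Z", "deviceType": "desktop"},
]
mobile_rows = [
    {"name": "app_opened", "user": "u1", "time": "2021-12-08T15:05:00Z", "device_type": "ios"},
]

# One events table: same columns for every row, ordered by UTC ISO 8601 timestamp.
events = sorted(
    unify("web", web_rows) + unify("mobile", mobile_rows),
    key=lambda event: event["timestamp"],
)
```

Every row in `events` ends up with the same five columns regardless of where it came from, which is the "single unified table" shape described above. A separate user table would sit alongside it, keyed by the same `user_id`.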
We already talked about consistency, but again, consistency is really important. You want to have unified naming conventions and field names. Please do not have device type named three different ways in three different data sources. Tools like dbt can definitely help you with that. So again, this is really important.
And so [00:30:00] the next step is creating lookup and reference tables. So this is also very common if customers want to enrich the events data with additional context; this is most common with a B2B SaaS platform. So let’s say that they have a user table and they have a company table with company information.
So for example, you may want to have a company table with company IDs, company names, and subscription type that you want to use to enrich the events. So that’s where these lookup or reference tables come in. A few other things: always have your data tracked in UTC. I have seen customers where the underlying data is actually not in UTC, and that causes confusion and issues.
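The lookup-table enrichment described above amounts to a join from events to users to companies. The following sketch uses in-memory dictionaries as stand-ins for the user and company tables; the table contents, key names, and the `enrich_event` helper are hypothetical.

```python
# Hypothetical lookup tables, keyed by ID.
users = {"u1": {"company_id": "c1"}}
companies = {"c1": {"company_name": "Acme", "subscription_type": "annual"}}


def enrich_event(event, users, companies):
    """Add company context to an event via the user and company lookup tables."""
    user = users.get(event["user_id"], {})
    company_id = user.get("company_id")
    company = companies.get(company_id, {})
    return {
        **event,
        "company_id": company_id,
        "company_name": company.get("company_name"),
        "subscription_type": company.get("subscription_type"),
    }


enriched = enrich_event({"event_name": "report_exported", "user_id": "u1"}, users, companies)
unknown = enrich_event({"event_name": "report_exported", "user_id": "u9"}, users, companies)
```

Events from users that are missing from the lookup tables keep `None` for the company fields rather than being dropped, which mirrors a left join and makes gaps in the reference data visible.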
So make sure everything is in UTC so that you never have to worry about which time zone a timestamp is in. And then also be purposeful about which timestamp you use, because once you get up and running, it’s pretty difficult to change. So we recommend using the server-side [00:31:00] timestamp versus the client-side timestamp, since the client-side timestamp can be manipulated by the user.
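For the UTC advice above, a minimal Python sketch of normalizing timestamps at ingestion might look like this. The function name and the sample timestamp are hypothetical; the sketch assumes timestamps arrive as ISO 8601 strings with an explicit offset, and it rejects values with no offset rather than guessing a time zone.

```python
from datetime import datetime, timezone


def to_utc(timestamp: str) -> datetime:
    """Parse an ISO 8601 timestamp with an explicit offset and convert it to UTC."""
    parsed = datetime.fromisoformat(timestamp)
    if parsed.tzinfo is None:
        # Without an offset we cannot know what time zone this is in.
        raise ValueError(f"timestamp has no UTC offset: {timestamp!r}")
    return parsed.astimezone(timezone.utc)


# 09:30 at UTC-05:00 is 14:30 in UTC.
converted = to_utc("2021-12-06T09:30:00-05:00")
```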
But they both have their pros and cons. The server-side timestamp can cause issues if the event data is batched, because you want the timestamps to be in sequential order. So the server-side timestamp can sometimes cause issues, but the pros outweigh the cons when compared to the client-side timestamp, because I’ve seen customers that use the client-side timestamp and they have some data from a hundred years ago, before the internet existed, or a hundred years in the future, which obviously is not real, and the timestamp was manipulated.
So some companies also derive a timestamp intelligently. Snowplow is an event collection platform, and what they do is pretty cool. They have a derived timestamp, which uses the client-side and server-side timestamps to intelligently fix wonky [00:32:00] timestamps. So some customers do use that, and I think it is a pretty cool fix for having more accurate timestamps, given the pros and cons of using just the server-side or just the client-side timestamp.
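One way to combine client-side and server-side timestamps, in the spirit of the derived timestamp mentioned above, is sketched below in Python. The talk does not give the formula Snowplow uses, so the calculation here is an assumption, and the function and parameter names are hypothetical. The idea is to trust the server clock for the absolute time and to use the client clock only to measure how long the event waited on the device before being sent.

```python
from datetime import datetime, timezone


def derive_timestamp(
    server_received: datetime,
    client_created: datetime,
    client_sent: datetime,
) -> datetime:
    """Estimate when the event happened, expressed on the server's clock.

    client_sent - client_created is how long the event sat on the device
    (for example, waiting in a batch). Both readings come from the same
    device clock, so a wrong device clock cancels out in the difference.
    Network transit time is ignored in this sketch.
    """
    time_on_device = client_sent - client_created
    return server_received - time_on_device


# A device whose clock is a hundred years off, with an event batched for 30 seconds.
derived = derive_timestamp(
    server_received=datetime(2021, 12, 6, 15, 0, 31, tzinfo=timezone.utc),
    client_created=datetime(1921, 12, 6, 10, 0, 0, tzinfo=timezone.utc),
    client_sent=datetime(1921, 12, 6, 10, 0, 30, tzinfo=timezone.utc),
)
```

Here the derived value lands 30 seconds before the server received the event, even though the raw client timestamps are a century out of date.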
So for the next step, we want to develop documentation. After you’ve really taken into account the best practices we’ve discussed and you have your data all beautiful and ready, we’re not done yet. You need documentation of the data. Documentation serves as a historical guide and an educational resource that makes your data easier to access, interpret, and ultimately use.
So you’ll want to document, for example, who is the right resource to answer questions about the data. You really need an owner that is an expert in the data model who can field questions from the rest of the team. And this may be a different resource from the owner who actually set up the data [00:33:00] or who has the data access.
But most importantly, you need a data dictionary. So this helps end users understand what the data represent. For example, does confirm subscription represent when the user clicks the button to confirm the subscription, or is it a backend event that represents when the subscription was successfully processed and confirmed by the server?
So this is where the definitions become really important, so that, again, the end user understands what the data really represents. And here you can see a screenshot of Indicative’s data dictionary. Typically the data dictionary begins either in a Google Sheet or in some platform; there are platforms that account for this.
And then it is transferred over to Indicative, so they have it right there, easy for the users to access, search the data, [00:34:00] explore, and really understand what they’re looking for. And then you’ll also notice that the events are categorized. It also really helps to categorize the data, typically by the different parts of the platform.
If you have a more complex platform, categorizing the data based on what area, tool, or section it belongs to, versus, say, marketing events or support events, just helps to organize the data, again with the business user in mind. So then we are ready to start analyzing in a product analytics platform.
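A data dictionary of the kind described above can start very small. The following Python sketch shows one possible shape: each entry has a definition, a category, and an owner, and users can search or browse by category. The class, field names, event names, and owners are all hypothetical, and this is not how Indicative stores its dictionary.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class DictionaryEntry:
    event_name: str
    definition: str
    category: str
    owner: str  # who can answer questions about this event


# Made-up entries; note how the definitions separate the click from the backend event.
DATA_DICTIONARY = [
    DictionaryEntry(
        event_name="Confirm Subscription",
        definition="Backend event sent when the server has successfully processed the subscription.",
        category="Billing",
        owner="billing-data-owner",
    ),
    DictionaryEntry(
        event_name="Click Confirm Subscription",
        definition="Client event sent when the user clicks the confirm button.",
        category="Billing",
        owner="billing-data-owner",
    ),
    DictionaryEntry(
        event_name="Open Support Ticket",
        definition="Sent when a user submits a new support request.",
        category="Support",
        owner="support-data-owner",
    ),
]


def search(term: str, entries: list[DictionaryEntry] = DATA_DICTIONARY) -> list[DictionaryEntry]:
    """Case-insensitive search over event names and definitions."""
    needle = term.lower()
    return [
        entry for entry in entries
        if needle in entry.event_name.lower() or needle in entry.definition.lower()
    ]


def names_by_category(entries: list[DictionaryEntry] = DATA_DICTIONARY) -> dict[str, list[str]]:
    """Group event names by category so users can browse by area of the platform."""
    grouped: dict[str, list[str]] = {}
    for entry in entries:
        grouped.setdefault(entry.category, []).append(entry.event_name)
    return grouped
```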
So I’m going to show you what you would be able to do once you’ve defined your goals and use cases, identified the data you need, organized the data, and spent all of this time and effort to build the ideal, or at least a good, dataset. The last step is to drive value and get insights from your work, from this dataset.
And I’m going to go over to Indicative just to [00:35:00] show you quickly the types of analysis that you’d be able to do. So here at the top, we have some simple time-series-based analysis that, transparently, is easy to replicate in SQL, but not easy for someone who doesn’t know SQL. So again, this is for the business user to be able to ask and answer their own questions without having to know SQL.
You can also create custom calculations, but what we will focus on is journeys. We’ll just build a quick journey. Journeys is an exploratory tool that allows you to visualize the most common journeys to a particular end point or from a particular starting point.
And so before we do this, let me just walk you through the other types of analysis that you would be able to do with a product analytics platform. So below we have the funnel analysis. You may be most familiar with a [00:36:00] traditional funnel, where a user did A, B, C, D. Indicative is unique in the sense that we understand that not every user has the same journey.
So we built our analysis tools to reflect that. And so here you can easily compare and contrast the two different journeys or funnels that a user may have taken to subscribe, and really, in this case, measure the impact that interacting with this feature has on subscription. Also, if you want to follow along, there’s a link in the chat so that you can access this dataset and see this exact thing.
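To show what a funnel comparison like the one above computes, here is a minimal Python sketch. It is not how Indicative implements funnels; the event names and sample data are made up, and the function simply counts how many users reach each step in order.

```python
from __future__ import annotations

from collections import defaultdict


def funnel_counts(events: list[tuple[str, int, str]], steps: list[str]) -> list[int]:
    """Count users who completed each funnel step, in order.

    Each event is (user_id, timestamp, event_name). A user reaches step i
    only after reaching steps 0..i-1 earlier in their own timeline.
    """
    timeline = defaultdict(list)
    for user_id, timestamp, name in events:
        timeline[user_id].append((timestamp, name))

    counts = [0] * len(steps)
    for user_events in timeline.values():
        reached = 0
        for _, name in sorted(user_events):
            if reached < len(steps) and name == steps[reached]:
                counts[reached] += 1
                reached += 1
    return counts


# Hypothetical event data with two routes to "Subscribe".
events = [
    ("u1", 1, "Site Visit"), ("u1", 2, "Connect Petcam"), ("u1", 3, "Subscribe"),
    ("u2", 1, "Site Visit"), ("u2", 2, "Subscribe"),
    ("u3", 1, "Site Visit"), ("u3", 2, "Connect Petcam"),
]

via_feature = funnel_counts(events, ["Site Visit", "Connect Petcam", "Subscribe"])
any_route = funnel_counts(events, ["Site Visit", "Subscribe"])
```

Comparing the last entry of each list against the first gives the conversion rate for each route, which is the kind of side-by-side view the funnel tool presents.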
And so lastly, we have the cohort analysis. Cohort allows you to measure retention, activation, and engagement, and really understand how often users are coming back and when they are coming back. So for example, the way we actually got the metrics for good data versus bad data is by using the cohort tool and really comparing those retention [00:37:00] rates based on companies that had good versus bad data.
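As a rough picture of what a retention measurement involves, here is a small Python sketch of day-N retention. The definition used (a user counts as retained if they are active exactly N days after their first recorded activity) is one common convention and an assumption here, not necessarily the one the cohort tool uses; the sample data is made up.

```python
from __future__ import annotations

from datetime import date, timedelta


def day_n_retention(events: list[tuple[str, date]], n: int) -> float:
    """Share of users who are active exactly n days after their first activity."""
    active_days: dict[str, set[date]] = {}
    for user_id, day in events:
        active_days.setdefault(user_id, set()).add(day)
    if not active_days:
        return 0.0
    retained = sum(
        1 for days in active_days.values()
        if min(days) + timedelta(days=n) in days
    )
    return retained / len(active_days)


# Hypothetical activity log: (user_id, day the user was active).
events = [
    ("u1", date(2021, 12, 1)), ("u1", date(2021, 12, 8)),
    ("u2", date(2021, 12, 1)),
    ("u3", date(2021, 12, 2)), ("u3", date(2021, 12, 9)),
    ("u4", date(2021, 12, 3)), ("u4", date(2021, 12, 5)),
]

day_7 = day_n_retention(events, 7)
```

Running the same calculation separately for two groups of users is the basic shape of a cohort comparison.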
And so now we’re going to go over to the journeys tool, and I’m just going to build it from scratch so that you can see how little work you have to do. We want to understand what the journey is to a particular action, in this case to purchasing a product. And here, we can see all of the events that we beautifully prepared for the product analytics platform.
We’re going to search for purchasing a product. And then here we have it under “ending with,” which is the one we want. We also have “starting with” journeys. We can run this. And so all I had to do was click on an event and the tool did literally everything else for me. And so here we can see that there were 132,000 users that purchased a product, or journeys that ended with purchasing a [00:38:00] product, over the last seven days.
And we can start to look back to explore what the most common actions or paths are that led people to purchasing the product. And so here we can see that 68% of the users had a site visit prior to purchasing the product. And then 29% connected the petcam immediately prior to purchasing a product.
So this is telling me I want to first identify what pages they are visiting. And it’s also really interesting to me that people are engaging with this feature. Maybe this feature really leads people to purchasing a product because they want more. And I can also see that this petcam shows up in multiple of the common paths that are leading people to purchase a product.
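The "ending with" journey shown above boils down to looking at what users did right before a target event. Here is a minimal Python sketch of that idea; it is not Indicative's implementation, the event names and data are made up, and it only looks one step back.

```python
from __future__ import annotations

from collections import Counter, defaultdict


def preceding_event_shares(events: list[tuple[str, int, str]], target: str) -> dict[str, float]:
    """For journeys ending in `target`, return the share of each immediately preceding event.

    Each event is (user_id, timestamp, event_name). Occurrences of the target
    with nothing before them in the user's timeline are not counted.
    """
    timeline = defaultdict(list)
    for user_id, timestamp, name in events:
        timeline[user_id].append((timestamp, name))

    preceding = Counter()
    for user_events in timeline.values():
        ordered = [name for _, name in sorted(user_events)]
        for previous, current in zip(ordered, ordered[1:]):
            if current == target:
                preceding[previous] += 1

    total = sum(preceding.values())
    if total == 0:
        return {}
    return {name: count / total for name, count in preceding.items()}


# Hypothetical event data.
events = [
    ("u1", 1, "Site Visit"), ("u1", 2, "Purchase Product"),
    ("u2", 1, "Site Visit"), ("u2", 2, "Connect Petcam"), ("u2", 3, "Purchase Product"),
    ("u3", 1, "Site Visit"), ("u3", 2, "Purchase Product"),
    ("u4", 1, "Connect Petcam"), ("u4", 2, "Site Visit"), ("u4", 3, "Purchase Product"),
]

shares = preceding_event_shares(events, "Purchase Product")
```

Extending the look-back to two or three steps gives the branching paths that the journeys view visualizes.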
So there’s a lot of flexibility. Again, without having to do any SQL, you can expand an event. For example, you may want to break out [00:39:00] site visit by the page path or the category or the page name to really see what those are. And/or you can exclude events.
Any of this can be saved to a dashboard and, again, done by a business user. And all of the analysis tools use a query builder like this that doesn’t require SQL and, again, makes it easy for them to ask and answer their own questions. So I just wanted to show you quickly the type of insights that you would be able to derive after you have your beautiful dataset.
Esmeralda Martinez: And then the last thing I will leave you with is some key takeaways. So first, you want to dedicate the resources upfront. It will really pay off when you have your business users, your new users, getting ramped up quickly and starting to use product analytics. The number two takeaway is you want to set your goals and use cases [00:40:00] at the start.
So you want your internal stakeholders to be aligned on what we are trying to achieve through product analytics, so that everyone can ask and answer their own questions from the platform and everyone’s on the same page about what the goals are. And then third, we want to form a single, central source of truth and keep the data modeling simple so that business users are able to use it.
So that is all I’ve got. I think I’m going to stay on and answer questions, but for now I will pass it back to Carly.
Last modified on: Apr 19, 2022