Sagar Velagala was the first analytics hire at Lola.com, a fast-growing startup with ~$80M in funding and ~100 employees. If Sagar were an average analyst working in Excel, this role would be a nightmare: there's just no way one analyst could meet the data needs of the entire organization. And if this story had been playing out five years ago, before modern data tools gave data analysts so much leverage, things wouldn't have been much better. As it is, he's nine months into his role, successfully supporting the entire organization, with no current plans to hire more people onto the data team.
"The traditional analyst workflow (downloading data from multiple SaaS tools, applying business logic in Excel, the monthly reporting treadmill) doesn't scale," Sagar says. "In organizations using traditional analyst workflows, the number of analysts always grows in proportion to company headcount." He wasn't interested in laying the foundation for an analytics function that scaled with more people; he wanted to build an analytics function that scaled with code.
Sagar was a dbt user before joining Lola and already believed that analysts should work like software engineers. He says, "This workflow allows me to use technology and code to accelerate my productivity. As an analytics organization of one, I'm able to support the analytics needs of a growing company." Sounds great in theory, but the reality is that when an analyst joins a company, people expect data, insights, and reports today. He prepared for this.
At the start of his tenure at Lola, Sagar laid out his analytics strategy, a 10-page document that described his plan to make Lola a "lean, fact-driven organization." His plan described an approach that started with enabling business users to self-serve in the near term (meeting the data needs of the organization today) while he implemented an analytics engineering workflow that would scale to meet future data needs.
It's common for companies using dbt to have analytics teams significantly smaller than those at companies still using traditional workflows, but setting up an analytics engineering workflow takes time. Pulling this feat off as a one-person team is a notable accomplishment. Sagar's approach is valuable for any data analyst who is ready to get off the analytics treadmill.
Priority #1: Enable business users to understand what is happening in their department today
Sagar took inspiration from Carl Anderson's book, Creating a Data-Driven Organization, which outlines the six types of questions that can be answered using analytics, from descriptive ones like "What happened in the past?" up through forward-looking insights.
His top priority: "Help business users accurately answer 'What is happening now?' and have directionally correct answers on 'What happened in the past?'" Questions like these can be answered well enough using the reporting that comes standard with the SaaS products that power your business (Salesforce, Intercom, Shopify), if those reports are set up properly.
At this stage, Sagar simply wanted to ensure that business users had good data that they could use today to make better decisions about their work. This doesn't involve complex analysis, but it's a foundational set of information that buys an analyst time to implement a more scalable analytics process. However, there are long-term benefits to this work as well: when the time comes to begin consolidating data into a central warehouse, that data is going to be of higher quality. If a report isn't right in a front-end tool, it's usually because a field isn't being tracked in quite the right way; better to catch that now than when you're building your first models inside your data warehouse.
As part of this priority, Sagar was also looking to plug any gaps: where weren't they collecting data that they should be? He identified event tracking and bug tracking as two gaps for the team. Implementing the right tools and tracking early on meant that, even if there wasn't a business need for insights from those datasets today, the infrastructure would be in place to answer these questions tomorrow.
Priority #2: Enable business users to understand what is happening across departments using a single source of truth
To reach the "insights" level of Anderson's six types of analytics questions, business users need to be able to look outside of their department. "Teams are able to execute their core functions using specialized front-end tools," Sagar says. "But they can't easily operate cross-functionally without combining different data sources."
- If customers churn after a bad support experience, you won't know that unless you combine revenue data with data from your help desk (see the SQL sketch after this list).
- If the close rate is improving for sales, the answer might be that your sales training worked or that marketing is running a particularly effective campaign.
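As an illustration of the first question, here is roughly what the churn-and-support query might look like once revenue and help desk data live in the same warehouse. This is a sketch; all table and column names are hypothetical, not Lola's actual schema:

```sql
-- Hypothetical example: did churned customers have a bad support
-- experience in their final 90 days?
select
    churned.customer_id,
    churned.churn_date,
    count(tickets.ticket_id)        as tickets_last_90_days,
    avg(tickets.satisfaction_score) as avg_satisfaction
from analytics.churned_customers as churned
left join analytics.support_tickets as tickets
    on tickets.customer_id = churned.customer_id
    and tickets.created_at
        between dateadd('day', -90, churned.churn_date) and churned.churn_date
group by 1, 2
```

No single front-end tool can answer this, because the revenue data and the ticket data live in different systems until someone combines them.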
Traditionally, analysts have met the need for these cross-functional insights by living in a world of Excel spreadsheets. Sagar outlines that workflow as the one described earlier: download data from each SaaS tool, apply business logic by hand in Excel, and repeat the whole exercise every reporting cycle.
Today, analysts can automate the vast majority of this process. Sagar calls this modern workflow "working in code".
He explains, "By writing analysis in code, I'm no longer a chokepoint in a recurring business process. Instead of spending 80% of my time cleaning data, I spend my time building tools that enable business users to do it themselves, and generating real insights that can help scale the business." So instead of churning out monthly Excel reports, Sagar's job is now to:
- Extract data from front-end tools and load it into Snowflake using an automated data pipeline tool. He chose Stitch.
- Transform data using dbt to write SQL transformations within the data warehouse.
- Deliver clean, tested, and ready-to-use data to Looker for analysis.
- Analyze data and provide insights and recommendations across the organization.

Nine months into his role, Sagar has most of Lola's core business data migrated to this workflow.
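As a sketch of what the transform step can look like in practice, here is a minimal dbt staging model over a Stitch-loaded Salesforce table. The source, model, and column names are illustrative assumptions, not Lola's actual project:

```sql
-- models/staging/stg_salesforce__opportunities.sql (hypothetical)
-- Clean and rename raw Salesforce data once, in code, so every
-- downstream report pulls from the same tidy table.
with source as (
    select * from {{ source('salesforce', 'opportunity') }}
),

renamed as (
    select
        id                      as opportunity_id,
        account_id,
        lower(stage_name)       as stage,
        amount::number(12, 2)   as amount,
        close_date::date        as close_date
    from source
)

select * from renamed
```

Once a model like this exists, nobody has to re-clean the raw export by hand for each monthly report; the cleaning happens automatically on every run.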
Priority #3: Empower business users to do their own data exploration
In Sagar's strategy doc he wrote: "Eventually, the organization will need to explore data and deliver insights at a rate that isn't possible if all queries must filter through a centralized analytics team. At this point, there are two options:
- Invest in a large, decentralized analytics team to support growing business needs
- **Enable business users to explore data sets and generate insights independently**"
The emphasis on point #2 is Sagar's. He went on to write: "Option #2 is significantly more efficient, and achievable by leveraging a modern tech stack. Looker is currently the best business intelligence platform on the market that enables analysts to build data tools for business users, and enable them to support themselves."
In Sagar's view, it makes sense for analysts to build some reporting for business users, but long term, "Our goal is to have each vice president be able to do their own data work. You need to be able to own your data." He points to Airbnb as an inspiration for this approach: "On some teams, 80% of users regularly use data tools, and over 50% regularly query in SQL. They hired the right people and built an amazing training program."
This is certainly an achievable goal if the data that end users are given access to is high quality, which Sagar defines as "accurate, consistent, defined, and complete." In other words, someone needs to do the preparatory work to get data into a state where even a non-analyst can make sense of it. Sagar uses dbt to deliver high-quality data in two ways:
- Code, not manual processes, keeps data-cleansing and business-logic transformations consistent.
- Data tests catch issues before they impact the end-user.
"I've never felt more confident in my underlying data," Sagar says. "As an analyst, this is a huge relief."
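To make the first point concrete, here is a minimal sketch of business logic encoded once in a dbt model rather than re-derived in each spreadsheet. The model name, column names, and the 30-day threshold are all assumptions for illustration, not Sagar's actual definitions:

```sql
-- models/marts/customer_health.sql (hypothetical)
-- One shared definition of "at risk" that every report inherits,
-- instead of each spreadsheet reinventing the rule.
select
    customer_id,
    case
        when last_active_at < dateadd('day', -30, current_date) then 'at_risk'
        else 'healthy'
    end as health_status
from {{ ref('stg_product__usage') }}
```

Because the definition lives in one tested model, a "this doesn't match that" discrepancy can only come from the source data, not from two analysts applying the logic differently.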
Priority #4: Plan for scale
Even with best-in-class technology and code-based processes, the analytics needs of the organization will eventually grow to a point where "it becomes important to optimize for cost and efficiency." When Lola reaches that stage, Sagar says, that's when it makes sense to hire data engineers, data scientists, and other business intelligence professionals. "These team members can create custom data tools for internal use, optimize queries and database structure, and create complex statistical models that can provide insight into the future."
Sagar points out that this is the stage when efficiency becomes important as well. "Some tools that worked at an earlier stage may become unreasonably costly, with better alternatives for specific use cases." The example he uses here is event tracking. Snowplow Analytics's open source tool allows companies to track any event across the entire funnel, from marketing web traffic to product usage, but the trade-off is a more challenging implementation and more maintenance. It may make sense to choose a more expensive, all-in-one tool to start doing event tracking today, knowing that in the long term you'll adopt something more customizable and affordable.
A successful analytics team of one
It took nine months for Sagar to reach this point, and today he's moving quickly by doing a few things exceptionally well:
Develop a good process for managing analytics requests
"I have about 30 analytics requests in my backlog right now," he says, but he doesn't seem too stressed. All business leaders can submit analytics requests via an #insights channel in Slack. Requests are then prioritized based on how closely they align with business goals. "Right now, we're focused on customer acquisition, so analytics requests related to that goal rise to the top."
Maintain a tidy dbt project
"Most of what I'm building today is brand new," Sagar says. "As you introduce new data sets, KPIs, and analysts, the potential for duplication of metrics and models increases. You'll start getting more of the 'this doesn't match that' type of questions." Maintaining a well-organized dbt project helps him be a more effective analytics team of one while also preparing for future scale. A few tips:
- Staging and mart models: Sagar follows the convention of using staging models for data cleaning, and marts for storing the business logic of a given business function. "My staging models are where I execute minor transformations like data casting, timestamp conversions, putting everything into lower case, etc.," he says.
- Exclusion lists: Sagar uses exclusion lists at the staging layer to clean out things like deleted deal records and test emails. This reduces the volume of data he's working with in his data marts.
- Ephemeral models: To reduce model "clutter" in the database, Sagar leverages ephemeral models and dbt's capability to specify a schema on a model or folder level. "I place all intermediary models in a PROD_TEMP schema instead of 'PROD'," he explains.
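A minimal sketch of that last tip, assuming hypothetical project and folder names rather than Sagar's actual configuration:

```yaml
# dbt_project.yml (hypothetical)
models:
  lola_analytics:
    marts:
      +schema: prod        # finished, analyst-facing models
    intermediate:
      +schema: prod_temp   # materialized intermediaries stay out of PROD
```

Intermediary models that don't need to exist in the warehouse at all can instead set `{{ config(materialized='ephemeral') }}`, which makes dbt inline them as CTEs in downstream models rather than building them as tables or views.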
Test data
Today Sagar uses tests on all of his staging models to catch things like duplicate keys. This one simple test is enough for him to catch most problems (whether that's an error in the data ingestion process or a change in one of Lola's systems of record) before they impact the end user. Next on his list is to implement more complex testing that spots more subtle anomalies in the data that the business should address.
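In dbt, that kind of duplicate-key check is a few lines in a schema file. A minimal sketch, with model and column names assumed for illustration:

```yaml
# models/staging/stg_salesforce.yml (hypothetical)
version: 2

models:
  - name: stg_salesforce__opportunities
    columns:
      - name: opportunity_id
        tests:
          - unique      # catches duplicate rows from ingestion errors
          - not_null    # catches keys dropped by an upstream change
```

Running `dbt test` then fails loudly if a key is duplicated or missing, so the problem surfaces before anyone builds a report on bad data.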
Train decision-makers to self-serve
Clean data and efficient workflows don't mean much if decision-makers don't know how to turn the data that comes out into actions. Sagar is building out new self-service training modules on a regular basis, with topics ranging from "How to Use Looker" to "Interpreting SaaS Unit Economics".
Huge thanks to Sagar for sharing his thinking with the dbt community! If you know someone in the community who has done a remarkable job documenting a process, putting their thinking into a clever framework, or has helped you get better at analytics engineering, message me on dbt Slack. We would love to feature them.
⚡️ Ready to improve your analytics engineering workflow? Get started with dbt today. ⚡️