Coalesce After Party with Catalog & Cocktails
Catalog & Cocktails is bringing its unique brand of insight, humor, and conversation to Coalesce. The weekly podcast is an honest, no-BS, and non-salesy conversation about enterprise data management and analytics with a happy-hour vibe.
On Tuesday, December 7 at 4:00 pm CT your hosts, Juan Sequeda and Tim Gasper of data.world, will broadcast live from the Coalesce event platform joined by special guests from the community. We’re also cooking up a delicious dbt-themed cocktail. Please join us as we toast to an awesome event and glimpse into the future of analytics engineering.
Browse this talk’s Slack archives
The day-of-talk conversation is archived here in dbt Community Slack.
Not a member of the dbt Community yet? You can join here to view the Coalesce chat archives.
Full transcript
[00:00:00] Julia Schottenstein: Welcome everyone to the dbt Labs Coalesce after party. Tim Gasper and Juan Sequeda of data.world will be hosting a live episode of their podcast, Catalog & Cocktails. I’m Julia Schottenstein and I’m part of the product team at dbt Labs. And I also co-host the dbt Labs Analytics Engineering Podcast every other week with our CEO, Tristan Handy.
For our podcast listeners out there, we’re currently live at Coalesce, which is dbt Labs’ annual community conference. So it’s pretty fun to record in front of such a large audience. We have over 14,000 attendees this year. I’ve listened to a few episodes of Catalog & Cocktails before,
so I came prepared with my cocktail. Should be a fun episode. And if folks want to chime in with questions, the conversation is happening in the dbt Slack channel #coalesce-catalog-cocktails. Okay. Now from me over to you, Tim and Juan, to kick off the episode. Cheers. [00:01:00]
[00:01:01] Tim Gasper: Hello everyone. Welcome. Thank you so much, Julia. And to all the folks over at dbt Core. This is Catalog & Cocktails. It’s a weekly live hangout. And today it’s a very special episode. It’s an honest, no-BS, dbt Coalesce conversation about enterprise data management with tasty beverages in hand. I’m Tim Gasper, a longtime data nerd and product guy, joined by Juan Sequeda.
[00:01:27] Juan Sequeda: Tim, thank you so much. I’m Juan Sequeda. I’m the principal scientist at data.world, and it is a pleasure. It’s an honor. I am so excited about this because this is the first time we’re doing something like an after party for this awesome conference. I look forward to many more of these, and today we have an awesome, awesome lineup. We have two of our good friends that are part of the Austin community, because we’re an Austin company. They are folks who have been inside the data space for so long and they live and breathe data. I’m talking about Meetesh [00:02:00] Karia, who is a CDO, the CTO of The Zebra.
He has had so many past CTO experiences. And also Claire Look from The Zebra, who started out as the data product manager at The Zebra, has gone into so many leadership roles, and now is the VP of data. And they are both not just fans of dbt, they are customers of dbt and also customers of data.world. So we’ve got a lot here to go talk about. Glad you guys are here.
How’s everybody doing?
[00:02:26] Claire Look: Great to be here.
[00:02:28] Juan Sequeda: Awesome.
So, following our Catalog & Cocktails approach, we’ll do our tell and toast. What are we drinking? And what are we toasting to? Claire?
[00:02:38] Claire Look: How about. All right. I’m drinking our star schema mocktail, so I appreciate everyone getting the mocktail recipe for me this week.
And I’m going to toast to... it is my boss’s birthday. Meetesh, cheers to you. We get to celebrate his birthday with him today at this after party. So [00:03:00] that is what I’m toasting to today.
[00:03:03] Meetesh Karia: Thank you, Claire. And I’m drinking the regular version of the star schema cocktail, which tastes fantastic, like a spiced apple cider.
And I guess I’m going to toast to being able to celebrate with you all for my birthday and chat about one of my favorite things, which...
[00:03:23] Juan Sequeda: How about you, Tim?
[00:03:25] Tim Gasper: I will also cheers to your birthday, Meetesh. So glad that you could join us on your special day. And cheers to the whole dbt community.
It’s just exciting to be here, a part of such a great conference and a great community. So really appreciate it. And I agree. This does not taste like alcohol. This is tricky.
[00:03:42] Meetesh Karia: Scary.
[00:03:43] Juan Sequeda: I’m actually at a hotel right now. So I went to the bar and I gave the bartender the recipe. It was like, please make something.
And yeah, he did a great job. This tastes like a spicy apple cider. So cheers and happy birthday, Meetesh. Just love this, that we’re all here together.
[00:03:57] Meetesh Karia: Cheers.
[00:03:58] Tim Gasper: And cheers to everyone here who’s joining us. [00:04:00]
[00:04:00] Juan Sequeda: Yeah. So I think we had a little thread going on in the chat. Go tell us what you’re drinking today and share pictures or whatever.
We always have the warmup. We have a funny question of the day to kick this off. So what have you transformed, yourself or someone else, into something else, for a costume or a product or something? What have you transformed? Who wants to go first?
[00:04:26] Meetesh Karia: I’ll go. I’ll say I’ve transformed my kids. It’s scary to look at them and see how big they’ve gotten already.
And my older one is about to start high school next year. So it’s crazy how many parallels there are between raising kids and leading teams.
[00:04:46] Juan Sequeda: I love that. How about you, Claire?
[00:04:48] Claire Look: I’m going to go relevant to dbt and this conference and say at The Zebra, we transformed our technical analysts into analytics engineers.
So that is something [00:05:00] we... it’s great to be here, because we wouldn’t have really discovered that title and done all of that without really digging into the dbt community and seeing how that grew. So that’s definitely something we’ve transformed: that role.
[00:05:15] Julia Schottenstein: Tim? What have you transformed?
[00:05:18] Tim Gasper: Oh, my goodness. Obviously at data.world, we’re dbt users as well. And so we’ve transformed some folks into analytics engineers, but I will say that I’ve transformed my morning routine. One of the only sort of positive things that have come out of COVID here is that I actually get up at a consistent time and I go for a run most mornings.
So I’m excited about that. That’s a good thing.
[00:05:39] Juan Sequeda: I’ll take this one. I’ve transformed research, pushed it into a real product and a company, and sold it. So I think that’s something I’m pretty proud of, and I’m really excited about bridging these worlds of academia and the real world. All right.
Cheers to transformations. Okay. So here’s the deal: we’re doing this live and we’re watching on the [00:06:00] Slack. We’ve prepared some of the topics we want to go chat about, and we’ve mentioned them before. We’re talking about data teams, data cultures, data mesh, data catalogs, but if there’s any other topic, just put it into the chat, into the Slack, where we’re watching.
Also, we always have this segment of our lightning round questions. So if you have any questions, preferably with yes or no answers, put them in. We’ll be watching them and then we’ll be throwing these out to Meetesh and Claire towards the end of the broadcast here. So.
[00:06:33] Tim Gasper: We’re literally watching the channel right now. So please post your questions. There are no dumb questions, and we will be integrating them live into the show and also into the lightning round. Please post your questions.
[00:06:44] Juan Sequeda: All right. So let’s kick it off. Honest, no BS. So why is dbt such a big deal? The first time I looked at dbt, I was like, I can’t believe this didn’t exist before. And now we’re seeing it, [00:07:00] but why didn’t it exist before?
And why now? Like, why is this such a big deal and how big of a deal is it? Claire?
[00:07:09] Claire Look: I would say I think back to some of my analyst roles that I would consider analytics engineering. What we were doing in terms of Tableau extracts was our way of transforming raw data and making it accessible in reports, or leveraging Spark or leveraging Hive.
And we were transforming some of this raw data, but it was all in different areas and there wasn’t that consistency. And so it’s just amazing that someone has packaged that up and thought, hey, this is something we’re all doing over and over in different ways, depending on the scale or size of our team and data, and said, this is a product, this is a common problem that we are seeing across all these different teams.
And to say, hey, the SQL translation layer is a [00:08:00] product, and to make that accessible for large groups, has been very transformative, for sure.
[00:08:09] Meetesh Karia: Yeah. I’d add on there. I remember building things using Talend and SQL scripts and cron jobs and all sorts of things where you get multi-thousand-line SQL files with no way to test, no way to reuse, no way to share.
And we’ve taken a lot of what has become standard engineering practice, and dbt has brought it to data and data transformation.
[00:08:37] Juan Sequeda: This is one of the things I’m so excited about with dbt, just having these transforms as first-class citizens. Before, it was a thing like, oh, you leave it behind, it’s just a part of the process. But it’s not just a part, it’s a core element of everything you’re doing with the data. And I think that’s one of the big transformations that I love about this. [00:09:00] And one of the cool things about having both of you here in this conversation is that, Meetesh, you’ve had this true executive role, and you can see how the technology has transformed companies from the executive point of view. And Claire, you’ve really been rolling up your sleeves and doing all the analytics work, but also managing the teams to go do this.
How has this evolution been in the past couple of years, with old companies and new companies coming around and thinking about transformations as first-class citizens and building up these tools? How have you seen this evolution?
[00:09:30] Meetesh Karia: I can start with early on and then hand it off. When we first started our analytics and BI and data journey at The Zebra, we were looking at tools like RJMetrics: warehouse, point-and-click transformation, very limited things that we could get going and running on the side. And even when we then moved to Looker and LookML, that was a big step up in terms of modeling, but it quickly gets hairy. It quickly [00:10:00] gets to the point where it’s in source control, but you can’t test it. It’s really hard to share. And I think that then we get to the power, and I can go over to Claire now, in terms of what we’ve been able to do since introducing dbt and going from there.
[00:10:20] Claire Look: Yeah, I think so. We’ve been pretty early adopters. I think we started using dbt in 2018. And yeah, we’ve been able to move the majority... all of our transformations over to dbt. And so that’s been something where it allows us to really establish the analyst and analytics engineering team as well, because they have something to work within and something to establish best practices across the team.
And so I think we’ve been able to really scale out that whole practice [00:11:00] and data modeling as a team, and be able to support a lot of different areas of our business by leveraging that, versus the point-and-click, which is good for different areas but not necessarily for the scale component.
[00:11:12] Tim Gasper: That’s interesting. And Claire, Meetesh, obviously you all have built a really powerful team over at The Zebra to work on a lot of these problems. And as you’ve adopted dbt, as you’ve expanded the use of that, using it in combination with a bunch of modern tools here, how have you been thinking about your data team and how you structure that data team? And how has the role of the analytics engineer had a really big impact on all of that?
[00:11:43] Claire Look: Yeah, sure. We have definitely grown rapidly. So I think when we first adopted dbt, we had one analytics engineer, one data scientist, one data engineer. We had one of each key role, one data product manager. So we were really like, [00:12:00] let’s figure out what we can do with the tools that we have. And now we have a 50-person data org.
So we really started with, I think, what is that foundation? But really, as we were growing, it was, okay, we’ve got to establish what our core datasets are. I think one of the first things we did was this project, data recap. If anyone on the team is listening, they’ll be like, oh gosh.
But that was really to say, okay, what do we have? What are we trying to build? What are the different ways people are looking at our data? Not saying it was successful, but saying it was something where we were like, let’s take these modern tools and this approach to our data and go build it as a product.
And then scale people accordingly with the increasing demand, because the data team wasn’t the only one that’s growing. Every other department is growing. So are the needs that are coming into the teams. So the ability to say, okay, we can build these core datasets, but we need to have multiple analytics [00:13:00] engineers be able to live in this code without creating too much chaos there.
[00:13:07] Meetesh Karia: Yeah, I think we grew to a point where the size of the code base and the size of our data and domain was larger than any one person could really keep in their head. And so onboarding is an issue. If you can modularize, if you can separate out domains, you can ramp people up to have them be more effective and more impactful sooner.
You get this issue where you get to a certain size and you change something in one place and something unintended breaks somewhere else, or vice versa. And by splitting out chunks, you get to limit the blast radius. And then really I think, and we’ll probably get to some of this as we talk later, it in my mind is the foundation for, and setting us up for, how we grow and support the [00:14:00] org as it continues to grow.
[00:14:02] Juan Sequeda: So let’s dive a little bit more into this. You just talked about how to scale, and in some conversations we’ve had before, Meetesh, I love your threes and tens, that things start changing with respect to threes and tens.
And Claire, what you just said is that at the beginning you were just one of each, and now you have a team of 50. The conversation I always have with a lot of people is this balance between centralization and decentralization, right? So you probably start with a centralized team and then you’re like, I can’t be a bottleneck.
So you start figuring out that balance. What is that process? I know we’ve talked about threes and tens is a good way of approaching that. I really would appreciate more of your insights about this, because this is a conversation we have all the time, centralization, decentralization.
[00:14:46] Meetesh Karia: I think this goes right into one of the hottest topics in the field right now, that of data mesh, right? Which is a possible solution or an approach to addressing this scale. And I look at it as, we hit [00:15:00] around 30, 40 people is when we started to see some of these issues, where we started to say, hey, we need to actually take a look.
We had centralized to get some consistency, to get shared hiring, to build up certain skill sets and processes. And now we get to the thirties, looking forward to growing the team, but also the entire company, and say, we need to set ourselves up so that the data team no longer owns every bit of data for the company.
Instead, the data team is providing the underlying platform, the tools, the processes, the governance that supports different domains owning their own data, and the data team pulling it all together. And so I think that’s the big transformation that we’re going to see between 50 and a hundred, or the stage of the company right now, that really speaks to that next step of not being [00:16:00] centralized, but also not being completely distributed.
[00:16:04] Tim Gasper: At what point in the growth of The Zebra did it become clear that you needed to change your approach a little bit, that you couldn’t just continue to have it be a 100% centralized approach, but that you needed to start to become more distributed? And what role do dbt and analytics engineers play in that move to distribution?
[00:16:28] Meetesh Karia: I guess I’ll take the first part and hand the second to Claire. I’d say it probably became clearer the beginning of this year or so, right around that 30-ish size of our team. Because all of last year was around centralizing, hiring up, getting the right team structures in place.
And then right around the beginning of this year is when we started to see this is getting really difficult to support the entire business from one central team.
[00:16:56] Tim Gasper: Yeah.
[00:16:57] Claire Look: I think it’s definitely... we hit a point in, [00:17:00] yes, support in general, like centralized support, and just the number of questions you’re getting, the number of things that you are now supporting across these teams, that it wasn’t until just recently we said, okay, we need foundational teams.
As most companies grow on the engineering side, same thing for data: we need our platform teams, our data platform, core reporting. Who are those people? But then we also can’t halt the business and say, all of a sudden, you cannot get any data to measure the effectiveness of your product or to measure agency.
In our case, we deal with insurance. So we still have analytics engineers supporting other areas who, with the benefit of being centralized, understand how to work within dbt. They understand where to work within dbt. And there’s some of that shared process that we’ve built out.
Centralized, but the actual domain they start to become aware of, and they’re able to work within [00:18:00] there, while we focus our centralized teams on pure foundational platform work, because we were doing both for too long. And that gets you in a tough situation where you’re trying to support everything.
[00:18:15] Tim Gasper: That makes sense. And just to rewind a little bit and go back to the teams aspect of all of this, we actually got an interesting question here in the channel coming from Julia. She mentioned: I liked, Claire, how you put it that your team transformed into analytics engineers. Have there been other technologies that have had a major impact on your team’s identity? Or what does it mean when a tool changes roles on your team?
[00:18:53] Meetesh Karia: Great question.
[00:18:54] Claire Look: Yeah. Like
[00:18:55] Tim Gasper: I know you all use Looker did that have a really big impact as well? Were there other tools that [00:19:00] kind of fit into that mold?
[00:19:02] Juan Sequeda: I was gonna say, I’m curious, and be as honest, no BS, as you can. You started talking about LookML and, oh, this was a great first step, but then you hit some barriers around it.
So maybe that helped for something, but you hit some barriers and then this dbt comes around. Like how did that change with the folks within your team? Yeah
[00:19:20] Claire Look: I did a talk at one of the Looker events that was about changing the next gen of analysts, because now your analysts are becoming more like analytics engineers. They’re able to be in the LookML, and they’re able to point at a field in an Explore and trace it all the way back to the source code.
And that was something I hadn’t seen in Tableau or anything else, where you could go straight from, I’m in a dashboard, and now I’m tracing it back to source code. And I was like, that changes the game for analysts, because they’re able to understand all the underpinnings that really are multiple [00:20:00] roles or steps in the process.
And so I definitely think Looker really advanced our analysts. And I think we want to get back to that, because right now the majority of the people in dbt are analytics engineers. So we have less of that collaboration where our analysts are actually making changes to the source code.
So there are a few of those growing pains as you switch from Looker to dbt that we’ve worked through. But I think Looker was one that did level up the technical skills of our analysts, for sure.
[00:20:39] Juan Sequeda: So I want to mention here that Gregor, on one of the threads, is making an interesting point, which is one I’ve always had too. He asks: is the role of the analytics engineer bounded, basically connected, to the dbt community?
And again, honest, no BS, right? You hear analytics engineer and you immediately know, [00:21:00] oh, that’s dbt, because dbt is pushing this and stuff, which is true. But let’s take that label away. I guess the label analytics engineer, let’s call it foo. That work that the analytics engineer, this foo, is doing right now,
that’s actually happening. There’s a need for that regardless of what label people are putting onto it.
[00:21:19] Meetesh Karia: Yeah. Absolutely. I think it’s the job of taking raw data and modeling it and transforming it into usable business data, and putting it together. And whether you call that an analytics engineer, or a data engineer, or whatever, call it foo, call it an analyst at some places.
It’s that job of really understanding what’s going to support what the business needs, and taking the raw data, putting it together, transforming it, and modeling it.
[00:21:50] Claire Look: Yeah. I want to share this podcast with my old HR manager, because I would have conversations with him around, [00:22:00] we have these analysts, not quite what I need.
We have these data engineers. This was in my previous company, because we were building out a universal catalog data set. And it was, I need people who can just take all this data and turn it into a universal product catalog. And it was that in-between skillset. And we didn’t know what to call it. I’m glad we got there at The Zebra, but it was one of those.
It was like, what do we recruit for? And we ended up turning some of our analysts into data engineers. There was no dbt involved in that, but the role was still the same. Yeah.
[00:22:35] Juan Sequeda: We wrote it down here, and I think this is something to go pin right now. The role of the analytics engineer doesn’t have to be tied to a technology like dbt. At the end of the day, it’s about transforming and modeling that raw, inscrutable, complicated, ugly, shitty enterprise data into data that is just beautiful, that the business understands, and that they can go run with to go make more value.
So, generate more [00:23:00] value. Personally, I’ve always called this more kind of the knowledge engineer, or I’ve pushed this term called the knowledge scientist too, but I think it’s all evolved. It’s somebody almost like a bridge right there in the middle. They know how to go talk to the business, understand their needs,
do modeling, put it on the whiteboard, do that type of stuff, figure out what is in the data, do the transformation stuff. And I think that’s something that is crucial that we just haven’t brought into organizations today. So I’m really happy about this role, call it analytics engineer, knowledge engineer, whatever. We just need more of that and more of that mindset.
And also I think about it as what comes after the data scientist, right? There’s been 10 years of this now. I think there’s something coming next. And I think this is a big one.
[00:23:45] Meetesh Karia: I think that the engineer part of this also really helps with understanding that this is an opportunity for testability.
It’s an opportunity for reusability. It’s an opportunity for modularization. A lot of these concepts that are really [00:24:00] core to engineering. And dbt has helped bring that there, but so has the term analytics engineer. What I think we want those people doing is that modeling.
But then also the engineering aspects of it.
[00:24:14] Tim Gasper: Yeah, it’s clear that there is an overarching theme here, which is that there are all these best practices around software, right? Whether it’s modularization, reuse, don’t repeat yourself, configuration as code, or continuous integration and deployment.
And obviously dbt is a big component that is very popular and that’s pushing a lot of this. But whether it’s your LookML, there’s a lot of BI-oriented players that are moving in this direction now. Obviously there’s your Python code and your notebooks and all that kind of stuff, right?
This can all fit this paradigm of analytics engineering, which seems exciting. It seems like an evolution of the field that we’re in.
[00:24:57] Meetesh Karia: Absolutely. Yeah.
[00:24:58] Juan Sequeda: So I want to [00:25:00] go jump onto another topic, about culture. And right now Julia just asked another great question, and I was thinking about this too.
We’re talking about scaling teams and stuff. And how do you go train all these folks within the culture you have, understanding the business terminology? And also, I’m just reading here: people always want the roles to evolve and to learn more skills, but often hiring can be skewed to look for someone who has done it before.
You were mentioning too that you were talking with your previous HR colleague. Like, what is that right balance?
[00:25:39] Claire Look: Yeah.
[00:25:39] Meetesh Karia: Go for it, Claire. Yeah.
Yeah. It’s challenging. I was gonna say, I guess there’s a couple parts of that. One is how do you build and preserve that culture, and the other is how do you find the people. I think that the [00:26:00] second part of that is perhaps easier, particularly for us at The Zebra, to answer, which is that we really believe in growing the people we have at the company.
And coming in as an analyst, learning engineering principles, learning a lot more about SQL, I think is a natural path to growing into roles without having to come in from the outside, but also bringing in and retaining and continuing that culture within the team. And then, yeah, training, onboarding.
It’s challenging and it has been really challenging, especially being remote the last almost two years through a pandemic. We haven’t nailed all of it, but I think back to a passage in the book Winning with Data that talks about what Facebook did around a data kind of [00:27:00] bootcamp when people come.
And one of my thoughts is that eventually we need to get to somewhere like that.
And that’s not just for the data team. It’s for the entire company, to help everyone understand how to think about data and how to build with data. But Claire, I don’t know if you have other thoughts on some of the things we’ve done, particularly around onboarding.
[00:27:22] Claire Look: Yeah. On the part of Julia’s question around how hiring can be skewed toward someone who’s done it before:
we saw that challenge a lot, especially within analytics engineering, just because it’s new and everyone knows they need it. And so it’s a competitive space. And one of the things where we’re like, okay, we’ve had analysts that have been successful growing into analytics engineer roles. So can we create entry-level analyst roles?
Where you can really understand the business context and you can start to understand our data and be around other [00:28:00] people who are working within dbt. So you’re looking at that, or understanding the business more by leveraging a data catalog, or there are some other areas where you can be learning and adding value, but you may not have the technical skills yet.
And is that an easier, less competitive role to get people in your team, where they start to understand the business and the data so they can start adding value, and then they can learn the technical skills? We’ve been successful moving some analysts, so we think, okay, that could be a good path for us.
In the future is that entry-level analyst role within that core reporting team.
[00:28:45] Tim Gasper: On this topic of culture, and thinking about how we manage the culture of our teams, obviously as you grow your company you have to manage how that culture will evolve over time. And then also tying it back, we started to talk a little bit [00:29:00] about data mesh. As you start to decentralize, that can have an interesting impact on culture, right?
And obviously you want the data people in the org to have a cohesive culture. If they’re not working with each other all the time, how do you do that? And Nikki actually asked the audience in the channel, does anybody have learnings they’d like to share from seeing orgs go from centralized to decentralized or vice versa? And then tying that to culture, maybe starting with you, Claire, do you have thoughts about how you want to establish a culture that goes across a decentralized approach, and your thoughts on culture with centralization and decentralization?
[00:29:38] Claire Look: Yeah. I’ve been answering this question a lot. So again, if any Zebra folks are listening, as we’re going through our own little centralized-and-decentralized shift, I’m really confident there are some things that are core to our team. Something new we just did was data divergence days, which was like our version of a quarterly hackathon.
And that’s something that [00:30:00] everyone should participate in, whether you’re supporting a core platform team or you’re embedded within another team that’s helping solve a specific business problem. There are things that I think we just need to continue to establish across data roles. Another one was a data architecture group, and I had nothing to do with that. It was formed by the team when they saw, oh, this is something we’re decentralizing.
And they said, this is something I see as a need. We need to be talking about these best practices. And that’s a completely team-formed group. They’ve recruited members from different roles and people not within data, and those are going to be more important, because of course you’re still going to have your happy hours and your hackathons and all of that to keep the teams together.
But I think there’s also got to be some of that kind of guild format where you have people across the company, and that helps with the shared ownership of data too, [00:31:00] because it’s not just, that’s the data org’s problem. It’s that we all contribute to the creation and use of data.
And I think that’s a good thing.
[00:31:12] Juan Sequeda: I always think about how we talk about the business, or the business domain, the business users, and then the data folks. Are data folks getting involved in the business domain, or is it the other way around? What is that balance that you’re seeing, and what do you actually suggest?
Because I think this is another thing: when we talk about data mesh, the different domains, and you push things under the domain, there’s a lot of blah-blah-blah. We’re talking about this in theory. And I talk to people who’ve actually done this, and there’s still a lot to figure out here.
So what have you guys actually done? What’s working. What’s not working.
[00:31:52] Meetesh Karia: Yeah.
[00:31:53] Julia Schottenstein: I
[00:31:53] Meetesh Karia: I think what we’ve seen working is when we have the data team folks working closer with various parts of [00:32:00] the business to understand, and getting ahead, starting from the very beginning of product development.
And understanding and helping define what questions we’re going to ask and what we need to answer them, helping inform the decisions we’re making around product development. We have folks on the marketing side now who have built a very close relationship with a lot of our marketing team, where they’re involved in the decisions, they know what’s happening, they can inform them, and they’re proactive.
And they almost kind of become part of the culture of two different teams. They get invited to events for both teams, and that’s where I see it really working. From the business user standpoint, I think it’s really about understanding what they need and what problems they’re trying to solve.
I think we can pull them into utilizing the data and empower and enable them. But I think it’s actually more impactful for us as a data org to go and get closer and understand really [00:33:00] what the business is trying to do. Yeah.
[00:33:06] Claire Look: Oh, I was going to say, I think one of the things that we tried that didn’t work, and I think we’ve always been adapting, is where the role of the data product manager fits within this, because you do need people who are closely tied to the business and really understand that business domain.
But you also have to be very explicit, then, about the role and what people are doing, because you could turn a data product manager into a bottleneck where they’re now having to play both the centralized and decentralized role. And so that’s another one where, as you scale the teams, you also have to scale. It doesn’t necessarily have to be a data product manager.
It could be an analyst who’s the subject area expert in that domain. Just being very explicit about what we actually need out of this role and what the [00:34:00] problems are that we’re seeing in this domain, and then adjusting accordingly, so that you’re not just putting people in different spots; you’re being clear.
Maybe we already understand the problems of this area. So we don’t necessarily need someone to define them. We just need like help partnering and, right. And really nailing down the requirements. And that might be an analyst. So just being clear on the roles and how to support their .
[00:34:25] Tim Gasper: Yeah. So subject matter expertise and also ownership is a very interesting topic. You mentioned a couple of things. You mentioned the analyst, and sometimes they’re playing that role, but you also mentioned this phrase, the data product manager, which obviously was a central topic in an episode that we did together, Claire.
And what role are you seeing data product managers starting to emerge in around this whole thing? Do you see them being more of a centralized component, or do you see them playing more of a varied role?
[00:34:59] Claire Look: Yeah, [00:35:00] right now I think they benefit more in the centralized function because they can really help tie across all these different use cases.
What is that core business domain knowledge that we need to provide? I think that’s where the most benefit has come out of that role. Then you have other analysts who are an input into that and are very familiar with the decentralized portion, but I think we’ve seen the biggest benefit from data product management
being centralized, because that’s really where you’re like, here are all my different user problems across all these different domains, and what do we need to build to support that? And that gives you more of the power of that product mindset, thinking through the different personas and use cases of the data.
[00:35:49] Juan Sequeda: This is really interesting because I was not expecting that answer. If you think about the whole data mesh, the decentralization and the domains and stuff, [00:36:00] the way it’s pictured is that every domain has its own data product. You take a data product, and when you look at all the diagrams, this product gets combined with another product and generates that.
But you’re saying no, the data product managers would be centralized. And I’m saying, okay, I buy that when you’re smaller, but at some point, how is that going to scale? I can see that you want to have data product managers for the central team, but then you’ll also want them for the decentralized teams, because in the end, and I’ve used this phrase before, you want them to be like liaisons, right?
I’m the product manager for the central team. I know what’s going on here. Both of you all should get your stuff together, because I need to go combine them, because I have a broader context of that. So that I can imagine, but so
[00:36:51] Meetesh Karia: It’s an evolution, right? Because right now, to Claire’s point, they’re understanding all the needs for the data, but then it transforms, I think, to being [00:37:00] about the platform, about building the tooling, building the processes, building the thing that supports the data mesh, the decentralization.
And that’s when you then go and add the data product managers in the various domains, who are working with their counterparts on the centralized data platform team to make sure that the tools that are being built, the processes, etc., that support people building their own domains all work together.
[00:37:32] Tim Gasper: And depending on your company, your scale, your use cases, it seems like different roles, maybe get spun out to the spokes, to the decentralized aspects in a different order. Like maybe as you get to a certain scale analysts start to become more domain oriented at a certain scale, then maybe even your analytics engineers start to become more domain oriented at another scale maybe before or after that is when your data product managers start to become more [00:38:00] decentralized.
And you know what, this makes me think about some of the conversations that we have so I’m a product guy. We have conversations about team typology all the time and roles, right? And are you a product manager and a product owner and a scrum manager, or are you just a product owner and a scrum manager, but not a product manager.
And you start to get into some of these topics, right?
[00:38:23] Meetesh Karia: Yeah, exactly. I think that you definitely have a progression there, right? You can have an analyst in that role for a certain amount of time
to help, because the analyst is going to be the expert. They’re going to understand the data.
They’re going to understand what we need to produce. At some point, though, it grows large enough where you’re like, okay, this is a separate product. I’m not just serving the central set of data, but I’m also serving other use cases of this data. Yeah.
[00:38:49] Claire Look: I definitely agree. That’s the state that we get to, where once you move into kind of a decentralized environment, [00:39:00] then you can have data product managers who are really domain experts in the area.
I think when you have a centralized team with decentralized product managers, that’s when you get into some challenges from asking them to do both. It’s, okay, well, focus on your foundation first, then scale it out, then figure out what the problems are and whether you can add this role. So yeah, definitely a progression, and it depends on the state and scale that you’re in.
[00:39:29] Tim Gasper: Yeah. So we’re talking about roles right now, and we’ve got a few roles that we’ve mentioned: data product managers, analytics engineers, analysts, data engineers. Kurt Lancing in the channel asked an interesting question, or maybe more of a comment. He said, I never hear about database developer roles anymore. And database developers, maybe they worked more on transactional systems, whereas we’re having a little bit more of an OLAP [00:40:00] conversation than OLTP. What is the future of that role? Is it dead? And actually
it brings a broader question. There’s a whole class of data roles like database developer, data integration engineer, BI developer, right? There are some of these roles where some companies are hiring for them, but you don’t hear about them as much. I don’t know, maybe starting with you, Meetesh, do you see that some of these roles are fading?
Is this more like modern stack versus traditional stack? Curious as to your thoughts here.
[00:40:33] Meetesh Karia: The no BS answer is, up until the end of last week, I might’ve answered this differently. But I actually had a really interesting conversation with our principal data architect, talking about the role of a DBA, which is very much aligned with what was described as a database developer there, in a modern data stack.
And there’s still a need for that role, though it’s [00:41:00] different. It’s all about ensuring the data is modeled in such a way that we can do other things with it, ensuring that we’re taking into account security, privacy, and other parts of governance practices.
Even our runtime data is still under CCPA and GDPR and a lot of these other things.
And we’re getting to a point where we have developers coming in, working with higher and higher level frameworks and languages that abstract away a lot of what’s happening under the covers in the database. So you have fewer developers who truly understand how to model data:
what’s critical, what’s important, how it affects things downstream. And so that role is maybe shifting, but it’s still important. And like I said, no BS, my mind was actually changed at the end of last week, because I thought that, oh, these roles are going away, but [00:42:00] maybe some of them are.
[00:42:03] Juan Sequeda: Huh. Again, another response I did not expect. I thought you were going to say, yeah, the modern data stack, it’s all in the cloud, that’s all finished, we don’t need that stuff. But,
[00:42:13] Tim Gasper: Oh yeah. And the data architect is a role here that’s very interesting. Is it the analytics engineer’s responsibility to architect the right model, right?
Like some of these things, maybe they turn into hats that you wear, but maybe there is still a role for some of these traditional roles, and apologies for the repeat of the word there, but a role for these roles still, right?
[00:42:37] Julia Schottenstein: I,
[00:42:37] Meetesh Karia: Also, I would say, I think there is, because there’s one thing I do every year. It being my birthday, I look back every year and I’m like, man,
I was stupid last year.
Why did I do that? Why did I do that? And I think back to 21-year-old me out of college, and I was ready to go and I thought I knew the [00:43:00] world. And I was like, who needs. And then, as is probably cliché across generations, I look back and say, man, yeah, experience matters so much.
And when you look at it and you talk about the architect’s role, it may not have been in a modern data stack, but the experience of having worked with data and seeing issues that come up, seeing what works and what doesn’t over decades, is invaluable, right?
There’s no way you can gain that without having done it. And so I do think that role is critical. And I think there’s a lot that teams can gain from bringing on someone like that. Maybe the term analytics engineer or dbt didn’t exist back then, catalogs didn’t exist back then, modern BI tools didn’t, but a lot of the goals and a lot of dealing with data still did.
[00:43:53] Juan Sequeda: So, honest, no BS: what is it that you’ve realized in this past year that you should not [00:44:00] have done?
[00:44:05] Meetesh Karia: I actually realized what I should have done, which is that I spread myself too thin, and I should have dug deeper into some of the work I was doing and been a little bit more. I like to get things moving and going. I’m definitely about getting work going. I have conversations, I talk to people, I get them going, but I don’t always pull that picture together and paint it for a lot of other people,
in one cohesive picture for people outside of the teams doing the work. And so that’s something I continuously learn: that, hey, maybe that’s part of my job now.
[00:44:44] Juan Sequeda: How about you, Claire? What has happened? What did you this last year? Have you realized that I should not do that anymore or I should continue or something positive there?
[00:44:53] Claire Look: What I should not do. I think one, especially with growing [00:45:00] teams, is being quicker to make changes within the org. I think that’s something where there’s always going to be a different scenario, and sometimes you’ve just got to try it, like when you start seeing issues of, we’re a bottleneck.
Okay, what can we do about that? I think you can make smaller tweaks along the way, instead of thinking you need to have all the answers before changing something, or making a big-bang change at the end. So, especially when growing rapidly and growing data teams rapidly, just being more willing to make small tweaks along the way when you notice those problems, and then measuring how you’re doing against what you thought, or whether the problems are improving.
So I think a lot of that is just like iterative development in software. Okay, let’s do that test-and-learn on the org structure side, because all of us are [00:46:00] learning some of these new concepts and ideas together. So yeah, it’s about really leveraging the community here and people who have done it before, and testing out what works and what doesn’t within your org.
[00:46:15] Tim Gasper: That makes sense. And as we transition to our last topic here before we do the lightning round, just a quick shout-out to everybody who’s listening right now. After this next section, we’re going to do what we call the lightning round, where we do these yes-or-no questions.
If you’re hanging out in the Slack workspace, please hop into the Coalesce Catalog & Cocktails channel and feel free to drop in some yes-or-no questions, and we’ll incorporate them into that last segment. And with that, just one other topic that I know we were interested in hitting here.
We talked a lot about the data stack, and the analytics engineers working on the transformation, the warehouse, integration, and [00:47:00] BI tools like Looker, for example. But obviously another key element that is increasingly being incorporated into the modern data stack is the metadata component,
whether it’s things like catalog, governance, observability. I’m just curious,
from your perspectives, maybe starting with you, Claire, and then moving to Meetesh: what role are catalogs and other metadata-oriented tools playing in your data strategy and what you’re doing with your data teams?
[00:47:30] Claire Look: Yeah. It was the biggest piece we were missing. We were moving fast from a technical perspective in terms of how do we model this, how do we make it available. But then you get to the point we talked about earlier, where for someone onboarding onto data, it becomes very complex to traverse your giant DAG.
And so then you’ve got to say, okay, who actually owns this field? And [00:48:00] we just added our company OKRs, our key results, and we have executive sponsors for each of those. And it’s like, where do you store that, on a Confluence page or directly tied to the actual metric? So you look at your data and you’re like, okay, what’s the missing piece here? It’s the context around it, and who I go to when I have a question, and all of that information around how frequently it’s updated.
Some of that is just living in our centralized team’s heads. That worked when we were a team of four and could manage that information. It doesn’t work when you scale. So I think that’s where the metadata component comes in, because it really cannot scale to live in your head.
And so where’s the best place to put it? We are implementing a catalog for that.
[00:48:55] Meetesh Karia: Yeah. And I’ll tack on the governance piece, which is, [00:49:00] in my mind, one of the trickiest problems, or the one that is also the most opaque to solve, around moving to something like a data mesh or scaling: how do you deal with federated governance?
How do you deal with that? And that, in my mind, is where data catalogs and metadata-based tools really play a part. There’s no way you can go distribute and scale your organization if you don’t have some way of making sure the data is trustworthy, making sure there’s quality, making sure you have privacy and security, right?
All of those things need to be built in. Otherwise you have every team doing that themselves. And if every team is doing that themselves, then what’s the point of the modern data stack and the world we’ve moved towards?
[00:49:50] Juan Sequeda: Let’s be really honest and no BS here. We hear this term a lot, federated governance and federated computational governance, coming from the whole data mesh [00:50:00] stack. What do we mean by this? And if we can get really concrete, what is an example of this federated governance?
[00:50:11] Meetesh Karia: I’d say at the very basic level, it’s probably just some standard sets of processes, right?
Of, hey, here’s how you go about certifying dimensions, certifying bits of data,
at the very basic level, and then tracking that it was done, tracking the audit log of it, assigning owners. In my mind, that’s probably the most concrete, but also probably the most basic. And I’m sure there are a bunch of people listening, and others, who are like, there’s way more than that. Absolutely, but in my mind, that’s the first basic thing that comes to mind.
[00:50:49] Juan Sequeda: All right. And then, okay, we do that. What’s next? How complicated or complex should we get? That probably sounds too negative, but [00:51:00] more sophisticated, let’s call it that way.
How much more sophisticated governance should we do, or what’s the minimum? Tim and I have been talking about this too: what’s the minimal viable policy? Let’s call it an MVP. What’s the minimal stuff, right?
[00:51:20] Meetesh Karia: I guess it depends on the company, right? Like we’ve built a tool that helps us process CCPA requests, right?
So surfacing user data, deleting user data. It seems like in a world where we have federated governance and need to deal with compliance in a distributed way, we’d want to build tooling around supporting that.
We wouldn’t want to make every single team solve that problem in their own way. So that’s more than just a process, more than just a checklist.
It’s, you know, this tool has to be built into every bit of data, the notion of data that you provide, and [00:52:00] here are the steps to go do it, and here’s the tool.
And so I think that it goes from processes to then tooling to support some of the things that are a little bit more complex, a little bit more advanced.
[00:52:12] Juan Sequeda: And we’re obviously very biased here on catalogs. We definitely believe that catalogs and governance are a key part, but I just feel that sometimes it’s not really a big part of the conversation and the true landscape of data. You say, oh, you need to have the governance.
It’s on one side. But it truly covers the entire landscape of data. And this is something that we see as too separated. So one thing is you have the metadata and you have the governance, the policies, but you really want to have this stuff connected to the data.
And we see that sometimes it’s just documentation, but it needs to be more than that. You could imagine a world where everything is completely executable. Imagine [00:53:00] policies that we define as executable code, not just English definitions, and it’s something that people can go reuse and check out and apply.
I think we’re so far from that. And we’re really not thinking about it because we’re just so excited about, let’s just get the data and transform it. And governance has usually been this, oh, it’s just for the CCPA, the GDPR stuff, but there’s much more than that.
I’ve been having conversations with folks about, like, what is a telephone number? How do we define what is a telephone number and what is a mobile phone number? Oh, a mobile phone number. Wait, if it’s a mobile phone number, it must have TCPA consent. It must have an SMS. Can we send SMS to them or not?
Or voicemail. So that’s how we know what a valid, what a true telephone number is. And we probably know what this is. It’s in people’s heads, it’s documented, but if we’re going to go start generating data, I need to know, hey, that’s a valid telephone number. Or it’s not, [00:54:00] or you’re missing something.
And if it’s not, tell me why it’s not, what’s missing about it. And I think we’re not there yet. I think this is where governance is key, and this is where catalogs are going to play a key role. It’s not just about bringing in what my tables and columns are and how they’re connected.
It’s really documenting and implementing this in an executable way. And that’s why I’m really excited about how governance and catalogs are going to play a key role, because metadata touches everything. Anyways, I just went on this long ramble here.
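The "executable policy" idea Juan describes here can be sketched roughly as follows. This is a minimal illustration in Python, not anything the speakers showed or built: the `PhoneRecord` fields, the rule names, and the 10-digit check (a US-style assumption) are all hypothetical. The point is only that each rule is code that can be reused, and that a failed check reports what is missing rather than just "invalid".

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class PhoneRecord:
    """Hypothetical record shape; field names are assumptions for illustration."""
    number: str
    is_mobile: bool
    tcpa_consent: Optional[bool] = None  # None means "not recorded"
    sms_allowed: Optional[bool] = None   # None means "not recorded"


@dataclass
class Rule:
    """One executable policy rule: when it applies, what it checks, what to report."""
    name: str
    applies: Callable[[PhoneRecord], bool]
    check: Callable[[PhoneRecord], bool]
    message: str


def _digits(number: str) -> str:
    # Keep only the digit characters, dropping dashes, spaces, parentheses.
    return "".join(ch for ch in number if ch.isdigit())


# The policy is plain data, so teams could import and reuse it.
TELEPHONE_POLICY: List[Rule] = [
    Rule("has_10_digits",
         applies=lambda r: True,
         check=lambda r: len(_digits(r.number)) == 10,  # assumed US-style format
         message="number must contain exactly 10 digits"),
    Rule("mobile_has_tcpa_consent",
         applies=lambda r: r.is_mobile,
         check=lambda r: r.tcpa_consent is True,
         message="mobile number is missing TCPA consent"),
    Rule("mobile_has_sms_flag",
         applies=lambda r: r.is_mobile,
         check=lambda r: r.sms_allowed is not None,
         message="mobile number must record whether SMS is allowed"),
]


def validate(record: PhoneRecord, policy: List[Rule] = TELEPHONE_POLICY) -> List[str]:
    """Return the messages of every applicable rule that fails (empty means valid)."""
    return [rule.message for rule in policy
            if rule.applies(record) and not rule.check(record)]
```

Calling `validate(PhoneRecord("512-555-0100", is_mobile=True))` would return the two mobile-specific messages, which is the "tell me why it’s not valid, what’s missing about it" behavior described above.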
[00:54:33] Claire Look: I was thinking how easy it is to build data products
if you start from that foundation where you’re thinking about the governance and the metadata and the policies around each individual telephone number versus mobile number. Then think about the product teams: great, I can go use this in this way over in this tool, and I know how it works.
[00:54:54] Meetesh Karia: It makes me excited, right? Because you look at it and you’re like, oh, there’s a lot of movement, there’s a lot of growth in the modern [00:55:00] data stack and data org, but there’s so much more still to innovate and solve. I mean, this is not a solved problem at all.
[00:55:07] Juan Sequeda: I’ve been having conversations with a lot of other companies and customers and stuff.
And what we imagine is you have this model defined for customer: what is truly a customer from a modeling perspective, and what is a telephone number, for example. So then we say, I want to go find data. I should be able to click customer, click telephone number, and here’s what you should have.
Here’s the minimal stuff that you need to have to know it’s valid, right? And as you can imagine, modeling and transformations are all key to it. Wow. I think we’ve got a few minutes left. We’ve got to start winding down.
[00:55:44] Tim Gasper: We covered some good ground here.
[00:55:47] Juan Sequeda: I’m ready for my next cocktail now. So
[00:55:51] Tim Gasper: Should we do a lightning round? We got some good contributions from the chat.
So who’s first. Juan you want to do yours first? [00:56:00]
[00:56:00] Juan Sequeda: All right. We got this from Nickeel: the metrics layer. Is this going to become a thing? Yes or no?
[00:56:10] Meetesh Karia: Not sure. I’m going to say no, because I don’t know what it is.
[00:56:14] Juan Sequeda: All right, Claire?
[00:56:17] Claire Look: I think, yes. All right. Go,
[00:56:20] Juan Sequeda: Tim.
[00:56:22] Tim Gasper: I like that. We’ve got two data points here. One is that we’re still learning about the metrics layer, right? And then also, how useful is it going to be? So, next lightning round question, from Julia: do companies need a machine learning, an ML, strategy?
Maybe starting with you, Claire. Yes or no?
Yes. Yes. All right, we got two yeses.
[00:56:47] Juan Sequeda: All right. So we got from Gregor. Is it even possible to build a modern data infrastructure with open source only?
Yes or no?
[00:56:59] Meetesh Karia: [00:57:00] Can I answer, with a, but?
[00:57:01] Juan Sequeda: Go ahead. Sure.
[00:57:04] Meetesh Karia: Yes, but I wouldn’t necessarily recommend it. Okay.
[00:57:12] Juan Sequeda: Tim and I have a podcast episode on the build versus buy debate, so you should all listen to that one.
[00:57:20] Tim Gasper: Is open source and you were buy.
[00:57:22] Juan Sequeda: Yeah. And
[00:57:23] Tim Gasper: I think Claire any, comments on this topic?
[00:57:27] Claire Look: Oh, I was going to say, that was going to be my advice, if we even had time for advice. I was going to say buy. There is the combination, but I think my mind goes towards buy often, if you’re a small team growing quickly.
[00:57:45] Juan Sequeda: You go Tim.
[00:57:46] Tim Gasper: All right, next question. This one is just a funny one: light theme or dark theme, Claire? [00:58:00]
[00:58:00] Claire Look: Dark. All right.
[00:58:02] Juan Sequeda: All right. Final one from Curt. Are there too many data tools?
[00:58:10] Claire Look: Yes.
[00:58:12] Meetesh Karia: Yes.
[00:58:13] Juan Sequeda: So for those who are only listening and not seeing our faces, just imagine somebody kind of moving their head side to side, right?
[00:58:24] Claire Look: Landscape. So it’s one I would want to be in. I don’t blame people for wanting to expand our ecosystem, but yeah.
[00:58:34] Tim Gasper: And it’s a hot space, right?
[00:58:36] Meetesh Karia: It’s a hot growing space. So, it’s expected.
[00:58:39] Tim Gasper: Yeah, but do you know what tool’s in the center of the modern data stack diagram?
[00:58:44] Claire Look: What, like data.world?
[00:58:49] Tim Gasper: Hey, this is what’s happening. Ah, crap.
[00:58:56] Juan Sequeda: All right. We’re almost done here. So it’s [00:59:00] our takeaway time. Tim takes it away with takeaways first.
[00:59:04] Tim Gasper: Awesome. I loved our conversation about culture, right? I love some of the suggestions that you gave, Claire, around ways that you can make data culture work
even if you’re decentralized, even if you have a large organization. You said data divergence days, a recurring hackathon, data architecture subgroups, especially if they can happen in a bottom-up way where people are organizing it themselves, and happy hours. And thinking about the different roles and how they work well together, right?
Folks like data product managers, data stewards, analytics engineers, everybody can work together to really create that culture and make it work for any scale of organization. So I thought that was great. Meetesh, you said that experience matters so much, and I think that’s refreshing, right?
Because I think that in a world where the modern data stack is such a big center of attention, we often think of new tools, new tech. Do you have experience in this new tool? Can you pull this off the [01:00:00] shelf? But there’s so much knowledge that has been created around data management that we can be leveraging.
And we shouldn’t forget our lessons from the past. And then finally, the catalog, right? The catalog is a way that you can actually get into this metadata layer, collect that knowledge, and have and manage that governance, even federated governance. And obviously that can be a key part of the stack. So those are Tim’s takeaways.
What about Juan’s takeaways?
[01:00:25] Juan Sequeda: We started this conversation almost an hour ago asking, hey, why is dbt such a big deal? And I think before dbt there was no consistency in how we would transform data. We would just do it over and over again, using some tools, SQL scripts that we could not reuse, right?
This was a big pain, and finally dbt has packaged this into a product where we can go, hey, this is how we can reuse this. And I think there’s an evolution. We start realizing how we get there. And also from the team perspective, I think it’s [01:01:00] very traditional.
We’ll start small and be centralized to get consistency, to guide shared hiring. And then that’s how we start to find this balance of what we’re centralizing and decentralizing. And the honest, no-BS take about the analytics engineer: it’s not tied to dbt. It is about transforming and modeling raw data into the data that the business needs.
Call it analytics engineer, call it knowledge engineer, call it foo, whatever. That’s what’s going on right now. And I think that’s the game changer right now in the industry. Meetesh, Claire, thank you so much, and I want to throw it to you very quickly. Final words: what’s your advice about life, about data, about anything?
[01:01:43] Meetesh Karia: I can go first. It’s something that I tell my kids a lot, especially when it comes to food: you can’t say you don’t like something unless you try it once. And this goes to the experience point and something Claire mentioned too, which is, there’s no way you’re going to build [01:02:00] experience without trying. So try. Don’t be afraid. Try things.
[01:02:07] Claire Look: Yeah, I’m going to go off that. And it also goes to my role. I had zero data experience, but I had a Redshift database, found Stitch, and found Tableau on top of it. So I was like, I’m a full stack developer. I just went from database to ETL to... I think that’s where I’m like, take advantage of tools and how many there are in the market, because you can turn someone with zero experience but an interest in data,
which is all I had, into someone who starts to understand the different roles and components that go along with it. So, yeah, you don’t always need all the coding experience in the world, because you just kind of need a problem-solving brain and a love for data. And I think everyone in this group and everyone listening has that.
[01:02:56] Juan Sequeda: Hey, for everybody who’s listening: if you like what you listened to today, this is [01:03:00] our show, Catalog & Cocktails. We do this live every Wednesday. We’ve done it for, I think, 65 episodes now. So please subscribe. You can find us on all your favorite platforms. And we do this live on Twitter and LinkedIn and all that stuff.
And we’re starting to schedule for next year. And we really want to reach out to the practitioners, the folks who are really implementing these things, because we’ve talked to the vendors, we’ve talked to everybody who’s pontificating and all that stuff. Let’s go talk to people who’ve actually rolled up their sleeves.
So if you want to be on the podcast, just shoot us an email at email@example.com. It’s super easy. And with that, Meetesh, Claire, and dbt, thank you so much for this opportunity. This was fantastic. And cheers.
Last modified on: Apr 19, 2022