Coalesce After Party with Catalog & Cocktails
Catalog & Cocktails is bringing its unique brand of insight, humor, and conversation to Coalesce. The weekly podcast is an honest, no-BS, and non-salesy conversation about enterprise data management and analytics with a happy-hour vibe.
On Tuesday, December 7 at 4:00 pm CT your hosts, Juan Sequeda and Tim Gasper of data.world, will broadcast live from the Coalesce event platform joined by special guests from the community. We’re also cooking up a delicious dbt-themed cocktail. Please join us as we toast to an awesome event and glimpse into the future of analytics engineering.
Browse this talk’s Slack archives
The day-of-talk conversation is archived here in dbt Community Slack.
Not a member of the dbt Community yet? You can join here to view the Coalesce chat archives.
Full transcript
[00:00:00] Julia Schottenstein: Welcome everyone to the dbt Labs Coalesce after party. Tim Gasper and Juan Sequeda of data.world will be hosting a live episode of their podcast, Catalog & Cocktails. I’m Julia Schottenstein and I’m part of the product team at dbt Labs. And I also co-host the dbt Labs Analytics Engineering Podcast
every other week with our CEO Tristan Handy. For our podcast listeners out there, we’re currently live at Coalesce, which is dbt Labs’ annual community conference. So it’s pretty fun to record in front of such a large audience. We have over 14,000 attendees this year. I’ve listened to a few episodes of Catalog & Cocktails before,
so I came prepared with my cocktail. Should be a fun episode. And if folks want to chime in with questions, the conversation is happening in the dbt Slack channel #coalesce-catalog-cocktails. Okay. Now from me over to you, Tim and Juan, to kick off the episode. Cheers. [00:01:00]
[00:01:01] Tim Gasper: Hello everyone. Welcome. Thank you so much, Julia, and to all the folks over at dbt Labs. This is Catalog & Cocktails. It’s a weekly live hangout, and today it’s a very special episode. It’s an honest, no-BS, dbt Coalesce conversation about enterprise data management with tasty beverages in hand. I’m Tim Gasper, a longtime data nerd and product guy, joined by Juan Sequeda.
[00:01:27] Juan Sequeda: I’m Juan Sequeda. Tim, thank you so much. I’m the principal scientist here at data.world, and it is a pleasure. It’s an honor. I am so excited about this because this is the first time we’re doing something like an after party for this awesome conference. I look forward to many more of these, and today we have an awesome, awesome lineup. We have two of our good friends who are part of the Austin community, because we’re an Austin company. These are folks who have been in the data space for so long, and they live and breathe data. I’m talking about Meetesh [00:02:00] Karia, who is the CTO of The Zebra,
who has had so many past CTO experiences, and also Claire Look from The Zebra, who started out as the data product manager at The Zebra, has gone into so many leadership roles, and now is the VP of data. And they are not just fans of dbt and customers of dbt, but also customers of data.world. So we’ve got a lot here to go talk about. Glad you guys are here.
Cheers.
How’s everybody doing?
[00:02:26] Claire Look: Great to be here.
[00:02:28] Juan Sequeda: Awesome.
So, following our Catalog & Cocktails approach here, we’ll do our tell and toast. What are we drinking, and what are we toasting for? Claire?
[00:02:38] Claire Look: How about... All right. I’m drinking our star schema mocktail, so I appreciate everyone getting the mocktail recipe for me this week.
And I’m going to toast to... it is my boss’s birthday. Meetesh, cheers to you. We get to celebrate his birthday with him today at this after party. So [00:03:00] that is what I’m toasting to today.
[00:03:03] Meetesh Karia: Thank you, Claire. And I’m drinking the regular version of the star schema cocktail, which tastes fantastically like a spiced apple cider.
And I guess I’m going to toast to being able to celebrate with you all for my birthday and chat about one of my favorite things.
[00:03:23] Juan Sequeda: How about you, Tim?
[00:03:25] Tim Gasper: I will also cheers to your birthday, Meetesh. So glad that you could join us on your special day. And cheers to the whole dbt community.
It’s just exciting to be here, a part of such a great conference and a great community. So really appreciate it. And I agree, this does not taste like alcohol. This is tricky.
[00:03:42] Meetesh Karia: Scary.
[00:03:43] Juan Sequeda: I’m actually at a hotel right now. So I went to the bar and I gave the bartender the recipe. I was like, please make something.
And yeah, he did a great job. This does taste like a spiced apple cider. So here’s a happy birthday, Meetesh. Just love that we’re all here together.
[00:03:57] Meetesh Karia: Cheers.
[00:03:58] Tim Gasper: And cheers to everyone here who’s joining us. [00:04:00]
[00:04:00] Juan Sequeda: Yeah. So I think we had a little thread going on in the chat, so go tell us what you’re drinking today and share pictures or whatever.
We always have the warmup. We have a funny question of the day to kick this off. So what have you transformed, yourself or someone else or something else, for a costume or a product or something? What have you transformed? Who wants to go first?
[00:04:26] Meetesh Karia: I’ll go. I’ll say I’ve transformed my kids. It’s scary to look at them and see how big they’ve gotten already.
And my older one is about to join high school next year. So it’s crazy how many parallels there are between raising kids and leading teams.
[00:04:46] Juan Sequeda: I love that. How about you, Claire?
[00:04:48] Claire Look: I’m going to go relevant to dbt and this conference and say at The Zebra, we transformed our technical analysts into analytics engineers.
So that is something [00:05:00] ... it’s great to be here, because we wouldn’t have really discovered that title and done all of that without really digging into the dbt community and seeing how that grew. So that’s definitely something we’ve transformed: that role.
[00:05:15] Julia Schottenstein: Tim? What have you transformed?
[00:05:18] Tim Gasper: Oh, my goodness. Obviously over at data.world, we’re dbt users as well, and so we’ve transformed some folks into analytics engineers. But I will say that I’ve transformed my morning routine. One of the only sort of positive things that has come out of COVID here is that I actually get up at a consistent time and I go for a run most mornings.
So I’m excited about that. That’s a good thing.
[00:05:39] Juan Sequeda: I’ll take this one. I’ve transformed research and pushed it into a real product and a company, and sold it. So I think that’s something I’m pretty proud of, and I’m really excited about bridging these worlds of academia and the real world. All right.
Cheers to transformations. Okay. So here’s the deal: we’re doing this live and we’re watching on the [00:06:00] Slack. We’ve prepared some of the topics we want to go chat about, and we’ve mentioned them before. We’re talking about data teams, data cultures, data mesh, data catalogs. But if there’s any other topic, just put it into the chat, into the Slack, where we’re watching.
Also, we always have this segment of our lightning round questions. So if you have any questions, preferably with yes or no answers, put them in. We’ll be watching them, and then we’ll be throwing these out to Meetesh and Claire towards the end of the broadcast here. So...
[00:06:33] Tim Gasper: We’re literally watching the channel right now. So please post your questions. There are no dumb questions, and we will be integrating them live into the show and also into the lightning round. Please post your questions.
[00:06:44] Juan Sequeda: All right. So let’s kick it off. Honest, no BS: why is dbt such a big deal? The first time I looked at dbt, I was like, I can’t believe this didn’t exist before, and now we’re seeing it. [00:07:00] But why didn’t it exist before?
And why now? Why is this such a big deal, and how big of a deal is it, Claire?
[00:07:09] Claire Look: I would say I think back to some of my analyst roles that I would now consider analytics engineering. What we were doing in terms of Tableau extracts was our way of transforming raw data and making it accessible in reports, or leveraging Spark or leveraging Hive.
And we were transforming some of this raw data, but it was all in different areas and there wasn’t that consistency. And so it’s just amazing that someone has packaged that up and thought, hey, this is something we’re all doing over and over in different ways, depending on the scale or size of our team and data, and said, this is a product, this is a common problem that we are seeing across all these different teams.
And to say, hey, the SQL translation layer is a [00:08:00] product, and making that accessible for large groups has been very transformative, for sure.
[00:08:09] Meetesh Karia: Yeah, I’d add on there. I remember building things using Talend and SQL scripts and cron jobs and all sorts of things, where you get multi-thousand-line SQL files with no way to test, no way to reuse, no way to share.
And we’ve taken a lot of what has become standard engineering practice,
and dbt has brought it to data and data transformation.
[00:08:37] Juan Sequeda: This is one of the things I’m so excited about with dbt: just having these transforms as first-class citizens. Before, it was a thing you’d leave behind; it’s just part of the process. But it’s not just a part, it’s a core element of everything you’re doing in the data. And I think that’s one of the big transformations that I love about this. [00:09:00] And one of the cool things about having both of you here in this conversation is that, Meetesh, you’ve been this true executive, and you can see how the technology has transformed companies from the executive point of view, and Claire, you’ve really been rolling up your sleeves and doing all the analytics work, but also managing the teams to go do this.
How has this evolution been in the past couple of years, with companies, and new companies coming around, thinking about transformations as first-class citizens and building up these tools? How have you seen this evolution?
[00:09:30] Meetesh Karia: I can start with early on and then hand it off. When we first started our analytics and BI and data journey at The Zebra, we were looking at tools like RJMetrics: warehouse, point-and-click transformation, very limited things that we could get going and running on the side. And even when we then moved to Looker and LookML, that was a big step up in terms of modeling, but it quickly gets hairy. It quickly [00:10:00] gets to the point where it’s in source control, but you can’t test it. It’s really hard to share. And I think that then we get to the power, and I can go over to Claire now, in terms of what we’ve been able to do since introducing dbt and going from there.
[00:10:20] Claire Look: Yeah, I think so. We’ve been pretty early adopters. I think we started using dbt in 2018. And yeah, we’ve been able to move the majority, if not all, of our transformations over to dbt. And so that’s been something that allows us to really establish the analyst and analytics engineering team as well, because they have something to work within and something to establish best practices across the team.
And so I think we’ve been able to really scale out that whole practice [00:11:00] of data modeling as a team and be able to support a lot of different areas of our business by leveraging that, versus the point-and-click tools, which are good for different areas, but not necessarily for the scale component.
[00:11:12] Tim Gasper: That’s interesting. And Claire and Meetesh, obviously you all have built a really powerful team over at The Zebra to work on a lot of these problems. As you’ve adopted dbt and expanded the use of it, in combination with a bunch of modern tools, how have you been thinking about your data team and how you structure it, and how has the role of the analytics engineer had a really big impact on all of that?
[00:11:43] Claire Look: Yeah, sure. We have definitely grown rapidly. So I think when we first adopted dbt, we had one analytics engineer, one data scientist, one data engineer. We had one of each key role, and one data product manager. So we were really like, [00:12:00] let’s figure out what we can do with the tools that we have. To now, we have a 50-person data org.
So we really started with, I think, what is that foundation? But really, as we were growing, it was, okay, we’ve got to establish what our core datasets are. I think one of the first things we did was this project, data recap. If anyone on the team is listening, they’ll be like, oh gosh.
But that was really to say, okay, what do we have? What are we trying to build? What are the different ways people are looking at our data? Not saying it was successful, but it was something where we were like, let’s take these modern tools and this approach to our data and go build it as a product.
And then scale people accordingly with the increasing demand, because the data team wasn’t the only one that was growing; every other department was growing, and so were the needs coming into the teams. So, the ability to say: okay, we can build these core datasets, but we need to have multiple analytics [00:13:00] engineers be able to live in this code without creating too much chaos there.
[00:13:07] Meetesh Karia: Yeah, I think we grew to a point where the size of the code base and the size of our data and domain was larger than any one person could really keep in their head. And so onboarding is an issue. If you can modularize, if you can separate out domains, you can ramp people up so they can be more effective and more impactful sooner.
You get to a certain size where you change something in one place and something unintended breaks somewhere else, or vice versa. And by splitting out chunks, you get to limit the blast radius. And then really, I think, and we’ll probably get to some of this as we talk later, it in my mind is the foundation for setting us up for how we grow and support the [00:14:00] org as it continues to grow.
[00:14:02] Juan Sequeda: So let’s dive a little bit more into this. You just talked about how to scale, and in some conversations we’ve had before, Meetesh, I love your threes and tens: things start changing with respect to threes and tens.
And Claire, what you just said is that at the beginning you were just one of each, and now you have a team of 50. The conversation I have with a lot of people is about this balance between centralization and decentralization, right? So you probably start with a centralized team, and then you’re like, I can’t be a bottleneck.
So you start figuring out that balance. What is that process? I know we’ve talked about how threes and tens is a good way of approaching that. I really would appreciate more of your insights about this, because this is a conversation we have all the time: centralization, decentralization.
[00:14:46] Meetesh Karia: I think this goes right into one of the hottest topics in the field right now, that of data mesh, right? Which is a possible solution or an approach to addressing this scale. And I look at it as, when we hit [00:15:00] around 30, 40 people is when we started to see some of these issues, where we started to say, hey, we need to actually take a look.
We had centralized to get some consistency, to get shared hiring, to build up certain skill sets and processes. And now we get to the 30s, and we’re looking forward to growing the team, but also the entire company, and saying, we need to set ourselves up so that the data team no longer owns every bit of data for the company.
Instead, the data team is providing the underlying platform, the tools, the processes, the governance that supports different domains owning their own data, with the data team pulling it all together. And so I think that’s the big transformation that we’re going to see between 50 and a hundred, or at the stage of the company right now, that really speaks to that next step of not being [00:16:00] centralized, but also not being completely distributed.
[00:16:04] Tim Gasper: At what point in the growth of The Zebra did it become clear that you needed to change your approach a little bit, that you couldn’t just continue to have a 100% centralized approach, but that you needed to start to become more distributed? And what role do dbt and analytics engineers play in that move to distribution?
[00:16:28] Meetesh Karia: I guess I’ll take the first part, and then hand the second to Claire. I’d say it probably became clearer at the beginning of this year or so, right around that 30-ish size of our team. Because all of last year was around centralizing, hiring up, getting the right team structures in place.
And then right around the beginning of this year is when we started to see that this is getting really difficult, to support the entire business from one central team.
[00:16:56] Tim Gasper: Yeah.
[00:16:57] Claire Look: I think it’s definitely... we hit a point in, [00:17:00] yes, support in general, centralized support, and just the number of questions you’re getting, the number of things that you are now supporting across these teams. It wasn’t until just recently that we said, okay, we need foundational teams.
As most companies grow on the engineering side, same thing for data: we need our platform teams, our data platform, core reporting. Who are those people? But then we also can’t halt the business and say, all of a sudden, you cannot get any data to measure the effectiveness of your product or to measure agency.
In our case, we deal with insurance. So we still have analytics engineers supporting other areas who, with the benefit of having been centralized, understand how to work within dbt. They understand where to work within dbt. And there’s some of that shared process that we’ve built out.
Centralized, but the actual domain they start to become aware of, and they’re able to work within [00:18:00] there, while we focus our centralized teams on pure foundational platform work, because we were doing both for too long. And that gets you in a tough situation where you’re trying to support everything.
[00:18:15] Tim Gasper: That makes sense. And just to rewind a little bit and go back to the teams aspect of all of this, we actually got an interesting question here in the channel coming from Julia. She mentioned: I liked, Claire, how you put it, that your team transformed into analytics engineers. Have there actually been some other technologies that have had a major impact on your team’s identity? Or what does it mean when a tool changes roles on your team?
[00:18:53] Meetesh Karia: Great question.
[00:18:54] Claire Look: Yeah. Like
[00:18:55] Tim Gasper: I know you all use Looker. Did that have a really big impact as well? Were there other tools that [00:19:00] kind of fit into that mold?
[00:19:02] Juan Sequeda: I was gonna say, I’m curious, and be as no-BS as you can. You started talking about LookML and, oh, this was a great first step, but then you hit some barriers around it.
So maybe that helped for something, but you hit some barriers, and then dbt comes around. How did that change things for the folks within your team? Yeah.
[00:19:20] Claire Look: I did a talk at one of the Looker events that was about changing the next gen of analysts, because it was like, now your analysts are becoming more like analytics engineers, because they’re able to be in the LookML and they’re able to point at a field in an Explore and trace it all the way back to the source code.
And that was something I hadn’t seen in Tableau or anything else, where you could go straight from I’m in a dashboard to now I’m tracing it back to source code. And I was like, that changes the game for analysts, because they’re able to understand all the underpinnings that really are multiple [00:20:00] roles or steps in the process.
And so I definitely think Looker really advanced our analysts. And I think we want to get back to that, because right now it’s in the majority analytics engineers who are in dbt. So we have less of that collaboration where our analysts are actually making changes to the source code.
So there is a little bit of that kind of growing pain as you switch from Looker to dbt that we’ve worked through. But I think Looker was one that did level up the technical skills of our analysts, for sure.
[00:20:39] Juan Sequeda: So I want to mention here that Gregor, on one of the threads, is making an interesting point, which is one I’ve always had too. He asks: is the role of the analytics engineer bounded, basically connected, to the dbt community?
And again, honest, no BS, right? You hear analytics engineer and you immediately know, [00:21:00] oh, that’s dbt, because dbt is pushing this, which is true. But let’s take that label away. The label analytics engineer, let’s call it foo. That work that the analytics engineer, foo, is doing right now,
that’s actually happening. There’s a need for that regardless of what label people are putting onto it.
[00:21:19] Meetesh Karia: Yeah. Absolutely. I think it’s the job of taking raw data and modeling it and transforming it into usable business data. And whether the person doing that is called an analytics engineer, or a data engineer, or whatever you call it, call it foo, call it an analyst at some places.
It’s that job of really understanding what’s going to support what the business needs, and taking the raw data, putting it together, transforming it and modeling it.
[00:21:50] Claire Look: Yeah. I want to share this podcast with my old HR manager, because I would have conversations with him like, [00:22:00] we have these analysts, not quite what I need.
We have these data engineers. This was at my previous company, because we were building out a universal catalog dataset. And it’s like, I need people who can just take all this data and turn it into a universal product catalog. And it was that in-between skill set, and we didn’t know what to call it. I’m glad we got there at The Zebra, but it was one of those things.
It was like, what do we recruit for? And we ended up turning some of our analysts into data engineers. There was no dbt involved in that, but the role was still the same.
[00:22:35] Juan Sequeda: We wrote it down here, and I think this is something to go pin right now. The role of the analytics engineer doesn’t have to be tied to a technology like dbt. At the end of the day, it’s about transforming and modeling that raw, inscrutable, complicated, ugly, shitty enterprise data into data that is just beautiful, that the business understands, and they can go run with it to go make more value.
So, generate more [00:23:00] value. Personally, I’ve always called this more the knowledge engineer, or I’ve pushed this term, the knowledge scientist, too. But I think it all evolves. It’s somebody who is almost like a bridge, right there in the middle. They know how to go talk to the business, understand their needs,
do the modeling, put it on the whiteboard, do that type of stuff, figure out what is in the data, do the transformations. And I think that’s something crucial that we just haven’t brought into organizations today. So I’m really happy about this role, called analytics engineer, knowledge engineer, whatever. We just need more of that, and more of that mindset.
And also I think about what comes after the data scientist, right? There’s been 10 years of this now. I think something’s coming next. And I think this is a big one.
[00:23:45] Meetesh Karia: I think that the engineer part of this also really helps with understanding that this is an opportunity for testability.
It’s an opportunity for reusability. It’s an opportunity for modularization. A lot of these concepts are really [00:24:00] core to engineering, and dbt has helped bring that there, but so has the term analytics engineer. That is what I think we want those people doing: that modeling,
but then also the engineering aspects of it.
[00:24:14] Tim Gasper: Yeah, it’s clear that there is an overarching theme here, which is that there are all these best practices around software, right? Whether it’s modularization, reuse, don’t repeat yourself, configuration as code, continuous integration and deployment.
And although obviously dbt is a big, very popular component that’s pushing a lot of this, there’s also your LookML, and there are a lot of BI-oriented players that are moving in this direction now. Obviously there’s your Python code and your notebooks and all that kind of stuff, right?
This can all fit this paradigm of analytics engineering, which seems exciting. It seems like an evolution of the field that we’re in.
[00:24:57] Meetesh Karia: Absolutely. Yeah.
[00:24:58] Juan Sequeda: So I want to [00:25:00] go jump onto another topic, about culture. And right now, Julia asked another great question, which I was thinking about too.
We’re talking about scaling teams and stuff. How do you go train all these folks within the culture you have, understanding the business terminology? And also, I’m just reading here: people always want the roles to evolve and to learn more skills, but often hiring can be skewed to look for someone who has done it before.
You were mentioning too that you were talking with your previous HR colleague. What is that right balance?
[00:25:39] Claire Look: Yeah.
[00:25:39] Meetesh Karia: It’s a... go for it, Claire. Yeah.
Yeah. It’s challenging. I was gonna say, I guess there are a couple of parts to that. One is how do you build and preserve that culture, and the other is how do you find the people. I think that the [00:26:00] second part of that is perhaps easier, particularly for us at The Zebra, to answer, which is that we really believe in growing the people we have at the company.
And coming in as an analyst, learning engineering principles, learning a lot more about SQL, I think is a natural path to growing into roles without having to come in from the outside, while also bringing in and retaining and continuing that culture within the team. And then, yeah, training, onboarding:
it’s challenging, and it has been really challenging, especially being remote the last almost two years through a pandemic. We haven’t nailed all of it, but I think back to a passage in the book Winning with Data that talks about what Facebook did around a data kind of [00:27:00] bootcamp when people come in.
And one of my thoughts is that eventually we need to get to somewhere like that.
And that’s not just for the data team. It’s for the entire company, to help them understand how to think about data and how to build with data. But Claire, I don’t know if you have other thoughts on some of the things we’ve done, particularly around onboarding.
[00:27:22] Claire Look: Yeah. One of the things, I think, is part of Julia’s question around how hiring can be skewed toward someone who’s done it before.
We saw that challenge a lot, especially within analytics engineering, just because it’s new and everyone knows they need it, and so it’s a competitive space. And one of the things we thought was, okay, we’ve had analysts who have been successful growing into analytics engineer roles. So can we create entry-level analyst roles
where you can really understand the business context, and you can start to understand our data and be around other [00:28:00] people who are working within dbt? So you’re looking at that, or understanding the business more by leveraging a data catalog, or there are some other areas where you can be learning and adding value, but you may not have the technical skills yet.
And is that an easier, less competitive role to get people onto your team, where they start to understand the business and the data so they can start adding value, and then they can learn the technical skills? We’ve been successful moving some analysts, so we think, okay, that could be a good path for us
in the future: that entry-level analyst role within that core reporting team.
[00:28:45] Tim Gasper: On this topic of culture, and thinking about how we manage the culture of our teams: obviously, as you grow your company, you have to manage how that culture will evolve over time. And then also, tying it back, we started to talk a little bit [00:29:00] about data mesh. As you start to decentralize, that can have an interesting impact on culture, right?
Obviously you want the data people in the org to have a cohesive culture, but if they're not working with each other all the time, how do you do that? Nikki actually asked the audience in the channel, does anybody have learnings they'd like to share from seeing orgs go from centralized to decentralized or vice versa, and then tying that to culture? Maybe starting with you, Claire: do you have thoughts about how you want to establish a culture that goes across a decentralized approach, and your thoughts on culture with centralization and decentralization?
[00:29:38] Claire Look: Yeah. I've been answering this question a lot, so again, if any Zebra folks are listening, we're going through our own little centralized-to-decentralized shift. I'm really confident there are some things that are core to our team. Something new we just did was Data Divergence Days, which was our version of a quarterly hackathon.
And that's something that [00:30:00] everyone should participate in, whether you're supporting a core platform team or you're embedded within another team that's helping solve a specific business problem. There are things that I think we just need to continue to establish across data roles. Another one was a data architecture group, and I had nothing to do with that. It was formed by the team when they saw, oh, this is something we're decentralizing.
They said, this is something I see as a need; we need to be talking about these best practices. That's a completely team-formed group, and they've recruited members from different roles and people not within data. Those are going to be more important, because of course you're still going to have your happy hours and your hackathons and all of that to keep the teams together.
But I think there's also got to be some of that guild format where you have people across the company, and that helps with the shared ownership of data too, [00:31:00] because it's not just "that's the data org's problem." We all contribute to the creation and use of data.
And I think that's a good thing.
[00:31:12] Juan Sequeda: I always think about how we talk about the business, or the business domain, the business users, and then the data folks. Are data folks getting involved in the business domain, or is it the other way around? What is the balance that you're seeing, and what do you actually suggest?
Because I think this is another thing: when we talk about data mesh, the different domains, and pushing things out to the domains, there's a lot of blah-blah-blah. We're talking about this in theory. And I've talked to people who have actually done this, and there's still a lot to figure out here.
So what have you guys actually done? What's working? What's not working?
[00:31:52] Meetesh Karia: Yeah.
[00:31:53] Julia Schottenstein: I
[00:31:53] Meetesh Karia: I think what we've seen working is when we have the data team folks working closer with various parts of [00:32:00] the business to understand and get ahead, starting from the very beginning of product development.
That means understanding and helping define what questions we're going to ask and what we need to answer them, and helping inform the decisions we're making around product development. We have folks on the marketing side now who have built a very close relationship with a lot of our marketing team, where they're involved in the decisions, they know what's happening, they can inform, and they're proactive.
They've almost become part of the culture of two different teams. They get invited to events for both teams, and that's where I see it really working. From the business user standpoint, I think it's really about understanding what they need and what problems they're trying to solve.
I think we can pull them into utilizing the data and empower and enable them. But I think it's actually more impactful for us as a data org to go and get closer and understand [00:33:00] what the business is really trying to do. Yeah.
[00:33:06] Claire Look: Oh, I was going to say, I think one of the things that we tried that didn't work, and I think we've always been adapting, is where the role of the data product manager fits within this, because you do need people who are closely tied to the business and really understand that business domain.
But you also have to be very explicit about the role and what people are doing, because you could turn a data product manager into a bottleneck where they're now having to play both the centralized and decentralized role. So that's another one where, as you scale the teams, you also have to scale the role. It doesn't necessarily have to be a data product manager.
It could be an analyst who's the subject area expert in that domain. It's just being very explicit about what we actually need out of this role and what [00:34:00] problems we're seeing in this domain, and then adjusting accordingly, so that you're not just putting people in different spots; you're being clear.
Maybe we already understand the problems of this area, so we don't necessarily need someone to define them. We just need help partnering and really nailing down the requirements, and that might be an analyst. So it's just being clear on the roles and how to support them.
[00:34:25] Tim Gasper: Yeah. So subject matter expertise, and also ownership, is a very interesting topic. You mentioned a couple of things. You mentioned the analyst, who is sometimes playing that role, but you also mentioned this phrase, the data product manager, which obviously was a central topic in an episode that we did together, Claire.
What role are you seeing data product managers starting to emerge in around this whole thing? Do you see them being more of a centralized component, or do you see them playing more of a varied role?
[00:34:59] Claire Look: Yeah, [00:35:00] right now I think they benefit more in the centralized function, because they can really help tie across all these different use cases.
What is that core business domain knowledge that we need to provide? I think that's where the most benefit has come out of that role. Then you have other analysts who are an input into that and are very familiar with the decentralized portion. But I think we've seen the biggest benefit when data product management
is centralized, because that's really where you hear all the different user problems across all these different domains and ask what we need to build to support them. That gives you more of the power of that product mindset, thinking through the different personas and use cases of the data.
[00:35:49] Juan Sequeda: This is really interesting, because I was not expecting that answer. If you think about the whole data mesh, the decentralization and the domains and so on, [00:36:00] the way it's pictured is that every domain has its own data product. You take a data product, and when you look at all the diagrams, this product gets combined with another product and generates something new.
But you're saying no, the data product managers would be centralized. And I'm saying, okay, I buy that when you're smaller, but at some point, how is that going to scale? I can see that you want to have data product managers for the central team, but then won't you also want them for the decentralized teams? Because in the end, and I've used this phrase before, you want them to be like liaisons, right?
I'm the product manager for the central team. I know what's going on here. Both of you should get your stuff together, because I need to go combine it, and I have a broader context. So that I can imagine, but so
[00:36:51] Meetesh Karia: It's an evolution, right? Because right now, to Claire's point, they're understanding all the needs for the data, but then it transforms, I think, into being [00:37:00] about the platform: building the tooling, building the processes, building the thing that supports the data mesh, the decentralization.
And that's when you then go and add the data product managers in the various domains, who work with their counterparts on the centralized data platform team to make sure that the tools being built, the processes, and so on, that support people building their own domains all work together.
[00:37:32] Tim Gasper: And depending on your company, your scale, and your use cases, it seems like different roles maybe get spun out to the spokes, to the decentralized aspects, in a different order. Maybe at a certain scale analysts start to become more domain oriented, at another scale maybe even your analytics engineers start to become more domain oriented, and at yet another scale, maybe before or after that, your data product managers start to become more [00:38:00] decentralized.
And you know what, this makes me think about some of the conversations that we have. I'm a product guy, and we have conversations about team topology and roles all the time, right? Are you a product manager and a product owner and a scrum manager, or are you just a product owner and a scrum manager, but not a product manager?
And you start to get into some of these topics, right?
[00:38:23] Meetesh Karia: Yeah, exactly. I think you definitely have a progression there, right? You can have an analyst in that role for a certain amount of time
to help, because the analyst is going to be the expert. They're going to understand the data.
They're going to understand what we need to produce. At some point, though, it grows large enough that you're like, okay, this is a separate product. I'm not just serving the central set of data; I'm also serving other use cases of this data. Yeah.
[00:38:49] Claire Look: I agree. That's the state that we get to, where once you move into a decentralized environment, [00:39:00] you can have data product managers who are really domain experts in the area.
I think when you have a centralized team with decentralized product managers, that's when you get into some challenges from asking them to do both. So it's, okay, focus on your foundation first, then scale it out, then figure out what the problems are and whether you can add this role. So yeah, it's definitely a progression, and it depends on the state and scale that you're in.
[00:39:29] Tim Gasper: Yeah. So we're talking about roles right now, and we've got a few roles that we've mentioned: data product managers, analytics engineers, analysts, data engineers. Kurt Lancing in the channel asked an interesting question, or maybe more of a comment. He said, I never hear about database developer roles anymore. Database developers maybe worked more on transactional systems, whereas we're having a little bit more of an OLAP [00:40:00] conversation than OLTP. What is the future of that role? Is it dead? And actually,
it brings up a broader question. There's a whole class of data roles like database developer, data integration engineer, BI developer, right? Some companies are hiring for these roles, but you don't hear about them as much. I don't know, maybe starting with you, Meetesh: do you see that some of these roles are fading?
Is this more like modern stack versus traditional stack? Curious as to your thoughts here.
[00:40:33] Meetesh Karia: The no BS answer is that up until the end of last week, I might've answered this differently. But I actually had a really interesting conversation with our principal data architect about the role of a DBA, which is very much aligned with what was described as a database developer there, in a modern data stack.
And there's still a need for that role, though it's [00:41:00] different. It's all about ensuring the data is modeled in such a way that we can do other things with it, and ensuring that we're taking into account security, privacy, and other parts of governance practices.
Even our runtime data is still under CCPA and GDPR and a lot of these other things.
And we're getting to a point where we have developers coming in and working with higher and higher level frameworks and languages that abstract away a lot of what's happening under the covers in the database. So you have fewer developers who truly understand how to model data:
what's critical, what's important, and how it affects things downstream. So that role is maybe shifting, but it's still important. And like I said, no BS, my mind was changed at the end of last week, because I thought, oh, these roles are going away. But [00:42:00] maybe some of them are.
[00:42:03] Juan Sequeda: Huh, again, another response I did not expect. I thought you were going to say, yeah, the modern data stack, it's all in the cloud, that's finished, we don't need that stuff. But,
[00:42:13] Tim Gasper: Oh yeah. And the data architect is a very interesting role here. Is it the analytics engineer's responsibility to architect the right model?
Some of these things maybe turn into hats that you wear, but maybe there is still a role for some of these traditional roles. Apologies for the repeat of the word there, but there's still a role for these roles, right?
[00:42:37] Julia Schottenstein: I,
[00:42:37] Meetesh Karia: I would say I think there is, because there's one thing I do every year. On my birthday, I look back and I'm like, man,
I was stupid last year.
Why did I do that? And I think back to 21-year-old me, out of college. I was ready to go and I thought I knew the [00:43:00] world. I was like, who needs experience? And then, as is probably cliche across generations, I look back and go, man, yeah, experience matters so much.
When you talk about the architect's role, the experience may not have been in a modern data stack, but the experience of having worked with data, seeing the issues that come up, and seeing what works and what doesn't over decades is invaluable, right?
There's no way you can gain that without having done it. So I do think that role is critical. And I think there's a lot that teams can gain from bringing on someone like that. Maybe the term analytics engineer didn't exist back then, dbt didn't exist, catalogs didn't exist, modern BI tools didn't exist, but a lot of the goals and a lot of the work of dealing with data still did.
[00:43:53] Juan Sequeda: So, honest, no BS: what is it that you've realized in this past year that you should not [00:44:00] have done?
[00:44:05] Meetesh Karia: I actually realized what I should have done, which is that I spread myself too thin, and I should have dug deeper into some of the work I was doing. I like to get things moving. I'm definitely about getting work going. I have conversations, I talk to people, I get them going, but I don't always pull that picture together and paint it for a lot of other people,
to get one cohesive picture for people outside of the teams doing the work. So that's something I continuously learn: hey, maybe that's part of my job now.
[00:44:44] Juan Sequeda: How about you, Claire? What did you realize this last year that you should not do anymore, or should continue, or something positive there?
[00:44:53] Claire Look: What I should not do. I think one, especially with growing [00:45:00] teams, is being quicker to make changes within the org. There's always going to be a different scenario, and sometimes you've just got to try it. When you start seeing issues, like we're a bottleneck, okay, what can we do about that?
I think you can make smaller tweaks along the way, instead of thinking you need to have all the answers before changing something, or making a big bang change at the end. So especially when growing data teams rapidly, it's just being more willing to make small tweaks along the way when you notice those problems, and then measuring how you're doing against what you expected, or whether the problems are improving.
It's just like iterative development in software. Okay, let's do that with test and learn on the org structure side, because all of us are [00:46:00] learning some of these new concepts and ideas together. So yeah, it's about really leveraging the community here and people who have done it before, and testing out what works and what doesn't within your org.
[00:46:15] Tim Gasper: That makes sense. As we transition to our last topic here before the lightning round, just a quick shout out to everybody who's listening right now. After this next section, we're going to do what we call the lightning round, where we ask yes or no questions.
If you're hanging out in the Slack workspace, please hop into the Coalesce Catalog & Cocktails channel and feel free to drop in some yes or no questions, and we'll incorporate them into that last segment. And with that, there's just one other topic that I know we were interested in hitting here.
We talked a lot about the data stack and the analytics engineers working on the transformation, the warehouse, integration, and [00:47:00] BI tools like Looker, for example. But obviously another key element that is increasingly being incorporated into the modern data stack is the metadata component,
whether it's things like catalog, governance, or observability. I'm just curious,
from your perspectives, maybe starting with you, Claire, and then moving to Meetesh: what role are catalogs and other metadata-oriented tools playing in your data strategy and in what you're doing with your data teams?
[00:47:30] Claire Look: Yeah. It's the biggest piece we were missing. We were moving fast from a technical perspective, in terms of how do we model this and how do we make it available. But then you get to the point we talked about earlier, where for someone onboarding onto data, it becomes very complex to traverse your giant DAG.
So then you've got to say, okay, who owns this field? And [00:48:00] we just added our company OKRs, our key results, and we have executive sponsors for each of those. So where do you store that: on a Confluence page, or directly tied to the actual metric? You look at your data and you're like, okay, the missing piece here is the context around it, who I go to when I have a question, and all of that information around how frequently it's updated.
Some of that is just living in our centralized team's heads. That worked when we were a team of four and could manage that information. It doesn't work when you scale. So I think that's where the metadata component comes in, because this really cannot scale if it lives in your head.
So where's the best place to put it? We are implementing a catalog for that.
[00:48:55] Meetesh Karia: Yeah. And I'll tack on the governance piece, which is, [00:49:00] in my mind, one of the trickiest problems, or the one that is the most opaque to solve, around moving to something like a data mesh or scaling: how do you deal with federated governance?
How do you deal with that? In my mind, that's where data catalogs and metadata-based tools really play a part. There's no way you can distribute and scale your organization if you don't have some way of making sure the data is trustworthy, making sure there's quality, and making sure you have privacy and security, right?
All of those things need to be built in. Otherwise you have every team doing that themselves. And if every team is doing that themselves, then what's the point of the modern data stack and the world we've moved towards?
[00:49:50] Juan Sequeda: Let's be really honest and no BS here. We hear this term a lot, federated governance and federated computational governance, coming from the whole data mesh [00:50:00] stack. What do we mean by this? And can we get really concrete? What is an example of this federated governance?
[00:50:11] Meetesh Karia: I'd say at the very basic level, it's probably just some standard set of processes, right?
Hey, here's how you go about certifying dimensions and certifying bits of data,
at the very basic level, and then tracking that it was done, tracking the audit log of it, and assigning owners. In my mind, that's probably the most concrete, but also probably the most basic. And I'm sure there are a bunch of people listening and others who are like, there's way more than that. Absolutely, but in my mind, that's the first basic thing that comes to mind.
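As a rough illustration of the basic process described here (certify a piece of data, assign an owner, and keep an audit log of both), the following is a minimal sketch in Python. Everything in it is hypothetical: the `DataAsset` and `AuditEntry` names, the field names, and the rule that an asset needs an owner before it can be certified are assumptions made for the example, not anything the speakers describe building.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class AuditEntry:
    """One line of the audit log: what happened, who did it, and when."""
    action: str
    actor: str
    at: datetime


@dataclass
class DataAsset:
    """Hypothetical stand-in for a dimension, table, or metric."""
    name: str
    owner: Optional[str] = None
    certified: bool = False
    audit_log: List[AuditEntry] = field(default_factory=list)

    def _log(self, action: str, actor: str) -> None:
        # Record every governance action with a UTC timestamp.
        self.audit_log.append(AuditEntry(action, actor, datetime.now(timezone.utc)))

    def assign_owner(self, owner: str, actor: str) -> None:
        self.owner = owner
        self._log(f"assign_owner:{owner}", actor)

    def certify(self, actor: str) -> None:
        # Assumed rule for this sketch: no certification without an owner.
        if self.owner is None:
            raise ValueError(f"{self.name} has no owner and cannot be certified")
        self.certified = True
        self._log("certify", actor)
```

A team would call `assign_owner` and then `certify`, and the `audit_log` list would show who did what, in order.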
[00:50:49] Juan Sequeda: All right. So, okay, we do that. What's next? How complicated or complex should we get? That probably sounds too negative, but [00:51:00] sophisticated, let's call it that way.
How much more sophisticated should governance get, or what's the minimum? Tim and I have been talking about this too. What's the minimal viable policy? Let's call it an MVP. What's the minimal stuff you should do, right?
[00:51:20] Meetesh Karia: I guess it depends on the company, right? We've built a tool that helps us process CCPA requests, right?
So surfacing user data, deleting user data. It seems like in a world where we have federated governance and need to deal with compliance in a distributed way, we'd want to build tooling to support that.
We wouldn't want to make every single team solve that problem in their own way. So that's more than just a process, more than just a checklist.
It's, hey, this tool has to be built into every bit of data, every notion of data that you provide, and [00:52:00] here are the steps to go do it, and here's the tool.
So I think it goes from processes to tooling, to support some of the things that are a little bit more complex and a little bit more advanced.
[00:52:12] Juan Sequeda: And we're obviously very biased here on catalogs. We definitely believe that catalogs and governance are a key part. I just feel that sometimes it's not really a big part of the conversation in the landscape of data. People say, oh, you need to have governance,
and it's off to one side. But it truly covers the entire landscape of data. And we see it being separated, too. You have the metadata and you have the governance, the policies, but you really want this stuff connected to the data.
We see that sometimes it's just documentation, but it needs to be more than that. Imagine a world where everything is completely executable. Imagine [00:53:00] that the policies we define are executable code and not just English definitions, something that people can reuse, check out, and apply.
I think we're so far from that, and we're really not thinking about it because we're just so excited about, let's just get the data and transform it. Governance has usually been this, oh, it's just for the CCPA, the GDPR stuff, but there's much more than that.
I've been having conversations with folks about, what is a telephone number? How do we define what is a telephone number and what is a mobile phone number? Oh, a mobile phone number. Wait, if it's a mobile phone number, it must have TCPA consent. It must have SMS: can we send SMS to them or not?
Or voicemail. That's how we know what a true telephone number is. And we probably know what this is. It's in people's heads; maybe it's documented. But if we're going to start generating data, I need to know, hey, that's a valid telephone number, or it's not, [00:54:00] or you're missing something.
And if it's not, tell me why it's not and what's missing. I think we're not there yet, and I think this is where governance is key and where catalogs are going to play a key role. It's not just about bringing in what my tables and columns are and how they're connected.
It's really documenting and implementing this in an executable way. That's why I'm really excited about how governance and catalogs are going to play a key role, because metadata touches everything. Anyway, I just went off on this long ramble here.
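To make the idea of an "executable policy" a little more concrete, here is a minimal sketch in Python of the telephone number example: a mobile number is only treated as valid if TCPA consent and an SMS permission have been captured, and the check reports why a record fails. The `PhoneRecord` shape, the field names, and the seven-digit minimum are all assumptions invented for this illustration; the conversation does not specify any of them, and a real policy would be defined by the governance owners.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PhoneRecord:
    """Hypothetical record shape; field names are illustrative assumptions."""
    number: str
    is_mobile: bool
    tcpa_consent: Optional[bool] = None  # None means the consent was never captured
    sms_allowed: Optional[bool] = None   # None means the SMS permission was never captured


def check_phone_policy(record: PhoneRecord) -> List[str]:
    """Return the reasons a record violates the policy; an empty list means valid."""
    problems: List[str] = []

    # Assumed sanity rule for this sketch: at least 7 digits in the number.
    digits = [ch for ch in record.number if ch.isdigit()]
    if len(digits) < 7:
        problems.append("number has fewer than 7 digits")

    # Mobile numbers carry extra requirements, per the example in the discussion.
    if record.is_mobile:
        if record.tcpa_consent is None:
            problems.append("mobile number is missing TCPA consent")
        if record.sms_allowed is None:
            problems.append("mobile number is missing the SMS permission flag")

    return problems
```

Because the policy is a plain function, any team producing phone data could import and run the same check, and a failing record comes back with the specific reasons it is not valid rather than a bare pass or fail.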
[00:54:33] Claire Look: I was thinking about how easy it is to build data products
if you start from that foundation, where you're thinking about the governance, the metadata, and the policies around each individual piece, telephone number versus mobile number. Then think about the product teams: great, I can go use this, in this way, over in this tool, and I know how it works.
[00:54:54] Meetesh Karia: It makes me excited, right? Because you look at it and you're like, oh, there's a lot of movement, there's a lot of growth in the modern [00:55:00] data stack and data orgs, but there's still so much more to innovate and solve. This is not a solved problem at all.
[00:55:07] Juan Sequeda: I've been having conversations with a lot of other companies and customers and so on.
What we imagine is that you have this model defined for customer: what is truly a customer from a modeling standpoint, and what is a telephone number, for example. So then I say, I want to go find data. I should be able to click customer, click telephone number, and see what I should be able to get.
Here's the minimal stuff that you need to have to know it's valid, right? And as you can imagine, modeling and transformations are all key to this. Wow. I think we've got a few minutes left. We've got to start winding down.
[00:55:44] Tim Gasper: We covered some good ground here.
[00:55:47] Juan Sequeda: I'm ready for my next cocktail now. So
[00:55:51] Tim Gasper: Should we do a lightning round? We've got some good contributions from the chat.
So who's first? Juan, you want to do yours first? [00:56:00]
[00:56:00] Juan Sequeda: All right. We got this from Nickeel: the metrics layer. Is this going to become a thing? Yes or no?
[00:56:10] Meetesh Karia: Not sure. I'm going to say no, because I don't know what it is.
[00:56:14] Juan Sequeda: All right, Claire?
[00:56:17] Claire Look: I think yes.
[00:56:20] Juan Sequeda: All right. Go, Tim.
[00:56:22] Tim Gasper: I like that. We've got two data points here. One is that we're still learning about the metrics layer, right? And then also, how useful is it going to be? So, next lightning round question, from Julia: do companies need a machine learning, an ML, strategy?
Maybe starting with you, Claire. Yes or no?
Yes. Yes. All right, we got two yeses.
[00:56:47] Juan Sequeda: All right. So we got this one from Gregor: is it even possible to build a modern data infrastructure with open source only?
Yes or no?
[00:56:59] Meetesh Karia: [00:57:00] Can I answer with a "but"?
[00:57:01] Juan Sequeda: Go ahead. Sure.
[00:57:04] Meetesh Karia: Yes, but I wouldn't necessarily recommend it. Okay.
[00:57:12] Juan Sequeda: Tim and I have a podcast episode on the build versus buy debate, so y'all should listen to that one.
[00:57:20] Tim Gasper: Is open source a build or a buy?
[00:57:22] Juan Sequeda: Yeah. And
[00:57:23] Tim Gasper: Claire, any comments on this topic?
[00:57:27] Claire Look: Oh, I was going to say, that was going to be my advice, if we even had time for advice. I was going to say buy. There is the combination, but I think my mind often goes towards buy if you're a small team growing quickly.
[00:57:45] Juan Sequeda: You go, Tim.
[00:57:46] Tim Gasper: All right, next question. This one is just a funny one: light theme or dark theme, Claire? [00:58:00]
[00:58:00] Claire Look: Dark. All right.
[00:58:02] Juan Sequeda: All right. Final one from Curt. Are there too many data tools?
[00:58:10] Claire Look: Yes.
[00:58:12] Meetesh Karia: Yes.
[00:58:13] Juan Sequeda: So for those who are only listening and not seeing our faces, just imagine somebody kind of moving their head side to side, right.
[00:58:24] Claire Look: It's a landscape I would want to be in, so I don't blame people for wanting to expand our ecosystem, but yeah.
[00:58:34] Tim Gasper: And it's a hot space, right?
[00:58:36] Meetesh Karia: It's a hot, growing space, so it's expected.
[00:58:39] Tim Gasper: Yeah, but do you know what tool's in the center of the modern data stack diagram?
[00:58:44] Claire Look: What, like data.world?
[00:58:49] Tim Gasper: Hey, this is what's happening. Ah, crap.
[00:58:56] Juan Sequeda: All right. We're almost done here. So it's [00:59:00] our takeaway time. Tim, take it away with takeaways first.
[00:59:04] Tim Gasper: Awesome. I loved our conversation about culture, right? I love some of the suggestions that you gave, Claire, around ways that you can make data culture work,
even if you're decentralized, even if you have a large organization. You mentioned Data Divergence Days, a recurring hackathon; data architecture subgroups, especially if they can happen in a bottom-up way where people are organizing it themselves; and happy hours. And thinking about the different roles and how they work well together, right?
Folks like data product managers, data stewards, and analytics engineers can all work together to really create that culture and make it work for any scale of organization. So I thought that was great. Meetesh, you said that experience matters so much, and I think that's refreshing, right?
Because in a world where the modern data stack is such a big center of attention, we often think of new tools, new tech. Do you have experience in this new tool? Can you pull this off the [01:00:00] shelf? But there's so much knowledge that has been created around data management that we can be leveraging,
and we shouldn't forget our lessons from the past. And then finally, catalogs, right? A catalog is a way that you can actually get into this metadata layer, collect that knowledge, and manage that governance, even federated governance. And obviously that can be a key part of the stack. So those are Tim's takeaways.
What about Juan's takeaways?
[01:00:25] Juan Sequeda: We started this conversation almost an hour ago asking, hey, why is dbt such a big deal? And I think before dbt there was no consistency in how we would transform data. We would just do it over and over again, using some tools and SQL scripts that we could not reuse, right?
This was a big pain, and finally dbt has packaged this into a product where we can say, hey, this is how we can reuse this. And I think there's an evolution. We're starting to realize how we get there. Also, from the team perspective, I think it's [01:01:00] very traditional:
we'll start small and be centralized to get consistency and to guide shared hiring. And then that's how we start to find this balance of what we're centralizing and decentralizing. And the honest, no BS take on the analytics engineer: it's not tied to dbt. It is about transforming and modeling raw data into the data that the business needs.
Call it analytics engineer, call it knowledge engineer, call it foo, whatever. That's what's going on right now, and I think that's the game changer in the industry. Meetesh, Claire, thank you so much. I want to throw it to you very quickly for final words. What's your advice about life, about data, about anything?
[01:01:43] Meetesh Karia: I can go first, which is, it's something that I tell my kids a lot, especially when it comes to food. It's like, you can't say you don't like something unless you try it once. And this goes to the experience point and something Claire mentioned too, which is, there's no way you're going to build [01:02:00] experience without trying. So try. Don't be afraid. Try things.
[01:02:07] Claire Look: Yeah. I'm going to go off that. And it also goes to my role. I had zero data experience, but I had a Redshift database, found Stitch, and found Tableau on top of it. So I was like, I'm a full stack developer. I just went from database to ETL to... but I think that's where I'm like, take advantage of tools and how many there are in the market, because you can turn someone with zero experience but an interest in data,
that's all I had, into someone who starts to understand the different roles and components that go along with it. So that's definitely, yeah, you don't always need all the coding experience in the world, because you just kinda need a problem-solving brain and a love for data. And I think everyone in this group and everyone listening has that.
[01:02:56] Juan Sequeda: Hey, for everybody who's listening: if you like what you listened to today, this is [01:03:00] our show, Catalog & Cocktails. We do this live every Wednesday. We've done it for, I think, 65 episodes now. And please subscribe. You can find us on all your favorite platforms. And we do this live on Twitter and LinkedIn and all that stuff.
And we're starting to schedule for next year. And we really want to reach out to the practitioners, the folks who are really implementing these things, because we've talked to the vendors, we've talked to everybody who's pontificating and all that stuff. Like, let's go talk to people who've actually rolled up their sleeves.
So if you want to be on the podcast, just shoot us an email. I'm juan@data.world. It's super easy. And with that, Meetesh, Claire, and dbt, thank you so much for this opportunity. This was fantastic. And cheers.
Last modified on: Apr 19, 2022