Keynote: How big is this wave?
The modern data stack is the third generation of data analysis products to come to prominence since the 90’s.
The prior waves, data warehouse appliances and then Hadoop, were both big steps forward but ultimately failed to live up to their initial promise.
Is the modern data stack just another iteration in a long string of “trendy technologies” in data, waves that crash upon the shore but ultimately recede? Or is it somehow more permanent?
The answer to this question drives how we think about the future: how much we invest in skill acquisition and our own career paths, and how companies think about both technology investments (easy) and organizational change (much harder).
In this talk, Tristan Handy and Martin Casado make the case that the arrival of the modern data stack is “the end of the beginning” for data analysis and ask what that means for all of us in the industry.
Browse this talk’s Slack archives
The day-of-talk conversation is archived here in dbt Community Slack.
Not a member of the dbt Community yet? You can join here to view the Coalesce chat archives.
Full transcript
Julia Schottenstein: [00:00:00] Welcome everyone to the keynote talk at Coalesce 2021, ‘How big is this wave?’ I’m Julia Schottenstein and I’m part of the product team at dbt Labs, and I’ll be your host for this talk. In this session, we’re going to hear from Tristan Handy, founder and CEO of dbt Labs, as well as Martin Casado, who is a general partner at Andreessen Horowitz and our investor and board member. The data stack is having a renaissance and dbt is at the center of it all.
I won’t spoil any of their talk, but my guess is Tristan and Martin will argue that this wave is pretty big. I’m excited to be here alongside more than 10,000 of you joining for Coalesce. It’s this community that powers the movement forward. We invite you to contribute to the conversation in the dbt Slack channel #coalesce-keynote-wave.
Feel free to [00:01:00] share your thoughts and ask questions of Tristan and Martin. We’ll follow up with folks after the session. So without further ado, over to you, Tristan and Martin.
Tristan Handy: Thank you, Julia. And Martin, good to have you.
Martin Casado: Glad to be here, Tristan.
Tristan Handy: Are you doing a Peloton workout as we do this session?
Martin Casado: That is actually an elliptical, between meetings.
Tristan Handy: Let me set this conversation up, and then I think you’re going to do most of the talking here. I get a lot of feedback from folks who have been in the data industry for a little while. They want to talk about the term “modern data stack,” and they have a sentiment that is something like, “We’ve seen this movie before.”
And especially for those folks who have invested millions of dollars in Netezza boxes, there’s a little bit of scar tissue that is very justifiable. I think of the industry in three phases. Wave one is the data warehouse appliance. Maybe that’s early to mid, maybe to [00:02:00] late 2000s.
And then wave two is the Hadoop ecosystem. You’ve got companies like Cloudera and Hortonworks, late 2000s to the mid 2010s. So this is the trajectory that I think we’re on. And these folks that I hear from, they’re like, “This is just another wave.” And the temptation is to say, “This is trendy.
I’m going to sit this one out.” But I think that’s not the right way to understand this. I think that what we’re involved in right now is significantly more persistent than a five-year trend. And I think that it’s going to impact all of our careers and the way the future goes.
And for everyone on the line here, the reason I wanted Martin to join us is that he has a unique perspective on technology cycles. Martin’s certainly a prolific investor in data tech, but he’s also responsible for creating and commercializing a piece of infrastructure that is now central to the way that the cloud works.
So I’m hoping to steal some of the [00:03:00] lessons from your earlier journey, Martin, and see how much they apply here in what we’re up to. So maybe we can start out here. What’s software-defined networking?
[00:03:12] What’s software-defined networking?
Martin Casado: So software-defined networking is an umbrella term for viewing networking as a software problem as opposed to an ASIC problem.
And it now covers many things, but the way that I like to describe it is: say 20 years ago, if you were to build a network for, I don’t know, running a very covert intelligence organization, you’d buy boxes from Cisco or Juniper and then you’d plug them in, and you could configure them with a CLI, and that’s where you were.
But if Cisco and Juniper hadn’t anticipated the type of security posture that you needed, because they don’t normally sell to covert intelligence agencies, there’s kind of nothing you could do. There [00:04:00] was no real programming interface. And this was very different than, let’s say, operating systems at the time.
So in operating systems at the time, you could go in, and you’ve got Linux, and you could muck around and program things that were just specific to you. We actually had a lot of evolution in operating systems at the time because of that, but you couldn’t do that with networking. There was no model.
So software-defined networking, as an umbrella term, is: how do you make networking as flexible as, say, operating systems or distributed systems, so that you can program them and make them fit for whatever task you are interested in?
[00:04:27] How did you get involved?
Tristan Handy: Tell me the story. How did you get involved?
Martin Casado: My first job, I’m a failed physicist.
My first job out of college was working at a national lab, and I did computational physics. I worked actually in the weapons program at the time. There’s a lot of modeling going on of nuclear weapons. And I was there when 9/11 happened, actually. And so it was weird,
’cause I was in this anachronism, a holdover from the Cold War. It was like, nuclear weapons, which kind of weren’t that relevant. And there was a very different posture for the government at the time, which was now the terrorist threat. And because I had a lot of the [00:05:00] clearances, they moved me from being basically a distributed computational scientist to the intelligence community, where I worked on the war on terror. And so my background was very much operating systems and distributed systems. I wasn’t a networking guy. And so, as I alluded to before, I was responsible for these very deep, covert infrastructure build-outs that we were doing.
And it was just so obvious that you didn’t have the same type of flexibility that you would have, let’s say, if you were trying to do this on just a computer. So I moved on from that. I worked in the intelligence community from Afghanistan through Iraq, and I moved on to do my PhD at Stanford, and there I just focused on this problem.
And at the time it was very interesting, because at the time, so this was like 2004-ish, networking gear, which kind of grew up in the public internet era, was being pushed into all sorts of different areas that it wasn’t really built for. So it was being pushed into large data centers, which were pretty new at the time.
It was being pushed to build large mobile networks, which were pretty new at the time, ’cause the iPhone was coming out. And so you have this kind of old architecture [00:06:00] that was now being used for all these different use cases that we hadn’t really anticipated. And so it was just a good time to rethink: maybe you could redo the architecture to make it a bit programmable.
And so that’s what we worked on, and I ended up spending the next 15 years of my life on it.
Tristan Handy: So the part that feels really resonant with what we are working on here: certainly the cloud data warehouse is programmable in the ways that matter. When we’ve talked about this before, the split of the control plane and the data plane, how do you make something that’s fundamentally on a box scale to many boxes, is the problem I think that we started to solve in the 2012, 2013 time horizon. Is that the fundamental unlock that you were working on at Stanford, you and others?
Martin Casado: Here, I actually think it’s probably going to be helpful for you and me to step back a bit and ask, why are we even in the third wave?
I think we [00:07:00] can unpack a little bit about what’s going on, because I actually think that markets create the technologies, almost, not the other way round. And I think that’s very much what’s going on here. So here’s how I like to say it: okay, I’m an investor, and there are many things I don’t know, most things I don’t know. But one thing I do see is a lot of companies. And here’s the biggest shift, I think, certainly in infrastructure, in the last, forever, since the beginning of compute. Let’s say five different dog-walking companies come in to pitch me over a couple of months.
And what they do is they pair dog walkers with dogs. Now, in the past, say five, six years ago, the ways that they would differentiate were going to be things like, “Oh, we scale, we’re seamlessly mobile.” It was going to be a lot about how they write the software.
Like, “We’re very fast, we’re very flexible,” etc. That’s basically the pitch most companies had, because software as a big, global, [00:08:00] distributed technology wasn’t very mature. So it was very much the focus of the questions and the answers. Today, if these five dog-walking companies come in, almost all of the differentiation, maybe all the differentiation outside of brand, is how they handle data, right?
It’s: what is the matching algorithm between dogs and people? How do we predict pricing? What about fraud? Is this really a dog walker? There are all of these things that you actually differentiate on, and none of them are actually core software. So if you wanted to come up with a mental model of how the world has changed, and this is why this isn’t just the third wave, like the other waves, it’s that the world has changed.
And now data’s catching up. Today, if you’re building a company, especially an app, like a SaaS app, the way that you differentiate is through how you manage data. But the data ecosystem just doesn’t have the level of maturity of the software ecosystem, because in the past so much of the differentiation was in what the software does. Does that make sense?
So I [00:09:00] would say, very similar to my SDN journey, the world now is demanding more of the technology. In the SDN journey, it wasn’t that we sat in a room and came up with, “Here’s a great way to do technology.” Nothing we came up with was fundamentally new, but the world was ready for it.
It was like, we’re taking these networks that were built to connect, whatever, universities, and we’re putting them in war zones. That’s not going to work. Oh, we’re taking these computers that were built to connect scientists and we’re trying to build Google out of them. That’s not going to work.
And so the world now demanded more. And as a result, we had to sit back and think, okay, what can we do to evolve that? And again, this is what I think is happening with data.
Tristan Handy: So wait, was that the beginning of a sentence or in the end?
Martin Casado: That was the end of the sentence, but I said, but please ask your question.
Tristan Handy: I’m with you, and you’re ruining the punchline a little bit, because I want to get there. But one of the things that I think folks in data are a little bit used to is [00:10:00] that we use this technology that then doesn’t have more layers that sit on top of it.
There weren’t more layers that sat on top of the data warehouse appliances in the 2000s, right? It was the end of the line. But my understanding is that, whether or not you’re going to take credit for doing anything fundamentally new, the technologies that you were involved in building in this timeframe now no longer sit at the top. They’ve now had stuff layered on top of them.
And that is, I think, one of the most interesting questions about the next five to 10 years for the modern data stack: what other stuff is going to get layered on top of data ingestion, data warehousing, data transformation?
Martin Casado: A hundred percent. Totally. Yeah, I’ve got this fundamental belief. I’ve been in systems basically my whole life, right? As an independent researcher, as an entrepreneur, or whatever. And my pure belief is that layers never go away. We just add on top of them, or sometimes we delaminate them.
You can almost think of all of computing as: you take a mainframe and then you just keep delaminating it. [00:11:00] And so I’m going to use another analogy, and I’m going to back up into what you said. It turns out when markets expand, you normally take fixed-function things, and then you pull out pieces of that, and you can build entire industries or companies on that.
And my favorite example is what happened to the auto industry. In the early 1900s, there was a Ford plant called the Rouge River plant. It was in Dearborn, Michigan, and this is early on, when there wasn’t a huge market for cars. And literally, into this plant went coal and rubber and iron ore, and out came cars. That was the sophistication of the industry.
It was the mainframe. It was literally rubber going into this thing, whatever. But what’s interesting is, if you could look at that car, you could almost predict the future, because if you look today, the automotive industry is a multi-trillion-dollar industry.
And it is composed of multiple tiers of suppliers, and even independent bolts and belts and this and that can be entire [00:12:00] companies. It’s created this entire industry. And so this is very much what happened with networking. And then I’m going to get to the data part. So what did we do?
Networking was never really a distributed systems problem. It was always, you just assumed everything was going to be eventually consistent. You assumed full distribution and that’s what you built. And so what we did is we almost took a single switch or a single router.
And we were like, what happens if you delaminate this and pull this apart, and you can scale this to the size of the. And then independent people could build intimate parts of that and add value to that. It’s almost this delamination thing that happens. Now, does it get rid of any switches or routers?
Of course not. There are still switches and routers today. If you’re building the most crazy crypto startup, where you’re creating your own whatever, you’re still using switches and routers. It’s not like they go away. And so the same thing, I believe, is happening to the data industry. One way to think about it is, there is this platonic ideal of: I’m going to have [00:13:00] this cloud data warehouse in the sky, and it’s going to have all of the tooling for it, and it can do all of, whatever, on top of it.
But the reality is, because the market is getting so large, you’re going to have to delaminate that. You’re going to have to pull that apart. And then what you consider tools, or even features of tools, are going to become entire companies. They’re going to be pushing.
But in order to do that, you need to really think very carefully about things like open source, because it is such a strict requirement. Everything that we did in the early networking days was open source, because you need an industry to move. You need open standards.
You need to think about interfaces. You need to think about how these tools, and this is why dbt is, because in many ways it’s this glue that stitches together this massive expansion phase for the data industry.
[00:13:47] Talk about your product being used inside of Google data centers
Tristan Handy: Talk about your product being used inside of Google data centers. That story blew me away.
Martin Casado: Okay. So let me just talk about what we did to begin [00:14:00] with. It used to be, if you were building a data center, you’d grab a bunch of switches, and then you’d connect all the switches together to make effectively one large switch, except it really was just a bunch of different switches.
Then, to manage them, you’d have to go in and manage all of these different switches. And the way that networking algorithms work is, if one link failed, it would flood all the links to everybody else. And it was really optimized for these densely connected graphs that were, like, local networks or the internet.
It wasn’t built for that sort of density. And so what we did is we decided to build a general operating system that would connect to multiple switches or routers, and then you could program it for whatever you wanted to. So we built this kind of operating system that would connect to all the switches around us, and we were programming it to make virtual data centers,
so that network management was similar to the management of a virtual machine. You can create things virtually and move them around, etc. But it was a general operating system. And so [00:15:00] other companies picked it up for other things. Google, early on, picked it up to use in two ways: one of them to actually run the data center fabric,
so they could abstract it as a single switch and actually optimize failures, and also for their backbone. Now, remember, this is a very thin, almost interface layer, and they built really fancy stuff on top of it. But you can actually look up Onix, O-N-I-X. That’s the paper, and it also mentions some of the use cases.
Tristan Handy: So I think that this mirrors a lot of my own experiences. You know the story of our early days: dbt was originally created to be a tool that was used by me and Drew in doing consulting work. And as much as I’m a Richard Stallman fan, and like the classic open source and all this stuff, I had never gone on an open source, open standards journey myself.
And so there’ve been these points in time over the past five and a half years where dbt, now, we haven’t made our way into [00:16:00] the core backbone of Google data centers yet, so that’s still pretty cool. But dbt shows up in all these really unexpected places. And in the early days, you could never have convinced me that it would spread in the way that it has.
And so I guess maybe my question for you is just: do these things take on a life of their own? Like what dynamic is actually going on here?
Martin Casado: So I love it. It’s funny, like when you do something new and you watch it take hold, like the signs, the indications that things are happening.
So I remember when I first got my first SDN spam, I was like, what? You’re like, it’s and then I remember
Tristan Handy: Somebody was trying to sell you SDN?
Martin Casado: No, it was just, no, it was like one of these conference spammy things where they’re like, oh, come to this, like online conference, like whatever SDN, it was just it’s just like they’re using it as a buzzword.
And then I remember when companies started talking about it. I remember when people started putting it on their LinkedIn. And I think [00:17:00] actually, I just want to go back to the previous comment that I had, ’cause I think it’s so relevant, which is: so often we look at technologies and we’re like, “Oh, this technology has X and Y, and therefore it’s great, and therefore people use it.”
But what I strongly believe is the reality is that the market need evolves, and sometimes it creates vacuums. And then the thing that’s best suited to fill the vacuum gets pulled into that vacuum. And that’s the expansion that you see. It’s not like you woke up in the morning one day, Tristan, and were like, “I want people to get dbt spam.”
Like, I didn’t wake up in the morning and say, I want people to have some hokey conference, or this or that or the other thing. But there’s an actual need. And then it’s our job as a community, and it’s our job as innovators, to, as quickly as possible, make that technology fit the market need as it’s evolving.
So I love looking at these almost pulse or level indications of the need. And I’ll tell you, and this is one of the reasons why dbt is so interesting: across Andreessen, if you want to get a sense of the size of the data movement, I would say across a16z [00:18:00] probably the most activity and entropy outside of crypto, which is really pretty significant, is data.
So just the sheer number of companies in the data space that are being created and are evolving, it is unlike anything I’ve ever seen. And it’s fantastic. And I’d say a good number of those are actually very focused on open standards. They use dbt, they’re focused on open source.
So I do think that it’s also an accelerant to filling this market need.
Tristan Handy: Let’s go back for a second to your car analogy. This has become canon for me: Ben Evans, who used to be at a16z, wrote a post about how you could predict the car, but what you couldn’t predict is Walmart and McDonald’s.
And when you start to look at the history of the car, you actually get into this: okay, first we have to figure out how to make cars. Great. And then yes, we need to build a lot more roads. But then [00:19:00] what suburbs were going to look like, or the fact that we all wanted a hamburger that was consistent across our entire 12-hour road trip,
no one would have guessed that was going to happen. And so I just use that as backdrop. I want to move the conversation into what stuff you think might get built on top of all of this tech that we’re building, because that’s what’s going to make it long-lasting. And I think ultimately we don’t know the answer to that, but I’ve got some hypotheses.
I wonder if you have any hypotheses.
[00:19:29] Do you have any hypotheses?
Martin Casado: Okay, so yeah, there’s a lot we can talk about. I actually try not to presuppose the future in many ways, because it tends to take these wild, fascinating shapes that we can’t predict, period. But I do think it’s worth saying something to tee up this conversation, which is: I’m not sure that what we’re creating with data isn’t almost an entirely different discipline.
I mean, the level of change may be that. And let me explain that really quickly, and then I would love to hear your [00:20:00] thoughts, and then we can noodle on it. But I just want to tee up one thing, which is: software is very much an engineering discipline, right? You live and die by abstraction and modularity, and you can actually rein in complexity by good programming practices.
You’re like, “I create this library, I create this interface or this API.” And so it’s very much an engineering project, where you have a subsystem you want to build, and you break it down into pieces, you rein in complexity that way, and then you go ahead and build it.
And it’s not clear to me that the very nature of the problems we’re tackling with data, at least some of them, are of the same order. In many ways, dealing with data is like dealing with the complexities of the universe. There’s no way to abstract it. It’s just fundamentally very complex, and we’re trying to look for answers.
I just want to give you an anecdote that happened very recently to me, just to give you a sense of this. I was speaking with one of the top AI people, who works in NLU, building large models, and [00:21:00] they’re of course looking for AGI, generalized artificial intelligence.
And he was telling me, “Hey, listen, I’ve been doing this, I did this at OpenAI, I’ve done this at Google, I’ve done this for a very long time. And what was really interesting is you could train these big, huge models, and you can train them to learn natural language, but what’s really weird is then you can take that same model and you can play a video game with it.
Like, I’m talking about moving pixels on the screen, not just listening to language.” Then he got kind of philosophical: maybe there are these fundamental rules of the universe that these models are learning about, and the universe isn’t that complex, it just requires a lot of computation to figure out.
So that was the general trend of the conversation. The point of this anecdote is not to say we can do AGI. The point is that listening to him wasn’t like listening to an engineer. It was like listening to a scientist-philosopher.
It sounded like talking to a physicist. And so I think that when we talk data, there’s data infrastructure. I’m an infrastructure type of dude, [00:22:00] right? I built software that scales. There’s the infrastructure for the data, but then there’s the data itself.
And once you enter that domain, I think we’re better leafing through a science fiction book than we are leafing through an analyst report. I really do feel like this unlocks almost the next level of system building. And so to tee up the conversation, and I’d love to hear your thoughts.
I just want to make the point that we could be on, we could be,
Tristan Handy: I’m just imagining it as like the next Gartner analyst.
Martin Casado: Exactly. I think we could be on the verge of like almost like a new discipline with a new type of person, etc. Yeah.
Tristan Handy: Okay. And that gets into one of my three thrusts and maybe it’s so much bigger than the other two.
I feel like the place that we sit at today is there’s still this big divide between the ML people and the analytics people. And, by and large today, you’re here talking to the analytics people. We probably have like friends in ML, but mostly we’re not like sitting on that side of the wall ourselves.
It [00:23:00] feels to me like, for the professional success of many of us here, we would all love to be able to go, maybe not all the way down, fully submerged into the data science ML world, but go in to a certain extent. And often it is actually the dichotomy of the technology stacks that keeps us out of there.
I feel like the bringing together of those two stacks is a major part of this trend that I want to see.
Martin Casado: Yeah, totally. I think the analyst and ML labels are basically where you came from, and you’re both going to the same place, I just do.
And so to go back to SDN, there’s maybe a bit of a learning journey on this. In these early days of the creation of an industry, there’s just so much chaos and everybody’s confused, and that’s actually a good measure of the health of something.
Almost, the more chaos, the healthier; the more entropy, the healthier, for sure.
It just suggests that they’re [00:24:00] growing and people don’t have time to optimize. As soon as people are really worried about the specific definition of this and the specific stack of that, they just have too much time on their hands.
They’re not running after, like, this. I think it’s absolutely the case. In these growth phases, people are just at a dead run. And so, for those of you that don’t know, Matt Bornstein, who works with me, and Jennifer Li, who also works with me, and I spent about a year just talking to practitioners who know way more than we do about the quote-unquote new data stack.
And listen, when I say new data stack, this isn’t some VC buzzwordy thing. It just turns out that people are building bigger systems and they’re using new technologies, and we just wanted to know what that was like. And so one of the big questions we wanted to answer, Tristan, was exactly the question you raise, which is: is there this big difference between analysts and ML folks, and what’s going on, etc.
[00:24:52] Is there like this big difference between analysts and ML folks and what’s going on?
Martin Casado: And we wrote this, and we tried to reconcile it all, but I’ve got to tell you my actual takeaway that we didn’t put in the report. [00:25:00] So you’re hearing it for the first time here. This is just a mess out there. Like, people have no clue. If you took a Venn diagram of all of the answers we got, you’d end up with sets that are, like, disjoint.
But I don’t mean to ramble on about this, because I just think it’s so germane to the question you asked. It is clear that the goals are very similar, and whether someone calls themselves ML or an analyst is because that’s where they came from. It really is about building systems that extract value out of data.
And so I honestly believe that this is all going to the same place, and the analysts have a career path that’s just as complicated, just as complex, just as deep, just as sophisticated. I just think that they just come from a different, and honestly, I hate to say it, but probably a more disciplined…
Tristan Handy: While we’re talking about the mess of technologies that people use and the null set of overlap (laughing), there have been, and I’m honestly not in this camp, but there are voices who talk about this as a problem. Things are too complicated, and [00:26:00] the kind of specter that gets invoked is, like, Oracle or whatever, these, like, behemoths who have acquired their way towards a way to do everything, beginning to end, in a single product.
And maybe that’s simpler from a buyer perspective, certainly. And maybe you can just attend all the same trainings and you know how to do the whole field. I dunno, I’m resistant to that. Do you think that’s a way to resolve this complexity?
[00:26:28] Do you think that’s a way to resolve this complexity?
Martin Casado: As I said, I think entropy is endemic in growing systems, full stop.
So either you’re part of a growing trend or you wait until it collapses. I don’t think there’s any way around it. And I want to be actually very clear about what I mean. Every once in a while, the industry, listen, the industry is like a big amoeba, in a sense. It sends pseudopods out, like, wherever.
And every once in a while it hits, like, this vein of food, and it’s going to go there and it’s going to mine and it’s going to grow. And when that happens, because our industry is so innovative, everybody focuses on, like, how do I build the best [00:27:00] solution here for whatever it is.
And so you get all of the chaos and all of the effort, and this is exactly how SDN was. It was being applied to things that it should never have been applied to, for sure. Like, people are like, I’m going to use SDN for building ad hoc networks for, like, phones, like when you’re hiking, and things like that. It just didn’t really apply. And you just have this massive chaos where you have these expansion phases, because the growth opportunity is in new areas and you’re exploring, and that’s where the value is. There’s a reason why markets value growth: it’s because that’s where value increases.
And so in any market that has a lot of expansion, if you’re interested in growing with the market and taking advantage of what’s new, you have to be part of this kind of chaotic landscape. Then what happens is markets slow down, but they’ve actually staffed up towards growth.
So you’ve got all of these people and you’ve got all of these products. That’s when you actually start to see consolidation. This is where you’re like, oh, there’s nothing new to do: tie two things together, improve the user experience, etc., etc. And I do [00:28:00] think that everybody listening and every organization has to decide where on this curve they are. Like, listen, I’m an investor, but also my entire life has been, like, basically category creation, bleeding-edge innovation.
So man, throw me in the deep end. Like, I want all the chaos. I want to be in the middle of the maelstrom. I want stuff not to work. I want stuff not to fit together. This is why I got into tech to begin with. On the other hand, there is a very practical stance here that uses blueprints that have emerged that are based on traditional technologies but flirt with the new things, where there are examples out there where people use it very successfully and there’s talent that you can hire into. And that’s a great way to understand these new technologies and use them where, like, maybe 20% is a new stack, but it’s based on an old stack.
And then listen, if you don’t care about being part of the wave, sure, you can use laggard technology and you can join the museum of computer science past, and we’ll visit you when we come through.
Tristan Handy: All right. So I have two more, and I want to leave a new platform [00:29:00] for application development till the end, because I know that we could run out the clock on that.
But the next one I will talk about is what I am in my head calling pervasive analytics. Not necessarily a, like, fundamental increase in capability, but an increase in connective tissue and an improvement in user experience. So the thing that I feel like is: how do we get an order of magnitude increase, within a given company, in their usage of whatever cloud data platform they’re using?
And I think that is less about what are the kind of core data workflows that you need to do. You’re probably doing them already. It is actually a question of, like, how do you get data into the hands of every knowledge worker in the organization and get their decisions to just be, like, not conducting science, in the same way that you’re not conducting science when you [00:30:00] look for recommendations on OpenTable.
Like, it’s just a part of the workflow. There are a lot of people thinking about this. I’m one of them, in my Substack. Everybody’s got a Substack today and I’m one of them. Do you look at companies who are trying to push this future?
[00:30:17] Do you look at companies who are trying to push this future?
Martin Casado: Yeah, for sure. So again, I like historical analogs. I strongly believe that the closest analog to data is software, and I think it’s that size too.
I think it’s arguably bigger because I do think it kind of straddles, like I said before, the engineering and then potentially the science too. And most of the people I would imagine that are listening to this have actually seen software come of age because it happened in the last 10 years.
Software was traditionally something that businesses would build, and then maybe you’d put it on a PC. And it wasn’t until maybe 2009 that we all became massive software consumers. I’ve probably, I don’t know, in the last couple of years spent $10,000 on software, 99 cents at a time, on my [00:31:00] iPhone.
And it’s a funny thing to mention that way, but the reality is we’ve matured everybody in the workforce as a software buyer, and that means that now you can have these, what you call prosumer apps, where you don’t have a sales force. They look like consumer apps, but they actually do real productivity and real work, and the users, most importantly, know how to buy them, download them, and use them effectively.
And I would say this is massive in kind of productivity gains in the entire industry. And I absolutely think that data’s going to go through the same renaissance, which is, and we forget just because it’s easy to forget, we’ve just all become software consumers. We really weren’t before.
If you think you were, like, you were buying your app, like your game for Windows 95, you weren’t really a software consumer. Now we’re software consumers; buying an app is something we do every single day. [00:32:00] I think what’s going to happen going forward is we’re all gonna become data consumers, because it is so relevant to our personal lives.
And it’s so relevant to every job function we do. But the industry really hasn’t caught up to provide us with kind of the tools in order to do that. And I’m using almost a consumer analogy, which I don’t think we have to, but I just think that, from the consumer to the prosumer to the worker to, of course, the back-end analyst, this is going to be part of our daily routine.
And we’re just at the very cusp of it. Another sign of the big growth in data.
Tristan Handy: I want that world to exist. And I think of the app store, the iPhone app store, too. When you go from, like, this high-level imagination of how a user experience could work down to, oh shit, what are all the things that we’re going to have to build in order to make that possible?
It’s probably not there yet, honestly. And it’s not just, like, compute and storage and moving the data around and all this stuff. It’s also, I mean, [00:33:00] I have never been, like, a senior person in an IT organization, but I can imagine that if I were, that would terrify me a little bit, because the common element that all of these things will need is, like, access to your data.
Oh, for sure. We’ve gotta make that safe. I assume that’s project number one if we want this world to come into existence.
Martin Casado: I don’t even think we understand what it means to, to build a company that’s entirely data focused. So absolutely like compliance is a big issue.
Security is a big issue, but there are also things like liveliness and correctness. And again, I think this is all a part of why this may even shift disciplines. If you have a piece of software and you press a button, you normally know what that button is going to do.
So you’ve got pretty deterministic output. When it comes to data and analyzing data, you’ve entered this realm of interpretation, which is, we want to expose a bunch of data and people make decisions based on that data. And we’re going to [00:34:00] improve the accuracy of that over time, but there’s just this kind of massive evolution on the tooling, and then also evolution in
[00:34:07] How we educate users on what that means?
Martin Casado: How we educate users on what that means. And so not only is it, to your point, systems and regulatory compliance, it’s also a massive kind of education and just changing the way that we think. And so this is why this is a trend that I think will last on the order of decades, not…
Tristan Handy: That was gonna be my question. Like, this doesn’t seem like, oh, five more years and we’ll be there. To a certain extent, maybe this actually feels a little bit harder than the software.
Martin Casado: But here’s the good news. Here’s the good news. Okay. So listen, here’s why everybody here should feel very confident that, listen, if you choose to do this for the next 50 years, you’re in the right place.
So let’s say, Tristan, you and I, in a few years, are like, oh, we really care about climate science and we’re gonna help fund something that is going to solve, like, a lot of these things. It’s very much a data problem. And so there’s a company that pops up, or maybe it’s a nonprofit [00:35:00] effort, that really focuses on predicting what are the key contributors to climate science. Critical, great predictive model, very much a data thing.
It’s something that we work on. A very different discipline. A lot of folks will be involved. Let’s say you come up with an answer, and that fundamentally changes how we think about climate science. You know what we’ll do? We’ll be like, oh, that’s great.
That’s one app. Then we’ll go off and go after, I don’t know, like, water pollution. We’ll go after something. That’s literally one app. The things that you can apply data to is basically the natural world.
And so I think, yes, it’s true, there’s going to be this massive kind of shift in how we think about data. And then I think we’re just going to start applying it to these broad things, and we can tackle huge problems and then move on to the next. Yes, exactly to your point, I do think that this is a multi-decade thing.
Tristan Handy: My last question for you was actually going to be, what’s your takeaway for all the people who are here today? And I think that you just gave the answer to that [00:36:00] question: the career path. Everybody defines career ladders for their analytics organization.
And what you won’t see is: get promoted, become a founder, tackle climate change. But I have personally found that my data skills have been incredibly relevant to thinking from first principles as you navigate all the uncertainty involved in starting a company. So I actually don’t think that feels really normal and natural to me, not that I’m necessarily inviting 10,000 people to go pitch you tomorrow.
But I think that, that’s the thing that will happen.
Martin Casado: Yeah. I think the most important thing to understand for people that are listening to this, that aren’t in my position, and it’s hard to actually get a broad view in the day to day. What’s funny is, listen, I did a bunch of tactical work.
I founded, I founded two companies. I was an executive in a very large company. Like, when I left, it was a $600 million run rate product. I’ve done all of those things, and I didn’t really understand how the industry worked until I had this position, because you see so many companies.
And [00:37:00] so in some ways it’s almost like sampling the industry. And two big takeaways. The first one, and I’ve said it before, but it’s worth reiterating, is that the biggest growth area right now in all of the industry is data, for sure. And it’s not because it’s buzzy. It’s not because the VCs created this.
No, it’s because there’s just a fundamental need, and people are smart and they address needs, and the need is the market. And it’s what I said before. It’s because today, if you’re building any system in tech, the way you differentiate is through data. That’s how you differentiate. Everything else is relatively mature.
It’s hard, but it’s an engineering problem. And so just know that if you’re in data, this is great. And then the second thing that I think is important: I would realize that there isn’t this schism. There’s almost this false schism we’ve created between, like, analysts and ML and this and that. We’re really all heading to the same place, which is, like, how do you build a discipline and an industry around data that’s the size of software? Probably much bigger, because it’s going to shift. And so no matter where you are, I think you continue to focus on that. And yeah, [00:38:00] absolutely. Listen, if you’re quite interested in doing something very big, my email is open, as they say.
Last modified on: Apr 19, 2022