June 11, 2021 update: This post hasn't aged as well as I would've hoped. George makes a strong statement in it: "Fivetran is not fundamentally a company that is focused on data transformation." That statement appears to be less true today than it was in October of 2020. In a recent conversation, George indicated to me that data transformation had become a core part of the Fivetran strategy. And as the company moves towards the GA release of Fivetran Transformations, it's become increasingly hard for the two companies to continue what was once an extremely close commercial relationship.
While we continue to be excited about Fivetran's adoption of dbt as a standard for data transformation, the relationship between the two companies is a bit more like the relationship between Snowflake and AWS: sometimes allies, sometimes competitors. And that's ok: these kinds of evolutions happen in fast-moving markets like the one we're in.
We continue to have a lot of respect for Fivetran (and use their data ingestion product internally!). Many Fivetran employees make fantastic contributions to the dbt Community and the company is working on some tooling contributions that it hopes to release as open source software soon. All of this is fantastic.
As the ecosystem evolves, I do want to be extremely clear about our own stance: we believe that dbt Cloud is by far the most robust tool on the market for both authoring and productionizing dbt workloads. And as we continue to ramp our product and engineering velocity, that gap will only increase. This shouldn't be surprising: we've been working on this problem for five years now and it's been our sole focus that entire time. It will continue to be our sole focus moving forwards.
Original post: Many folks may not know it, but George (Fivetran's Co-Founder and CEO) and I go way back. We started out as mortal enemies when I was helping to launch Stitch while Fivetran was in its still-nascent days back in 2015. Since I founded Fishtown Analytics in 2016, though, we've been close partners and have chatted several times a year. I still remember the days when the Fivetran office was so small that I couldn't avoid also getting Taylor in frame when we were Zooming! And of course, as of April 2020 we're now both A16Z portfolio companies and share a board member.
We figured with Fivetran's big launch of its dbt integration we'd do a bit of an unusual blog post: a Slack conversation, rather than a monologue. The topic? Why we both believe that Fivetran's launch of a dbt scheduler is a great thing for the long-term health of the dbt ecosystem and why we, Fishtown Analytics, are excited about it.
Let's dive in
💬
George: Thanks for inviting me onto the dbt blog to talk about our new product that is directly competitive with Fishtown's commercial software offering! What a wacky world open source is!
Tristan: Hah! My pleasure. I love you taking it right there because it's something that's on a lot of folks' minds. I know that you and I don't feel that way at all, but there was a great thread in dbt Slack recently that voiced exactly this sentiment. I'm sure there are others out there thinking the same thing.
George: Yeah, for sure. And it's a legitimate thing to wonder about. So let me start by saying that Fivetran is not fundamentally a company that is focused on data transformation. Our mission is to make access to data as simple and reliable as electricity. We provide the bottom layer of the modern data stack. The vast majority of our product innovation focuses on growing our number of connectors, forever increasing their reliability, and forever decreasing their latency. The question "how do I get my data from point A to point B, reliably, quickly, and with as little effort as possible?" is a shockingly large problem and we plan to continue to be laser focused on it for a very long time.
Tristan: I love that. And just to be clear, it's the existence of tools like Fivetran that enabled us to build dbt in the first place. If Fivetran didn't exist, the utility of dbt would be far lower.
We're incredibly aligned on the fundamental ETL»ELT transformation that's going on in the market: customers should load their data into a high-performance, cloud-based datastore in its most granular form and then transform it using the highly scalable resources of that datastore. It's this disaggregation of the extraction and loading from the transformation that has always enabled our products to be such strong complements.
George: Absolutely. But this really starts getting at the heart of why it was important to us to have our own native dbt integration: many companies haven't made this shift yet. We talk to companies every day who currently use a single vendor for the E, the T, and the L. If Fivetran didn't offer our own data transformation functionality, we'd be at a disadvantage when selling to companies who look at the world in this way.
The way we see it, officially adopting dbt as a standard for data transformation enables us to leverage the strength of (and invest in!) the product and the open source community instead of attempting to compete with it. We truly are making a bet on the long-term future of both dbt as a product and as a community. And of course, that also implies a strong forward-looking bet on Fishtown Analytics as the maintainer of both.
Tristan: Aw shucks 😳
...no, seriously, I appreciate that! Fivetran has made a meaningful commitment and I see a tremendous amount of potential in pushing the modern data stack towards an open source standard for expressing data transformation workloads. Analysts spend literally years of their lives authoring dbt code, and making sure that they can take these skills with them to future jobs is hugely valuable for the community.
We're especially excited about the work that y'all have done with the dbt packages that you've built for Fivetran connectors. 24 distinct packages today! That's awesome.
George: Yeah! You may not realize this, but we've spent way more human hours building out our open source dbt packages than we've spent building our native dbt job scheduler. We now have two full-timers dedicated to this effort, so expect to see more and more coverage in coming months.
Tristan: Nice. Is the goal for there to be a package for every connector?
George: Yes, for every connector that delivers a predictable schema. We consider these dbt packages to be the "second layer" of Fivetran, which will ultimately be just as valuable as the connectors.
Tristan: Love that.
Taking a turn: it's great that we're both so positive on dbt and its community, but how do you see the commercial relationship evolving between Fivetran and dbt Cloud?
George: It's important for Fivetran to offer a great transformation solution to our users "out of the box" when they set up Fivetran, and dbt orchestration is going to be an important part of our product that we will continue to develop over time. However, we expect that dbt Cloud will always be the premier dbt experience, and we're perfectly happy to see customers start with Fivetran dbt Transformation and later upgrade to dbt Cloud. And of course, we're also very happy to see our customers that need the more advanced features of dbt Cloud go straight there if that's what makes the most sense for them.
Tristan: That 100% makes sense, and I appreciate your willingness to go on the record saying that. It certainly makes this collaboration so much more straightforward.
Can we talk for a sec about where you think the ecosystem is going? I think you and I both believe that over the next several years there will likely be a bunch of companies who have some versions of dbt job scheduling incorporated into their products. Is that right?
George: Absolutely. dbt is complementary to each layer of the modern data stack and it's hard to imagine that some of the cloud providers and other ecosystem players won't offer some level of dbt functionality inside their own products. We won't be the only ones to see the value in doing this.
I think this kind of story has played out in very negative, value-destroying ways in the past (think: Cloudera/Hortonworks, MongoDB/DocumentDB). IMO we have an opportunity to get it right this time in this ecosystem, to make collaboration between vendors actually win-win.
Tristan: We actually thought about this a lot in the early days. Our goal had always been to build dbt into an open source standard for how data transformation workloads were expressed, and so we fully anticipated the question of "How do you compete with other vendors hosting your open source product?"
Other open source products have responded to this challenge by migrating to non-OSS licenses. We considered doing the same, but it never sat right with us. Instead, we opted for a different approach. We decided that dbt Cloud does (and charges for) two things:
- Harden the platform
dbt Cloud gives companies of all sizes access to an operational environment for dbt Core that would be hard to replicate themselves. This means: distributed, fault-tolerant, well-monitored, and highly reliable. These characteristics of a system are costly and hard to achieve, and when you're running critical workflows they're extremely valuable. We back up these characteristics of the platform with guaranteed SLAs.
- Innovate on top of the platform
dbt Cloud provides brand new user interfaces that make dbt both more accessible and more powerful. This includes development experiences like the dbt Cloud IDE, slim CI, and a soon-to-be-launched metadata API we're calling Codex. These features are all built on top of and leverage dbt Core, but they greatly extend its reach and usefulness.
#1 is a fairly standard approach for OSS maintainers, but #2 is much less common. We feel that this "innovate on top of the platform" strategy has the potential to create a lot more value in the ecosystem and to align interests of many vendors more closely. It's why we fundamentally see Fivetran's launch here as a good thing: ultimately, we aren't trying to sell a hosted dbt scheduler... we're trying to sell brand new user experiences that are additive to dbt.
George: In fact, we're already brainstorming on ways that Fivetran and dbt Cloud could be directly integrated, further strengthening this "value added" story!
Tristan: Hah, yes! Too early to share more now, but this is an area that I know that we both want to spend more time on :D
Just to complete the story: we're very early on in the product lifecycle of dbt Cloud overall. We had a grand total of four engineers at the start of 2020! But we're growing the team quickly: we're at 13 engineers today and are on pace to have more than 30 by EOY 2021. You'll start to see more and more rapid launches from us in the coming months.
George: You obviously know your product way better than I do, so I wonder if you could do a quick brain dump. Let's say I'm evaluating the two products against each other today. What are the differences in capabilities?
Tristan: Yeah! Give me a minute on this and I'll ping you on Slack when I'm done...
Ok... that's what I have. I messaged some folks on your team to make sure I had the right info in there so I think we should be good to go. I hope you don't feel like I'm piling on...!
George: Hah, no, that's fair, that was exactly the point I was trying to make. I think this chart speaks louder than my high-level assertions from before: Fivetran is primarily focused on data movement, not data transformation. We're incredibly excited about dbt as a solution to data transformation for our customers specifically because it allows us to focus on what we're already good at and leverage all of the work done by dbt and the dbt community as an accelerator.
Will we build some of the things in your spreadsheet over time? Absolutely. Is it a priority for us to close this gap? Not at all.
Last thing I'll say before we can put this already-long post out there into the world. Something I'm always telling folks is that the size of the ecosystem we all operate in today is absolutely tiny relative to the size that it will be in, say, ten years. There is just so much data at rest that needs to be moved. There is so much computation that needs to happen on top of it to make it useful and actionable. Commercially, the dollars today just pale in comparison to the dollars in the future as all of us in the ecosystem grow the pie together. You can see this promise for future growth in the explosive multiple Snowflake commanded at its IPO: everyone realizes there is a tremendous amount of demand for what we're jointly building.
Getting lost in "what features product A has vs. product B" is just not really the point where we sit today. This release for us was about a long-term alignment with the dbt community.
Tristan: Love it. Thanks for taking the time to hang out!
Last modified on: Apr 25, 2022