How Flipside Crypto built a consumer-facing blockchain data product with dbt Cloud

This is the story of how Flipside Crypto used dbt Cloud to enable a collaborative community of thousands of data analysts

3,500 models running every 15 minutes

45,000 data analysts contributing to dbt models

92% reduction in weekly bugs with the help of automated dbt testing

Self-serve data and analytics on cryptocurrencies

Flipside Crypto was founded in 2017 to provide analytics and business intelligence to crypto organizations. Their product—a crypto data platform—provides updated data every 15 minutes on 17 blockchains, including Ethereum and Solana.

Users can—in turn—freely access, model, and visualize data directly on Flipside Crypto. Their website has 100,000 community-powered dashboards analyzing diverse crypto information, such as active users of individual blockchains and valuations of NFTs. 

Such a data-centric product comes with its challenges. In order to be a successful business, Flipside Crypto must maintain and model trillions of rows of blockchain data, while also providing an easy-to-use product to thousands of self-serve data analysts.

Diverse data consumers

The company has different data consumers and—alongside its free platform—unique monetization models. Flipside Crypto’s data must cater to and support distinct stakeholders.

Empowering a thriving community of crypto data analysts

Today, the biggest group of data consumers is Flipside Crypto’s thriving community of crypto analysts, crypto businesses, and data analysts who want to learn more about crypto. 

“We have around 15 thousand members in our Discord server, and many more analysts contributing to our project,” shared Jason Mission, Head of Analytics of Flipside Crypto. “They can query our Snowflake account for free using our web app, do analysis, and build dashboards.”

Providing granular crypto data to enterprises

Flipside Crypto also enables large enterprises—from investment firms to financial institutions—to access and ingest their data, without needing to do any additional ETL.

“All these blockchain pipelines we’re building are very difficult. You need a large team and deep subject matter expertise,” said James. “If you have a data scientist, but don’t want to hire five data engineers, we can feed this data directly from our Snowflake into your account.”

The many obstacles to ingesting, modeling, and maintaining blockchain data

Trillions of rows, hundreds of models, tens of unique blockchain protocols

Flipside Crypto supports data from 17 different blockchain protocols. Since each protocol has its own data structure and related products (such as ETFs), the data is complex and vast.

“We’re essentially 17 data teams in one,” said James. “We have tables that are hundreds of billions of rows. For example, our Solana data (a blockchain protocol), by itself, is expected to hit a trillion rows in five years.”

Before this data can be exposed to self-serve end users, these trillions of rows need to be cleaned and documented.

A high data quality bar for financial data

Flipside Crypto’s business model relies on providing governed and accurate data to different stakeholders: from individuals to enterprises. That alone puts data quality at the top of the company’s priority list. But the data quality bar rises even higher when it comes to data used for financial decision-making.

“If we have bad data, no one’s going to use our product,” emphasized James.

To deliver on its promise, Flipside must have a robust test system, with almost 13,000 tests, to guarantee the correctness and availability of the data, as well as a structure that facilitates troubleshooting.

Tackling the blockchain data challenges with dbt Cloud

Modeling, cleaning, and maintaining a large data volume

Before the trillions of rows of blockchain data handled by Flipside can be analyzed by end users, they must first be transformed. It’s through robust data modeling that the company provides valuable, accessible, and up-to-date data for consumers.

“In order to maintain all this data, we need to run close to 500 to 700 models every 15 minutes,” said James. “These models are written on dbt to clean and curate the data.” 

“Before dbt Cloud, with the volume of blockchains and models we have, our data development was highly custom, manual, and very difficult to maintain. Now we can copy code from different blockchains and make it fit within a couple of weeks.”

Since each protocol has its own data pipelines, a well-defined structure for data modeling is necessary for the company’s success.

Austin Blackberby, Data Analytics Manager at Flipside Crypto, explained Flipside’s approach:

“We divide our data into three levels: bronze, silver, and gold. Bronze is our source data. Silver is where the meat of the transformation and data testing happens. In gold, you can find the views we eventually expose to end users.”
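The three-level layout Austin describes can be sketched in a few lines of Python. This is only an illustration of the idea, not Flipside's actual dbt code: the row fields (`tx_id`, `block_time`, `amount`) and the function names `build_silver` and `build_gold` are hypothetical, and in-memory lists stand in for warehouse tables.

```python
from collections import defaultdict

# Hypothetical raw ("bronze") rows; the field names are illustrative only.
BRONZE = [
    {"tx_id": "a1", "block_time": "2023-01-01T00:00:00", "amount": "1.5"},
    {"tx_id": "a1", "block_time": "2023-01-01T00:00:00", "amount": "1.5"},  # duplicate
    {"tx_id": None, "block_time": "2023-01-01T00:05:00", "amount": "2.0"},  # missing key
    {"tx_id": "b2", "block_time": "2023-01-01T00:10:00", "amount": "3.0"},
]


def build_silver(bronze_rows):
    """Silver: drop rows without a key, deduplicate, and cast types."""
    seen = set()
    silver_rows = []
    for row in bronze_rows:
        tx_id = row["tx_id"]
        if tx_id is None or tx_id in seen:
            continue
        seen.add(tx_id)
        silver_rows.append(
            {
                "tx_id": tx_id,
                "block_date": row["block_time"][:10],  # keep the date part only
                "amount": float(row["amount"]),
            }
        )
    return silver_rows


def build_gold(silver_rows):
    """Gold: an analyst-facing view, here the total amount per day."""
    totals = defaultdict(float)
    for row in silver_rows:
        totals[row["block_date"]] += row["amount"]
    return dict(totals)


silver = build_silver(BRONZE)
gold = build_gold(silver)
print(gold)
```

Keeping the cleaning logic in one middle layer means the views exposed to end users stay thin, and a data problem can be traced to a single place.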

These models—written in dbt Cloud—are then documented on Flipside’s dbt Docs webpage.

Providing documentation and opening up to external collaborators

The data models Flipside Crypto creates are publicly available on dbt Docs. Accessing these repositories and reading the documentation are the first steps users take before any analysis. 

The documentation gives a full picture of the data: which columns are available and what they mean, what each table depends on, and where metrics are mentioned, along with other crucial information such as lineage.

Armed with this documentation, users can also create dbt pull requests, helping Flipside fix bugs and publish new tests.

“We have super users who request certain protocols, but we don’t have time to get to all of them,” said Austin. “With dbt, they can do the modeling, write tests, and create a pull request. This set-up has enabled external analysts to contribute directly.”

For Flipside, dbt Docs functions both as the backbone of their data product and as the tool that enables their data to continuously improve.

Empowering crypto expert analysts with SQL support

Blockchain knowledge is scarce, and for Flipside one of the most knowledgeable sources on the topic is its community of 15,000 members. The company must provide a data setup that enables these crypto experts to build more, faster. By investing in SQL, it achieved that:

“We’ve built tooling where, within dbt Cloud, an analyst can do web 3.0 analytics on their own without needing the help of a platform or analytics engineer,” explained James.

“With our technology, users don’t need to write Python code or set up job schedulers to get the endpoint,” said Austin. “We bypass all of that and allow users to connect directly to the blockchain with SQL and dbt. It’s so simple for end users.”

“We couldn’t dream of this a year ago. Now we can post any type of request without any involvement from our analytics engineering team. We’ve enabled our ecosystem, all 15,000+ people, to build more, new, cool stuff faster.”

Improving data quality with automated testing and lineage

High data quality is a top priority for Flipside. dbt Cloud’s lineage graph and automated testing have helped the company enforce a high standard.

The lineage feature provides a holistic view of all data pipelines. This is particularly useful when the data is big and complex, like Flipside’s. By understanding dependencies, the team can easily identify the root cause of breaks and bugs.
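To make the root-cause idea concrete, here is a minimal Python sketch of walking a dependency graph. The model names and the `PARENTS` mapping are hypothetical and do not come from Flipside's project; dbt derives the real graph from the references between models.

```python
from collections import deque

# Hypothetical lineage: each model maps to the models it reads from.
PARENTS = {
    "gold.daily_transfers": ["silver.transfers"],
    "gold.nft_sales": ["silver.nft_events", "silver.transfers"],
    "silver.transfers": ["bronze.raw_transactions"],
    "silver.nft_events": ["bronze.raw_logs"],
    "bronze.raw_transactions": [],
    "bronze.raw_logs": [],
}


def upstream(model, parents):
    """Return every ancestor of `model`, nearest first (breadth-first)."""
    ordered = []
    seen = set()
    queue = deque(parents.get(model, []))
    while queue:
        node = queue.popleft()
        if node in seen:
            continue
        seen.add(node)
        ordered.append(node)
        queue.extend(parents.get(node, []))
    return ordered


def root_cause_candidates(broken_models, parents):
    """Ancestors shared by every broken model: the likeliest places to look."""
    shared = None
    for model in broken_models:
        ancestors = set(upstream(model, parents))
        shared = ancestors if shared is None else shared & ancestors
    return sorted(shared or [])


candidates = root_cause_candidates(
    ["gold.daily_transfers", "gold.nft_sales"], PARENTS
)
print(candidates)
```

When two gold views break at the same time, intersecting their ancestors narrows the search to the models they have in common.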

While data lineage assists the team in fixing pipeline breaks faster, dbt Cloud’s automated testing has helped them avoid the majority of bugs altogether.

“Before dbt, we used to have so many bugs. The crypto analyst community would call out bugs at all times,” said James. “Today, we run over 10 thousand tests on a daily basis across our models. When you focus on quality and invest in implementing comprehensive tests, you find the issues before you even release anything.”
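dbt ships with generic tests such as `unique` and `not_null`, which are declared per column and run against the warehouse. The Python sketch below only imitates their logic on in-memory rows to show what such a test checks. The sample rows and column names are hypothetical, and this is not how dbt implements the tests internally.

```python
from collections import Counter


def not_null(rows, column):
    """Return the rows where `column` is missing; an empty list means the test passes."""
    return [row for row in rows if row.get(column) is None]


def unique(rows, column):
    """Return the values of `column` that occur more than once (nulls are ignored)."""
    counts = Counter(
        row[column] for row in rows if row.get(column) is not None
    )
    return sorted(value for value, n in counts.items() if n > 1)


# Hypothetical model output with one duplicate key and one missing key.
rows = [
    {"tx_id": "a1", "amount": 1.5},
    {"tx_id": "a1", "amount": 1.5},
    {"tx_id": None, "amount": 2.0},
    {"tx_id": "b2", "amount": 3.0},
]

failures = {
    "not_null_tx_id": not_null(rows, "tx_id"),
    "unique_tx_id": unique(rows, "tx_id"),
}
passed = all(not found for found in failures.values())
print(passed, failures)
```

Each test returns the offending records rather than a bare pass or fail, which makes it easier to see what went wrong before a release.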

As a result, bug reports have gone down from over 50 a week to fewer than 4.

“This investment in testing has increased our velocity significantly because we’re not redoing work or having to fix previous releases. We can instead focus on building new pipelines, dashboards, and data products.”
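The headline 92% figure is consistent with the weekly bug counts quoted above. A quick check, assuming the round numbers of 50 and 4 bugs a week (the source says "over 50" and under 4, so the true reduction is at least this large):

```python
def percent_reduction(before, after):
    """Percentage drop from `before` to `after`."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) * 100 / before


# Assumed round figures taken from the weekly bug counts in the text.
reduction = percent_reduction(50, 4)
print(reduction)
```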

More focus on building, less focus on maintenance

The new data engineering workflow Flipside Crypto set up with dbt has made data modeling easier, reduced the number of bugs, and improved collaboration with external stakeholders. In turn, this has freed up time for the team and community members to focus on more valuable activities.

“dbt Cloud allows us to focus on building versus focusing on maintenance,” said James. “As someone who essentially created a dbt-like product at different companies, I know how much effort it takes to build and maintain. I’d rather have my team focusing on building data pipelines and developing new models than being absorbed by infrastructure.”

Continuing automation and expanding customer participation

Decreasing latency with further automation

Moving forward, Flipside Crypto will continue developing automation flows to get valuable data into the hands of analysts faster, while simultaneously freeing the team’s time to focus on building new things.

“We’re looking into how we can use dbt to set listeners, where if a certain action happens, then it triggers following actions which develop a complete streaming pipeline,” explained James.

Expanding customer participation with easy-to-access data

Flipside’s ability to enable analysts to connect directly to the blockchain with dbt and SQL has been a success for the company. It is now exploring other ways to let users build without Python or engineering resources.

“We’re leaning more and more into user-defined functions,” shared Austin. “We’re exploring ways we can open up access to users.”