Inventa Builds Confidence in Data Through the dbt Semantic Layer
This is the story of how Inventa uses dbt Cloud to boost accuracy and domain ownership of its data
in data maintenance time
to data models, increased from 2
centralized and implemented in the dbt Semantic Layer
A booming business
Based out of Sao Paulo, Brazil, Inventa is one of South America’s fastest-growing prospects. The company functions as a B2B marketplace designed to quickly and efficiently connect countless small businesses with major suppliers, allowing them to stock up on everything from snack foods to cosmetics.
Although other businesses with similar portfolios operate in North America and Europe, Inventa offers a service tailored to meet the unique demands of the South American marketplace.
“In Brazil, it’s not easy to have a credit line as a small business,” explained Gabriel Marinho, Lead Analytics Engineer at Inventa. “They also have low leverage with suppliers, so they pay huge shipping fees. They can’t get discounts, and oftentimes the minimum order suppliers require is a lot more than a small business can, or needs to, buy.”
Many businesses therefore don’t use traditional purchasing software packages. Instead, they work through messaging programs and place orders by communicating directly with sellers, a painful process of bouncing between WhatsApp messages and PDF catalogs. Inventa gives this large population of small businesses and suppliers a safer, easier platform for completing these transactions.
The need for good data
Connecting a high number of small clients and customers means Inventa must handle a lot of information. As the business has grown in both reach and ambition, so has its need for reliable data.
“For us, it’s almost everything,” said Gabriel. “We want to offer credits to people without a bank history, and it’s not easy to give them a credit line without having a lot of data. So we have our own credit risk model.
“It allows us to evaluate credit applications and empowers our product to provide a better solution and help people improve their access to cash flow.”
On the other side, for suppliers that sell on Inventa, the team needs to provide them with visibility of how their store performs on the platform.
“With in-depth reporting, we can help suppliers that are struggling or want to have better results,” shared Gabriel. “Data assists them in identifying how to improve their products and increase sales.”
An MVP system to start
When Inventa began operations, the company used a Postgres database directly managed by Gabriel.
“At first,” he explained, “we were mainly sourcing free or cheap ways to manage our data. So when I got to the company, my first job was to list the cheapest way we could build some sort of data stack.”
As the information Inventa was handling started to mount, it quickly became apparent that this low-budget approach would start holding the company back and restrict its ability to grow. The data team spent too much time on maintenance and patching systems together rather than delivering value to the rest of the business.
“I was duct-taping our platform together so it could cope with our growth,” Gabriel said. “This was a problem. Once we received investment funds, we decided we should start fresh—not only for the data team but for the company, too.” Gabriel and his team started evaluating solutions and soon encountered dbt.
“We wanted to use SQL,” he said. “That’s somewhere that dbt had no competitors.”
dbt Cloud: investing in data the right way
After experiencing the difficulties of working with an unstable software stack, Inventa’s data team sought to ensure that their next setup was as reliable and reputable as possible.
“We didn’t want to go to a new tool that was not mature enough and would potentially leave us needing to migrate again or change our workflow in the next few years,” Gabriel explained. “We didn’t need to be cutting costs for data anymore because we knew data would be the pillar of the business for the next two or three years of development.”
Another of dbt’s major draws was the wealth of information and documentation associated with the solution. With a rapidly growing team, it was crucial to have easy access to resources that detailed best practices and helped the company upskill.
“One of the reasons we went with dbt Cloud was so the rest of the team could try to build machine learning solutions, build better dashboards and BI systems for stakeholders, and create new models themselves,” he explained.
Reducing maintenance time and costs
While gaining experience with the fine details of databases in Inventa’s early months had its benefits, keeping everything online and running as expected was time-consuming and frustrating. Once the company switched to dbt Cloud, Gabriel and his team had a lot more time on their hands—time that could be spent developing the business rather than fighting fires.
“Before, maintenance was taking up 80% of my time,” he said. “Now, it’s almost nothing.”
One of the big reasons for this is that Inventa’s data stack now works smoothly. Gabriel noted that when something is failing in dbt, “99% of the time, it’s a problem with the source data rather than the system.” Before, it could have been a problem in the pipeline, an accessibility issue, or one of the dozens of other potential errors.
“Finding out the cause,” he noted, “was a lot of work. A lot of frustrating, time-consuming work.”
With dbt, Inventa’s data pipelines are much easier to understand, assess, and maintain thanks to dbt’s reliability and clear data lineage.
“We now have a huge amount of documentation,” he explained. “With dbt, we can add Git pre-commit hooks and use the project evaluator to enforce the CI/CD for documentation.
“I’m setting the guidelines, but I’m also accountable to keep following them rather than simply pushing code without documenting it.”
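The documentation rule Gabriel describes can be pictured as a check that fails whenever a model has no description. What follows is a minimal Python sketch of that idea only. It is not Inventa's actual hook and it does not use the dbt project evaluator; the model names and the `find_undocumented` helper are hypothetical.

```python
from typing import Dict, List


def find_undocumented(models: Dict[str, str]) -> List[str]:
    """Return the names of models whose description is missing or blank."""
    return sorted(
        name for name, description in models.items()
        if not description or not description.strip()
    )


# Hypothetical model names and descriptions, for illustration only.
models = {
    "orders": "One row per order placed on the marketplace.",
    "suppliers": "",
    "credit_applications": "   ",
}

undocumented = find_undocumented(models)

# A pre-commit hook or CI job would exit non-zero when this list is not empty,
# which blocks the commit or the merge until the descriptions are written.
exit_code = 1 if undocumented else 0
```

Run on every commit, a check of this kind applies the same rule to whoever set the guidelines as to everyone else on the team.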
Another significant advantage of moving to dbt Cloud is the ease of testing, which Inventa struggled with in its early days. According to Gabriel, the company had previously used DynamoDB, which was “not the easiest way to extract data.”
“We used to have a lot of bad data,” he explained. “With dbt, I can easily test the columns that I care about, and when we have an actual problem, we can quickly flag the issues to the engineering team.”
With Inventa’s previous data stack, the engineering team’s first indication that something had gone wrong was a broken production pipeline. Now, the team is alerted the moment a test fails, even if the failure does not break the data pipeline.
“Things are way more robust now,” he said. “This helps the product team avoid any issues reaching customers and helps the data team identify any issues during ingestion.”
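In dbt, column checks such as `not_null` and `unique` are declared alongside the models. The Python sketch below only illustrates the underlying idea of testing the columns that matter and raising an alert without halting the pipeline. The rows, column names, and helper functions are hypothetical, not taken from Inventa's project.

```python
from typing import Dict, List, Optional

Row = Dict[str, Optional[str]]


def not_null_failures(rows: List[Row], column: str) -> List[int]:
    """Return the indexes of rows where `column` is missing or None."""
    return [i for i, row in enumerate(rows) if row.get(column) is None]


def unique_failures(rows: List[Row], column: str) -> List[str]:
    """Return the non-null values that appear more than once in `column`."""
    seen, duplicates = set(), set()
    for row in rows:
        value = row.get(column)
        if value is None:
            continue
        if value in seen:
            duplicates.add(value)
        seen.add(value)
    return sorted(duplicates)


# Hypothetical order rows, for illustration only.
orders = [
    {"order_id": "A1", "supplier_id": "S1"},
    {"order_id": "A2", "supplier_id": None},
    {"order_id": "A2", "supplier_id": "S2"},
]

# Failures are collected as alerts instead of raising an exception,
# so a failed test is reported while the rest of the run continues.
alerts = []
if not_null_failures(orders, "supplier_id"):
    alerts.append("supplier_id has null values")
if unique_failures(orders, "order_id"):
    alerts.append("order_id has duplicate values")
```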
Automated reporting to suppliers with the dbt Semantic Layer
The dbt Semantic Layer enables teams to create standardized metrics that return the same consistent and precise data across tools.
Inventa was already using this feature to power their business analytics—it ensured different teams were looking at the same calculation of metrics, such as supply revenue:
“When you put everything on dbt, you ensure everyone is seeing the same number,” said Gabriel. “You don’t get that message saying, ‘oh, my director got this GMV number and I’m getting this different one.’”
Supplier analytics, which customer interviews had shown to be crucial for overall supply and demand performance, was one of the data products powered by the dbt Semantic Layer.
“We had a lot of MVPs that needed to be worked on and supplier analytics was no exception,” shared Daniel. “It was a manual, error-prone process. We imported a CSV file from a Hex dashboard locally, turned those results into different PDFs, and then uploaded them to Google Drive to send them to suppliers.
“It would sometimes take an entire day of work for our business analysts to generate these reports,” winced Daniel.
Once built out, the dbt Semantic Layer powered all of Inventa’s internal dashboards. Next, the team saw an opportunity to use the feature externally as well:
“We realized the dbt Semantic Layer could take the same metric we used for business analytics and reuse it for supplier reporting,” shared Gabriel.
Inventa used the dbt Semantic Layer to power automated Hex reports for suppliers. Each supplier had a dedicated dashboard, filtered by that supplier’s unique ID. The data team could now drop the manual process that had been taking a full day of work.
“The initial goal was to get this day back. We succeeded, and the feedback from the suppliers has also been really positive,” said Gabriel. “We can now see which suppliers and accounts are looking at the data, and have our account managers reach out to them proactively to help them succeed.”
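The reuse described above, one metric definition serving both internal dashboards and per-supplier reports, can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the dbt Semantic Layer API: the order records, the field names, and the definition of revenue as the total of completed orders are all hypothetical.

```python
from typing import Dict, List, Optional

# Hypothetical order records; the field names are assumptions for illustration.
ORDERS: List[Dict[str, object]] = [
    {"supplier_id": "S1", "amount": 120.0, "status": "completed"},
    {"supplier_id": "S1", "amount": 80.0, "status": "cancelled"},
    {"supplier_id": "S2", "amount": 200.0, "status": "completed"},
]


def supplier_revenue(
    orders: List[Dict[str, object]], supplier_id: Optional[str] = None
) -> float:
    """The single shared metric definition: total amount of completed orders.

    Passing supplier_id narrows the same calculation to one supplier.
    """
    return sum(
        (
            float(order["amount"])
            for order in orders
            if order["status"] == "completed"
            and (supplier_id is None or order["supplier_id"] == supplier_id)
        ),
        0.0,
    )


# Internal dashboard: the metric across all suppliers.
internal_total = supplier_revenue(ORDERS)

# External report: the same metric, filtered by one supplier's ID.
supplier_report = supplier_revenue(ORDERS, supplier_id="S1")
```

Because both numbers come from one function, a change to the definition reaches the internal and the supplier-facing views at the same time.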
Distributing ownership through collaboration
One of the less apparent benefits of dbt Cloud came through its impact on the broader data team. Inventa’s previous data stack had been too complex for most of them to handle easily, but the relative simplicity of dbt allowed for much more collaboration—and, with that, ownership.
“Before, I was the only person doing the modeling,” explained Gabriel. “We had five or six data scientists, but they didn’t know how to build the pipeline, integrate with Airflow, or deploy a Lambda function to AWS directly.
“With dbt Cloud, however, they don’t need to know any of that, and they can easily collaborate.”
As the team began collaborating, members got more insights on the best ways to model different data, what to discard, and how best to display their insights to end users.
This granted the team ownership over their data, enhanced by their new ability to easily check lineage and documentation with dbt.
“I can now say that I spend only about one percent of my time on maintenance,” Gabriel said, “because I now have 12 people to maintain their own data.”
Once the new data stack was in place, it didn’t take long for the benefits delivered by dbt to filter out from the data team to the rest of the company.
The easy integration between dbt and Inventa’s Hex and Snowflake systems allowed the data team to meet stakeholders where they are.
“The whole company benefits,” Gabriel explained. “A lot more people across the business are using the data because they can see more value. As a result, the whole company has started to do more analysis and be more data-oriented.
“And if they are not, we are saying: ‘Look, we have this beautiful data set that you can use to test your assumptions and hypotheses. Why don’t you try and use it?’”
This new approach has helped Inventa to run more efficiently and reliably than ever. Where the data pipeline was once confusing and prone to errors, it’s now viewed as a trustworthy source of valuable insight.
“One thing I used to hear a lot was ‘can I trust this data?’” Gabriel said. “Now they know; if it’s there in the warehouse, you can trust it because it’s been tested, and we’ve done a lot of work to communicate our data quality.
“That is incredibly valuable.”