How Loft created organizational alignment while scaling rapidly
Loft is a Brazilian real-estate startup that's been on a tear. The company, just three years old, has been valued by investors at over $2B. It operates 40,000 live listings spanning Brazil's major capital cities, and it saw 500% growth in 2020 alone.
At the heart of Loft’s success — the magic ingredient permeating everything they do — is data. They consider the data they generate on the real estate market to be “a great living thing,” to be carefully nurtured and constantly improved. By collecting and analyzing swathes of data, they aim to provide a home-buying experience that’s as painless and predictable as possible.
Felippe Felisola Caso is a Business Analytics Manager at Loft, tasked with equipping Loft’s internal teams with the data they need to make effective decisions.
The key challenge he faced on joining was one of scalability. As Loft rocketed from a few dozen employees to over 1,000, how would he make sure the organization could make data-driven decisions at every level?
“We’d gone from 0 to 1 on the analytics team. Now we were starting the journey to go from 1 to 100,” says Felippe. “The way we’d structured our data, our tables, our processes, everything was really starting to mount up.”
The company had built an impressive machine learning engine, allowing them to provide property valuations and facilitate transactions in the Brazilian market at scale. Could they build an internal analytics engine to match it?
Why Loft needed a new approach to their internal analytics
Emblematic of Loft’s problems was the lack of clarity around core business definitions.
“At Loft we had a lot of different concepts that were referred to by the term ‘portfolio.’ For some in the business, it meant groups of listings. For others, it was the units themselves. And following from that, our own tables and reporting weren’t clear on this either.”
Without a shared understanding of how metrics were defined, maintaining internal data products quickly became a nightmare. “We were not sure what models were being created, or if the same concepts were being used the same way in each model.”
Compounding the confusion, another major obstacle Felippe had to help his team overcome was that dashboards relied on by internal consumers would frequently break. The team was constantly putting out fires.
“We were modeling data in Databricks notebooks directly; there was no review process and there was no version control,” Felippe said. “Everything in Looker was breaking once an hour. Everything would break when someone updated a table.”
Downstream dependencies were impossible to keep track of. To stay above water, his team developed a Google Sheet that read data from notebooks and showcased dependencies. This process, however, soon became a burden to maintain as well.
Meanwhile, the lack of visibility on dependencies was creating other issues too, less urgent but still costly: builds just weren’t very efficient.
“In the old version based on notebooks, we had a view with the definition of the table. We just ran CREATE TABLE AS and selected everything from that view. That used to take easily 45-50 minutes for each business domain, of which we had 6, all running a lot of resources in parallel. It was very intensive.”
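The full-refresh pattern Felippe describes can be sketched roughly as follows (the table and view names here are illustrative, not Loft's actual schema):

```sql
-- Old notebook pattern (illustrative): drop and rebuild the entire
-- table from a defining view on every run, with no incremental logic
-- and no awareness of which downstream objects depend on it.
CREATE TABLE analytics.listings AS
SELECT * FROM analytics.v_listings_definition;
```

Because every domain rebuilt everything from scratch this way, each run paid the full cost of recomputing shared logic.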
The world with analytics engineering
dbt’s way of working was a boon for Felippe and his team. The analytics engineering workflow — with clear documentation and transparent lineage, regular testing and native version control — made it possible to spark an unprecedented level of alignment across the organization.
“Now I can point to something and tell someone ‘that’s not a portfolio — that’s a listing! And here’s how you can get that definition in the future.’”
Just as importantly, Felippe no longer finds himself spending entire days attempting to diagnose and solve data pipeline issues. Things rarely break, and when they do, it’s immediately clear why and satisfyingly simple to address.
“dbt has provided an environment in which we can define concepts, define sources, and so on. Now we know: where does this data come from? Where does it go? Where is it consumed?”
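Those questions are answerable because dbt models declare their inputs explicitly. A minimal sketch of the convention, with hypothetical table and column names (not Loft's actual schema):

```sql
-- models/staging/stg_listings.sql (illustrative)
-- {{ source() }} records where raw data enters the project;
-- downstream models reference this one with {{ ref('stg_listings') }},
-- letting dbt build the full dependency graph and lineage docs.
select
    listing_id,
    city,
    asking_price
from {{ source('real_estate', 'raw_listings') }}
```

With every model wired together through `source()` and `ref()`, "where does this data come from, and where is it consumed?" becomes a lineage lookup rather than detective work.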
These questions are now possible to answer in seconds, allowing the analytics team to focus on higher leverage tasks instead of troubleshooting broken dashboards.
Like what? For starters, all 30+ of Loft’s data analysts are now able to contribute to data modeling in SQL through the dbt Cloud IDE.
“dbt has made modeling accessible. Few folks on my team have deep knowledge of Python or software engineering. Business analysts can now model data themselves.”
Best of all, all this added functionality hasn’t come at the cost of efficiency. Quite the opposite, in fact.
“With the intelligence dbt brings to the table we know that if everyone needs this portfolio table, we can materialize it just once and then have all its dependencies run. Now, to run every model in production, our run times look like 50 minutes for everything in total without parallel executions. And we’re actually pretty confident we can reduce that much further.”
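The "materialize once, reuse everywhere" pattern Felippe describes maps to dbt's materialization config. A hedged sketch, with a hypothetical model name:

```sql
-- models/marts/portfolio.sql (illustrative)
-- Persist the shared portfolio logic as a physical table once;
-- every downstream model selects from it via ref() instead of
-- recomputing the same joins and filters.
{{ config(materialized='table') }}

select
    portfolio_id,
    count(*) as listing_count
from {{ ref('stg_listings') }}
group by portfolio_id
```

Because dbt knows the dependency graph, a command like `dbt run --select portfolio+` builds this table and then everything downstream of it, in order.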
Loft’s tech stack
The data team at Loft uses Stitch as well as some in-house pipelines to ingest data into a Databricks Lakehouse, where dbt handles transformation. Looker is used for BI.
Given his goal of creating alignment across the organization, Felippe has been particularly satisfied with what Databricks has made possible. “Everything is plugged into Databricks. It all lives in one place and it’s all access controlled. We don’t have to worry about writing to a separate data warehouse or a separate cloud and pulling it into Looker. Having everyone in the same environment and accessing the same version of the same data, every time, is huge.”