Building a data team from the beginning

last updated on Jan 27, 2025
This post first appeared in The Analytics Engineering Roundup.
Daniel Avancini is the chief data officer and co-founder of Indicium—a fast-growing data consultancy started in Brazil.
There are a lot of data consultancies around the world, and a lot of them do great work. What has been so fascinating about Indicium’s journey is their HR model. Rather than primarily hiring experienced professionals, they decided to go hard on training. They built a talent pipeline with courses and an internal onboarding process that takes new employees from zero to 60 over a few months.
The result has been phenomenal. Indicium delivers great client outcomes and, most importantly, is building skills for hundreds of brand-new data professionals.
Data is a hard field to break into because fundamentally you can't do the real thing unless you have access to data. So any company investing in building scalable hiring and training processes for analytical talent is one to be excited about.
Key takeaways from this episode
Can you give a little bit of an introduction to you and to Indicium?
Yeah, sure. So I'm the co-founder and CDO of Indicium. We're a data consultancy. We're now based in New York, but we also have a presence in Latin America, including Brazil, where we started. We mostly focus on the data stack and new data stack tools. We've been helping companies use modern data platforms and move to new data stack tools, including dbt, for about seven years. We are a young company, but not that young in the modern data stack world.
Tell me about your journey starting in Latin America to expanding internationally.
We started in a small city in Brazil called Florianopolis. It's a tech center, a bit like San Francisco. There are many new companies there, but it isn't a hub for business data consulting. We really started from the beginning; we started with small and mid-sized regional companies, really trying to find something that made sense. So we pivoted a lot in the beginning on how we could deliver value.
You skipped the step “we wanted to start a company.” What was the original idea?
I was working for a startup in agricultural hardware machinery. Nothing related to data or services in general. My cofounder was managing a surfboard manufacturing factory.
That is wild. I love that. You can come to data from so many different backgrounds, including surfboard manufacturing.
It's a beach town, so there's a lot of surfing there. We realized at the beginning that there were a lot of technology platforms for marketing analytics, data intelligence, SaaS tools. But when we talked to anyone that was making decisions, no one was really using that data.
Our first insight was there’s a need for someone in this market to bridge the gap and bring all this really great data to companies in a more organized way and in a value-driven way. At first that was our goal.
We were not focused on building a data platform consultancy. But as we grew, we found out that it's harder than we thought. We needed to do a lot of foundational work, especially at smaller companies. All they had were SQL databases and a lot of Excel spreadsheets, and their data was just not ready for a lot of the complex analysis we wanted to do. So we helped these companies build platforms and foundations.
And it took off like a rocket ship? What's a sense of your scale that you want to share?
Yeah, we're at almost 400 employees right now. So we're pretty big for this market.
What has allowed you to become successful at the scale that you have been?
I think what really helped us scale is that since the beginning, we have really focused on building our own teams and our own capabilities. Even before we started using dbt or any of the modern data stack, we already knew that, because we were in a smaller market, we couldn't compete for data engineering talent.
At that time in Brazil, and probably in the US too, it was a very competitive market for data engineering in general. There was a data engineering talent pool in Brazil, but it was expensive.
But there were a lot of training programs on the internet. There are a lot of data camps, Udemy, Coursera. There's so much good stuff there. But maybe there's a lack of curation, right? People want to work in this area, but how do they start? What do they have to do? So we really focused on building our own talent pool right from the beginning. We were at the university recruiting good people from engineering, from economics, from business. “Hey guys, do you wanna work with data? Look at this program; it's free. Just go there and train.” We would bring in 10 or 15 people. They would come to the program, and we would hire the two or three best ones.
Maybe four years ago we were starting to grow faster and we needed more people. We needed a more stable source of talent.
First, we built our own analytics engineering course. We found dbt and realized this is the way we grow because we don't need to hire experienced data engineers. We can hire experienced marketing analysts and train them.
It's such a consulting hack, right? I've been excited to hear your story because I think it is so parallel to our own story. We were doing a similar thing in that we were hiring people with no data experience.
We can grow much faster because we can hire analysts in general. We can train anyone. It was so hard to train people on Airflow and Spark at that time. But if we use dbt, we can just teach these guys how to work with data analytics. And so I built the Analytics Engineering Formation, our first course. And what we did in this course wasn't only dbt. We taught dimensional modeling, ETL, and a lot of foundational analytics work that we weren't seeing when we were trying to hire people.
Everyone at that time wanted to be a data scientist. But that's not the work. For every data scientist, you're going to find 30 analytics engineers because there’s so much more work with analytics.
We've trained more than a thousand people with this course in the past four or five years. And we hired a lot of those people. They would do the course and then we'd say, yeah, we have an open position. Do you want to work for us? And so we started really hiring from this course for the analytics engineering profession.
And it really worked. And we still use the same course today for our own team. Everyone has to do the course so they understand what we do. There's a practical exam, so they need to build their own data warehouse with dbt by themselves.
My best guess is that there's probably a million or so humans in the world that have used dbt pretty regularly. In the grand scheme of things, that’s not a big number when you're a giant consulting organization and you have a huge hiring pipeline. Building a practice that puts dbt at the center of it can work really well, but you have to really build the business model around it.
But I would argue dbt is not the only one. If you think about data science, about data engineering, all these other data professions, it's really hard because there's no undergrad program. People don't graduate in Airflow engineering. Everything they use at work, they learn after they start working.
Yeah. And so the point is you have to build a talent pipeline that teaches people how to do the stuff as opposed to expecting it to already exist.
One of the things that people don't fully understand, unless they've been through this journey, is that it is an unbelievable level of investment to do what you've done. Consulting businesses don't generally raise a ton of venture money. One, it's a real strategic investment, but two, it's a real risk.
If for some reason this doesn't work, that's a giant problem for Indicium, I would imagine. And that means that if you're going to make this type of investment, you have to feel like you have control over the technology that you're choosing. I imagine that it would be very hard for you to make this type of investment in something that was not open source. Is that a true statement?
Probably yes. Especially because a lot of these tools, I can't really pay for the tool when I'm educating and when I'm teaching. Maybe if I have a partnership, but yes, for a lot of the work we needed to use some kind of open-source tool for this work.
That makes total sense. I didn't even think about it from a seats perspective. Let's say that you were gonna use Amplitude or something like that. You would have to figure out how to get, say, 100 people per semester access to Amplitude, and that requires a partnership.
And also we had to build our own courses, because if I used market courses like Udemy, I would have to pay for all these courses for all of these students, and then it becomes too expensive. So we had to invest a lot of our time just building our own training programs.
What other tools did you incorporate into the standard training?
So what we did after a few years is instead of just training dbt and analytics engineering, we created another program we call the Lighthouse program. When we open positions for analytics engineering, we get all kinds of people just because they are engineers. Then we're like, what kind of engineering? “I'm a chemical engineer.” Okay, but do you know anything about data? “No, but I'm an engineer.”
We had so much teaching to do because it's such a new market. People don't know what the work is. A lot of the undergrads still don't understand what an analytics engineer does. So the idea of the program was to be a lighthouse. Like, I'm going to show you the best career for you.
After a person joins the program, we're going to tell them, "You're going to be a data engineer," because of their competencies.
It's like the sorting hat in Harry Potter. And is that about skill sets or personality or interests or what?
Yeah, I really looked into skill sets and personality, and we did some personality trait tests.
Okay, so tell me what's the personality of a data scientist versus an analytics engineer?
Okay, that's a good one. What I look at is how drawn someone is to innovation and new things. You want to build new things all the time, but you also want to build reliable things. The new things side is for data scientists: experimenting, building new things. On the other side of the spectrum are data engineers. I usually put analytics engineers kind of in the middle. Like, I want to build stuff, but I also want to have reliable pipelines. And I want to build things that are closer to the business. And I want to understand the value of what I'm building.
Do you prefer to make something new but unstable, or do you prefer to have something that works every time? Just that question filters the personalities for these professions really well.
I really identify with that so much. I'd be curious to hear where you fall on this spectrum, but I am a deeply impatient person and so I can't stay on one thing too long. I love making pipelines and getting them to a certain point. But then I'm like, okay, let me try something else where I'm learning about the business. Having this bimodality, I think, is what keeps me forever engaged in this work.
Yeah, you should probably ask Matheus, my co-founder, because he always says the same thing, but I'm really closer to the data scientist when I build new things. My background is in economics and statistics and data science. Our CTO is an engineer. He's a data engineer. He's angry if something doesn't work.
Okay, so you developed a course. You developed an ability to funnel people into what? Data scientist, data engineer, analytics engineer?
Now we also have data analysts, which is more on the BI side. It's like an analytics engineer with deeper BI knowledge. Analytics engineering, data science, AI engineering, data engineering, and we are also adding a data consultant career. We have all these tracks.
And this program is now a six-month program, and we are paying for them to study. So that's also very risky for us.
I'm glad you said it's risky. One of the things that I think you don't recognize until you run a consulting business is that it is terrifying to face attrition. Attrition is the thing that kills your business. People will quit. Life happens. This is the world.
But when somebody quits, it's not only revenue walking out the door, but it is also your investment in them as a human walking out the door. One reason why I think it is very rare for companies to invest in people is that they are going through this J curve where they are not making money at first. Then you're slowly working your way out of it.
I don't know if you've done the math, but I think you could figure out how much you'll pay back for the training you've put in.
For some of these programs, it's not only training. One of the reasons we built the program is that people need to work on something. They need practice. We put them in internal projects.
We have phases. So the first phase is the foundational phase, just learning. We teach them databases, APIs, cloud computing. These things are important for anyone who works with data now, but they aren't a given for someone who just graduated from college or is changing careers.
Then we have theory, or the data journey. That's where they train on dbt and data engineering techniques. And then we gradually put them into projects, into real work. They can shadow someone, they can work on their own projects, and sometimes they can actually work on real projects. We can even bill some clients for their work. Most times we are able to pay for the program with the work they do inside the program. So we break even before they graduate from the Lighthouse.
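To make the break-even idea concrete, here is a minimal sketch of the arithmetic. All figures are hypothetical and are not Indicium's numbers: it assumes a fixed monthly cost per trainee and a billable amount that ramps up as the trainee moves from the foundational phase into project work. The function names are also invented for illustration.

```python
from itertools import accumulate


def cumulative_position(monthly_cost, monthly_billable):
    """Return the running net position (billed work minus cost) after each month."""
    net = [billed - monthly_cost for billed in monthly_billable]
    return list(accumulate(net))


def break_even_month(position):
    """Return the first 1-indexed month where the running position is >= 0, else None."""
    for month, value in enumerate(position, start=1):
        if value >= 0:
            return month
    return None


# Hypothetical figures for one trainee over a six-month program.
MONTHLY_COST = 2000                                 # assumed stipend plus training cost
MONTHLY_BILLABLE = [0, 0, 1000, 3000, 4000, 5000]   # assumed ramp in billable work

position = cumulative_position(MONTHLY_COST, MONTHLY_BILLABLE)
month = break_even_month(position)

print(position)  # the dip and recovery is the "J curve"
print(month)
```

With these made-up inputs the running position bottoms out in month three and turns positive in month six, which is the shape of a program that pays for itself before graduation. Different costs or a slower ramp would push the break-even point past the end of the program.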
I really feel like so much innovation is business model innovation. To me, what you're describing is a new business model that lets you invest in more people who know how to do great data work. This is very cool.
Yeah, that's what we thought. You can always invest in technology. But in our work in consultancy, it's really humans, right? How can you get the better humans and how can you get them to stay at your company? How can you keep them, right?
We've been very intentional about creating the social structures that you would find working in an office. We have a very low attrition rate. Compared to the market, our attrition rate is something like six, seven, eight times lower than a competitor's.
And that's compared to Brazil, which has a lower attrition rate than the U.S. Compared to the U.S., it's like 20 times lower.