Congratulations! You’re a manager! And, you’re building a data team! How exciting! Now what?
When thinking about building a data team at a small or medium-sized organization, there are primarily four buckets of roles that you want to consider: data engineer, analytics engineer, data analyst, and machine learning engineer.
While the data scientist title is trendy, it is also ambiguous. Some data scientists are data analysts, and some are machine learning engineers. Avoiding that sort of catch-all will help manage expectations both within your organization and with the candidates you’re interviewing.
In the past 8 months, I have brought 6 people into the data organization at Netlify.
What follows is from my experience hiring for many data roles, across different companies of various sizes.
Data engineers are the people who move data from outside of your ecosystem into your ecosystem. They are responsible for your infrastructure and data plumbing.
Responsibilities for these folks might include: keeping your dbt instance on the latest version, managing Snowflake permissions, managing and writing Airflow pipelines, and maintaining the CI/CD pipeline for your repo.
Analytics engineers are the folks who work with data within your ecosystem and prepare it for analysis. You can think of them as data librarians.
Part of the nature of being a growing organization means that everyone has to be willing to do different kinds of work, including analyses. For example, if UI-based data sync tools (Fivetran, etc.) are available for your data sources, analytics engineers may perform the work of data engineers to sync data to the warehouse. Analytics engineers will often own the design of the data warehouse.
The hardest part of data is managing stakeholders, so in many ways data analysts have the hardest job.
The work of analysts can be split along a 2x2 characterization: projects can be circular or linear and proactive or reactive.
As a leader, it is your goal to maximize the amount of time that your analysts are producing proactive work. Let’s look at some examples of projects in each bucket:
| | Circular | Linear |
|---|---|---|
| Proactive | Iterating over your user activation moment, to identify the best leading indicator for conversion. | Understanding what caused a spike in traffic to a specific web asset on a given day in the past. |
| Reactive | Answering a question for an executive, that leads to another question. | Answering a usage question for a PM, so they can prioritize which feature to build next. |
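If you want to track how much analyst time lands in each bucket, the 2x2 can be sketched as a small data structure. This is only an illustration: the `Project` class, the `proactive_share` helper, and the hour figures are hypothetical and not taken from any real team's log.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Project:
    """One analyst project placed on the 2x2 (hypothetical structure)."""
    name: str
    proactive: bool  # True = proactive, False = reactive
    circular: bool   # True = circular, False = linear
    hours: float     # time spent; the values below are made up


def proactive_share(projects):
    """Return the fraction of total hours spent on proactive projects."""
    total = sum(p.hours for p in projects)
    if total == 0:
        return 0.0
    return sum(p.hours for p in projects if p.proactive) / total


# Illustrative log using the four example projects from the table.
projects = [
    Project("Activation moment iteration", proactive=True, circular=True, hours=12.0),
    Project("Traffic spike investigation", proactive=True, circular=False, hours=4.0),
    Project("Executive follow-up questions", proactive=False, circular=True, hours=10.0),
    Project("PM usage question", proactive=False, circular=False, hours=6.0),
]

share = proactive_share(projects)
print(f"Proactive share: {share:.0%}")
```

Watching that number over a few months is one rough way to see whether the team is drifting toward reactive work.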
Machine learning engineers focus on building machine learning models and deploying them to production.
Especially at the early stages of a team, it is important to have a person who can both build and deploy what they create.
Consider whether your team will support both R and Python or just one.
Hiring, like sales and marketing, is all about the funnel; you are selling candidates on the opportunity of joining your team.
Very strong job descriptions are a crucial first step. I recommend job descriptions have five parts:
Many job descriptions don’t have these things, but candidates really appreciate them.
For data roles, it’s a job seeker’s market. Investing in thorough job descriptions will help you stand out from the crowd and help ensure a strong candidate pipeline.
Give the candidate an opportunity to understand the company, your goals for this role, and how they will fit into the team.
What does the team already have and what is the need that you are filling?
Is this someone who’s going to focus on a specific domain or subject area? Let them know up front.
This is also a great time to paint a picture of your stack and sell the candidate on your business.
What are the hard requirements for your role? (Hint: a college degree shouldn’t be one.)
Do you need someone with experience with certain technologies or frameworks?
Your requirements list should be as specific as needed but should not be a laundry list.
For example, if you use Airflow, Airflow experience doesn’t need to be a requirement, but you might decide that orchestrator experience does, so you’d also consider a candidate who has used Luigi, Prefect, or Dagster. If that’s the case, call out “Experience with data orchestration tools” instead of “Experience with Airflow.”
Try to keep your list of requirements to 5 to 10 bullet points. A short list of real requirements is better than a long list of fungible ones.
If you have additional “Nice to Haves,” make that a separate list. Women are less likely to apply for jobs if they feel they don’t meet all of the requirements. Help ensure a strong, diverse pipeline by keeping your list of requirements to only requirements.
What are the things that a candidate will actually do if they move into this role?
Try to be as specific as possible. This is your opportunity to paint a picture for a candidate.
I always mention in interviews that we are looking for “floor sweepers” — people who are not afraid to pick up a broom and sweep a pile of dust on the floor if it’s in front of them, even though it’s not in their job description.
A list of responsibilities is not a list of everything the candidate will be doing, but it is your opportunity to present what an exciting role this will be.
You should tell candidates upfront exactly what the steps in the interview process are.
Let me emphasize: You should tell candidates upfront exactly what the steps in the interview process are.
Nobody likes to be in the dark.
Tell candidates exactly how many calls they need to do, how long they will be, and who they will be with.
Is there a technical assessment? Include that information too.
If you cannot write this before posting a role, you have not spent enough time thinking through your hiring process.
Starting a new job is nerve-wracking.
Laying out a 90-day plan on how candidates will ramp into a role helps establish standards for performance. It affirms to the candidate that you have thought about what success looks like and helps set their expectations.
As in data projects, the time to set clear measures of success is before you invest time and energy, not after.
Hiring is no joke and takes real effort, but investing in the process up front pays long-term dividends.
Let’s take a look at example job descriptions for those four roles, to see how this breaks down in practice:
Included in all roles. This is specific to your company, your team, and your needs. This is your opportunity to sell yourself to the candidate.