Analytics engineering and data engineering: Do you need both?

May 23, 2024

More companies than ever are adding analytics engineers to their data teams. Should you follow suit? We’ll dive deep into what differentiates data engineering from analytics engineering—and when you should consider developing both practices side-by-side.

What is data engineering?

Data engineering is the engineering practice that designs, builds, and maintains the infrastructure required to store, query, and analyze data. It focuses on a few key responsibilities, including:

Building new data infrastructure capabilities

Data engineers build the data infrastructure on which all users depend. This includes creating data storage facilities such as databases, data warehouses, and data lakes. It also includes self-service tools for managing data, as well as the infrastructure for securing and governing data.

Enabling and managing data pipelines

This involves creating the foundation for all data pipelines: how data transformations are coded, how pipeline execution is orchestrated, how errors are handled, how runs are monitored, and so on. (It may also involve creating the pipelines themselves, especially when an organization’s data practice is just getting started.)
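As a rough illustration of what such a foundation handles, here is a minimal sketch in plain Python. It is not how dbt or any particular orchestrator works. The `run_pipeline` function, the step names, and the retry count are all hypothetical, chosen only to show ordered execution, error handling, and logging in one place.

```python
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("pipeline")


def run_pipeline(steps, data, max_attempts=3):
    """Run each (name, function) step in order, retrying a step that fails.

    Hypothetical helper for illustration only: real orchestrators add
    scheduling, dependency graphs, alerting, and much more.
    """
    for name, step in steps:
        for attempt in range(1, max_attempts + 1):
            try:
                data = step(data)
                log.info("step %s succeeded on attempt %d", name, attempt)
                break
            except Exception:
                # Record the failure so it shows up in monitoring.
                log.exception("step %s failed on attempt %d", name, attempt)
                if attempt == max_attempts:
                    raise  # Give up and surface the error to the caller.
    return data


# Two toy transformation steps (assumed for the example).
steps = [
    ("drop_empty", lambda rows: [r for r in rows if r]),
    ("uppercase", lambda rows: [r.upper() for r in rows]),
]

result = run_pipeline(steps, ["a", "", "b"])
print(result)
```

The point of the sketch is the division of labor: the runner owns ordering, retries, and logging once, so each new pipeline only has to supply its steps.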

Building custom data integrations

In other words, writing custom code to handle legacy internal systems, currently unsupported file types, or data from third-party APIs.

Optimizing data storage and query execution

Data engineers keep an eye on data read/write performance so that business users can focus on solving business problems instead of fine-tuning their SQL queries.

What is analytics engineering?

By contrast, analytics engineering focuses on providing clean data sets to the end users on their teams. Unlike data analysts, who analyze data, analytics engineers spend their time transforming, testing, deploying, and documenting data.

An analytics engineer’s typical tasks include:

Providing clean, transformed data

Analytics engineers primarily create new data pipelines using standardized tools to provide data analysts and other business users with high-quality, relevant data for their business needs.

Maintaining clean analytics code

Managing data transformations at scale requires maintaining a clean codebase. This means checking code into source control, versioning code and data transformation models, creating and running data transformation tests, ensuring code is DRY by bundling reusable components into packages, and using Continuous Integration and Continuous Deployment (CI/CD) to automatically and safely ship data changes to production.
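To give a sense of what "data transformation tests" can look like, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not dbt's own testing mechanism. The `clean_orders` transformation, the two check functions, and the sample rows are all hypothetical.

```python
from collections import Counter


def clean_orders(rows):
    """Hypothetical transformation: drop rows without an order_id
    and normalize the status field."""
    return [
        {**row, "status": row["status"].strip().lower()}
        for row in rows
        if row.get("order_id") is not None
    ]


def check_not_null(rows, column):
    """Return the rows where `column` is missing or None."""
    return [row for row in rows if row.get(column) is None]


def check_unique(rows, column):
    """Return the values of `column` that appear more than once."""
    counts = Counter(row[column] for row in rows)
    return sorted(value for value, n in counts.items() if n > 1)


# Sample input, invented for the example.
raw_orders = [
    {"order_id": 1, "status": " Shipped "},
    {"order_id": 2, "status": "PENDING"},
    {"order_id": None, "status": "pending"},
]

orders = clean_orders(raw_orders)

# Each check returns the offending rows or values; an empty list means it passed.
assert check_not_null(orders, "order_id") == []
assert check_unique(orders, "order_id") == []
```

Checks like these are the kind of thing a CI job could run on every proposed change, so that a broken transformation is caught before it reaches production.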

Maintaining documentation and data definitions

Documentation helps data users and engineers understand what purpose data serves, where it comes from, and how it was derived.

Training business users on tools

Analytics engineers bridge the gap between data engineers and business users by using brown bags, one-on-one training, and other teaching modalities to show data analysts and others how to find data and leverage it in reports.

Do you need both analytics engineering and data engineering?

The short answer is: probably. The long answer involves explaining why the analytics engineering role exists in the first place.

In the old days (we’re talking pre-dbt here), data engineers weren’t just responsible for data infrastructure. They also took requests from business users to create new data pipelines and address data quality issues.

As data volumes began to spike, this strategy proved untenable. Data engineering teams found themselves with months-long backlogs. And business users had to wait weeks or longer to get access to the data they needed.

At the time, this was perhaps a necessary evil. Data pipelines were complicated and often required specialized knowledge.

Starting in the 2010s, however, we saw an onslaught of new data management technologies. Cloud-based data warehouses like Snowflake, data pipeline services and tools like dbt, and easy-to-use Business Intelligence (BI) tools like Looker and Mode meant more business stakeholders than ever could find, transform, and use data.

The analytics engineering practice grew due in large part to this advanced toolset. Tools like dbt Cloud that make it easy to transform data using standard SQL or Python code made data pipelines both less complicated and less fragile. This meant that analytics engineers could take on some of the workload that previously burdened data engineering teams.

Introducing an analytics engineering practice provides numerous benefits for everyone in your data ecosystem, including:

  • Reduces data engineering backlogs
  • Improves data velocity
  • Improves data quality
  • Increases evolution of data infrastructure

Reduces data engineering backlogs

When all requests for new data projects or fixes must go through the data engineering team, that team becomes a bottleneck. We’ve seen this time and again as our customers’ data needs skyrocket.

Adding analytics engineering means adding a practice focused solely on creating new data pipelines and transformations. Instead of relying on data engineering for data changes, a team can hire an analytics engineer whose sole responsibility is managing the team’s data needs.

Improves data velocity

Data engineering team members juggle multiple responsibilities. That means there’s often a long lag between a request and its fulfillment.

This can lead to longer-than-expected delays when rework is required. If a business user has an issue with something the central data team delivered, it means making yet another request—and returning to the back of the queue.

By contrast, an analytics engineer’s sole focus is developing new data sets for their teams. That means they can ship changes to their users in a shorter timeframe than a central data team—usually days instead of weeks or months.

Improves data quality

Another issue with a centralized data team is that they usually don’t have domain expertise in a specific team’s data. Since they respond to requests from multiple teams, they sometimes have to make their best guesses when it comes to the contents of tables, the format of individual fields, and the calculations required for numeric fields.

Analytics engineers, on the other hand, make it their jobs to understand a team’s business model and data needs in detail. This makes them more efficient at implementing their team’s requirements. Analytics engineers embedded with their teams can also get answers to hard questions more easily than members of a centralized data engineering team.

Increases evolution of data infrastructure

Data engineers can absolutely build data pipelines. That doesn’t mean it’s the best use of their time. Most data engineering teams would rather focus on providing new capabilities for analytics engineers and data analysts.

When analytics engineers take on more data pipeline work, it frees data engineers up to focus on infrastructure. That means data engineering teams can tackle burning issues such as creating self-serve tools for initializing new data products, improving the company’s data governance capabilities, improving data query performance, and optimizing data pipelines to achieve the lowest possible cost.

Since infrastructure improvements are available to the entire company, this work often has a high return on investment. For example, consider a change that cuts 10 minutes from each CI/CD job that moves data changes to production. In a company with hundreds of data pipelines, those minutes add up to significant time and cost savings.
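To make that concrete, here is a back-of-the-envelope calculation. The 10 minutes comes from the example above. The count of 300 pipelines (standing in for "hundreds") and the two CI/CD runs per pipeline per week are assumptions chosen only for illustration, so substitute your own numbers.

```python
def weekly_minutes_saved(pipelines, runs_per_pipeline_per_week, minutes_saved_per_run):
    """Total CI/CD minutes saved per week across all pipelines."""
    return pipelines * runs_per_pipeline_per_week * minutes_saved_per_run


# Assumed inputs (not from any real deployment).
PIPELINES = 300                 # "hundreds of data pipelines"
RUNS_PER_PIPELINE_PER_WEEK = 2  # hypothetical deployment frequency
MINUTES_SAVED_PER_RUN = 10      # the improvement in the example above

saved_minutes = weekly_minutes_saved(
    PIPELINES, RUNS_PER_PIPELINE_PER_WEEK, MINUTES_SAVED_PER_RUN
)
saved_hours = saved_minutes / 60

print(f"{saved_minutes} minutes saved per week ({saved_hours:.0f} hours)")
# prints "6000 minutes saved per week (100 hours)"
```

Under these assumptions, a single infrastructure change returns about 100 hours of CI/CD time every week, which is why this kind of work tends to pay for itself quickly.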

Conclusion

If your data engineering team is beset with long queues chock full of data pipeline work, it’s time to consider launching an analytics engineering practice. Analytics engineers can produce high-quality data pipelines in less time and with fewer delays. That frees up your data engineers to invest in new data platform capabilities that benefit the entire company.

An analytics engineering practice is only possible if you have a solid data platform supporting it. With dbt Cloud as your data control plane, your analytics engineers have a standardized and cost-efficient way to build, test, deploy, and discover analytics code. Meanwhile, data consumers have purpose-built interfaces and integrations to self-serve data that's governed and actionable.

Learn more about how to launch an analytics engineering practice with dbt Cloud—ask us for a demo today.

Last modified on: Dec 13, 2024
