The changing role of the analyst: Getting closer to the data source

Kathryn Chubb

on Jun 05, 2025

As the demand for high-quality data accelerates, driven in large part by generative AI, data teams face mounting pressure to deliver more, faster. To meet it, many practitioners - from business-savvy analysts and power users to data scientists - are turning to self-serve tools, such as low-code data transformation pipelines and AI-powered tooling that queries data sources directly. That, however, creates tension between speed and good data governance. In any given company, there are usually multiple data analysts per data engineer, which makes it unrealistic for every data request to flow through the data engineering team. With long request queues and competing priorities, engineers simply don’t have the bandwidth to support every ad-hoc query or model analysts need.

This means more analysts are using self-serve tooling, and in many cases, spinning up their own data marts to work around engineering bottlenecks. But when these workflows live outside of governed pipelines, relying on inconsistent logic and unstructured data assets, they introduce serious risks, including duplicated work, data security concerns, and rising cloud costs from redundant or unmanaged assets.

The solution isn't more dashboards. It's empowering the right analysts with the right self-service tools, without sacrificing governance. We'll look at how the role of the analyst is changing in response to this demand, and how companies can use governed collaboration to increase data velocity without compromising on quality.

The evolving role of the analyst

Data analysts are increasingly taking a more active role in shaping data within their companies. This is driven by both business demands for data and changes in the underlying technology. As a result, analysts are:

  • Getting closer to raw and modeled data sources
  • Becoming more familiar with data tooling
  • Incorporating AI

Let’s take a look at each of these areas in detail and what’s driving them.

Getting closer to raw and modeled data sources

Data quality is still a leading concern across industries. In dbt Labs’ 2024 State of Analytics Engineering report, we found that 57% of data professionals cited data quality as their largest data-related issue, up from 41% in 2022.

Companies expect analysts to be more than passive consumers of data. Analysts are increasingly expected to have the skills and tools to verify that the datasets they work with are accurate, up to date, and properly cleaned for business use.

The need for more high-quality data is also driving analysts to seek out useful data sources they can incorporate into their work. Data silos - islands of data that are independent from, and often incompatible with, more governed and highly structured data - remain a vexing issue for most companies. Data analysts play a pivotal role in finding and transforming this data so that it’s compatible with the company’s governed datasets.

Becoming more familiar with data tooling

In the past, data pipelines were solely the province of data engineering teams. They were often written in different languages, hidden away in stored procedure code in a database or data warehouse. They were as hard to find as they were to use and manage.

Today, with tools like dbt, anyone with knowledge of SQL or Python can contribute to data transformation code. dbt provides a common and governed approach to data transformation backed by software development best practices like documentation, version control, and testing.
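As a minimal sketch (the model and column names here, like stg_orders and fct_daily_revenue, are hypothetical), a dbt model is just a SQL SELECT statement kept under version control:

```sql
-- models/marts/fct_daily_revenue.sql
-- A hypothetical dbt model: plain SQL, versioned and reviewed like any code.
-- ref() resolves the upstream staging model and records lineage automatically.
select
    order_date,
    region,
    sum(order_total) as daily_revenue
from {{ ref('stg_orders') }}
where status = 'completed'
group by order_date, region
```

Because the model lives in a Git repository, changes flow through the same review, testing, and deployment process as any other code.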

As a result, analysts are becoming more familiar with the technical tools required to create and maintain data pipelines, including source control systems such as Git. That enables data engineers and analysts to collaborate on analytics code, data tests, documentation, and data metrics in ways that weren’t previously possible.

In other words, analysts, who were always quite technical, are becoming even more so.

Incorporating AI

dbt Labs co-founder Tristan Handy has noted how AI is disrupting the way we do data engineering. The advent of AI means that analysts can do more and do it more quickly than ever before:

  • Beginner analysts can query data using natural language prompts to a large language model (LLM), which translates their requests into SQL and runs the resulting queries against the source systems (see the sketch after this list)
  • Experienced analysts can use AI to draft complex queries that might otherwise take significant time to write and debug
  • All analysts can leverage AI to generate boilerplate code for new data pipelines and tests, as well as base documentation for data models
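For illustration, a prompt like “Show me monthly revenue by region for 2024” might be translated by an LLM into SQL along these lines (fct_daily_revenue is a hypothetical model, not a real dataset):

```sql
-- Hypothetical SQL an LLM might generate from the prompt
-- "Show me monthly revenue by region for 2024."
select
    date_trunc('month', order_date) as revenue_month,
    region,
    sum(daily_revenue) as monthly_revenue
from fct_daily_revenue
where order_date between '2024-01-01' and '2024-12-31'
group by 1, 2
order by 1, 2
```

The analyst still has to review the generated SQL - AI output is a draft, not an answer.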

None of this, of course, means that an analyst can rely solely on AI to create high-quality reports and data products. AI augments a skilled analyst by helping them do more in less time.

The challenges that analysts face

All this means that, more than ever, data analysts can dive headlong into data and find the answers they need without waiting on an already overtaxed data engineering team. However, analysts also run multiple risks when dealing directly with ungoverned and unstructured data:

No mechanisms to ensure data quality. Data stored in multiple systems often isn’t rationalized or harmonized. It may exist in different formats across different data stores. Key data values - e.g., revenue - may even differ from system to system, leading to doubts around which system is the “source of truth.”
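One way such discrepancies surface is through a simple reconciliation query. As a sketch, assuming two hypothetical per-system tables, crm_revenue and billing_revenue (exact syntax varies slightly by warehouse):

```sql
-- Hypothetical reconciliation check: compare monthly revenue as reported
-- by two ungoverned systems and show where (and by how much) they disagree.
select
    coalesce(c.revenue_month, b.revenue_month) as revenue_month,
    c.total_revenue as crm_revenue,
    b.total_revenue as billing_revenue,
    c.total_revenue - b.total_revenue as discrepancy
from crm_revenue c
full outer join billing_revenue b
    on c.revenue_month = b.revenue_month
where c.total_revenue is distinct from b.total_revenue
```

Without governance, every analyst who hits this mismatch has to re-derive which system to trust.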

Missing (or unavailable) metadata. Ungoverned data often lacks appropriate or complete metadata - data about data. This can include technical metadata (tables, columns, data types, relationships, last update time, upstream source) as well as business metadata (owner, description, method of calculation, business meaning, and usage). Without metadata, it can be difficult to tell who’s responsible for a given dataset or how certain values were calculated.

Documentation is light or nonexistent. A critical form of metadata is documentation about the meaning and purpose of a given dataset, and it’s essential for collaborating across roles. Without a tool that supports documenting datasets as part of the data model, however, this rich metadata often goes uncaptured.

The tools that data analysts can use to collaborate on governed data

In the end, data analysts are concerned primarily with delivering high-quality data and insights to their stakeholders as quickly as possible. It’s the job of a company’s data engineering and central governance teams to set standards and monitor data to ensure that this data is well-governed, secure, and compliant.

With the right tools, however, data analysts and the company’s data governance team can work together to turn ungoverned and unstructured data into governed, structured data sets that set new bars for quality, security, and compliance.

dbt is a data control plane that centralizes your analytics workflow metadata so that teams can ship and use trusted data, faster. Using dbt, analysts and data governance specialists can collaborate on creating governed data sets by leveraging:

Easy model building. dbt Canvas is a visual tool that any analyst can use to contribute to data models. Using Canvas’ visual, drag-and-drop experience and built-in AI, analysts can create model changes that compile to SQL with all the benefits of dbt, including version control, orchestration, and discovery.

Discovery. dbt provides access to all of a company’s data transformation models and associated metadata via dbt Catalog. This feature provides a full view of your data estate, including non-dbt data objects in Snowflake. Engineers, analysts, and business decision-makers can collaborate on code and documentation as part of a single collaborative workflow.

dbt supports writing documentation as an intrinsic part of each data model. Once a data pipeline is pushed to production, analysts can find a governed dataset, examine its metadata, and read its associated documentation before putting it to use.
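In dbt, that documentation lives alongside the model in a YAML properties file. A minimal sketch for the hypothetical fct_daily_revenue model from earlier, combining descriptions with basic tests:

```yaml
# models/marts/fct_daily_revenue.yml -- hypothetical properties file
version: 2

models:
  - name: fct_daily_revenue
    description: "Daily completed-order revenue, aggregated by region."
    columns:
      - name: daily_revenue
        description: "Sum of order totals for completed orders on that day."
        data_tests:
          - not_null
      - name: region
        description: "Sales region the orders are attributed to."
        data_tests:
          - not_null
```

These descriptions are what dbt Catalog surfaces when an analyst discovers the model later.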

Data insights. With dbt Insights, analysts can freely query data against all models available to them via dbt Catalog. Analysts can write SQL queries from scratch or use AI to generate new queries from natural language prompts. This expands data access to users regardless of their technical skills. Insights is available in dbt Catalog, Canvas, Studio IDE, and Semantic Layer.

Data lineage. Data lineage provides a visual map of the journey that data takes through your company’s data estate. dbt automatically generates and ships data lineage for all of your data models.

Using these data lineage maps, analysts can answer questions about the origin and downstream dependencies of data without filing a support ticket against the data engineering team. If analysts detect an issue in the data, they can report it, and data engineers can use lineage to find and fix the error upstream.
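For example, dbt’s graph selectors let anyone walk the lineage from the command line; using the hypothetical model from the earlier sketches:

```shell
# List the model plus everything upstream (+model) and downstream (model+).
dbt ls --select +fct_daily_revenue+
```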

AI-powered workflows. dbt Copilot is our AI-powered solution that supports engineers, analysts, and business users at every step of the data lifecycle. Analysts can leverage Copilot to write SQL queries, create new data tests, and generate documentation for dozens or even hundreds of data models.

To learn more about how dbt supports governed collaboration for tomorrow’s data solutions, request a demo today.
