AI agents and the data lake

Daniel Poppy

last updated on Jan 12, 2026

One of the interesting commonalities of AI and the data lake is that they both require new thinking around how we manage identity. For AI, the big question is how agents interact with underlying data. For the data lake, the big question is how to make open data, stored outside the purview of any single data platform, behave the way you'd expect.

In this episode of The Analytics Engineering Podcast, Tristan talks with Lauren Anderson, who leads the enterprise data platform at identity company Okta. Lauren discusses how identity sits at the center of two seismic shifts in data—AI agents and the open data lake—and why central governance and a shared semantic layer are critical. She lays out how analytics engineers and data engineers should divide responsibilities as agents begin to write a growing share of analytical queries.

A lot has changed in how data teams work over the past year. We’re collecting input for the 2026 State of Analytics Engineering Report to better understand what’s working, what’s hard, and what’s changing. If you’re in the middle of this work, your perspective would be valuable.

Take the survey

Please reach out at podcast@dbtlabs.com for questions, comments, and guest suggestions.

Key takeaways

Tristan Handy: Before we dive into the current day, can you share a little bit about your background and how you came to the role that you're in today?

Lauren Anderson: I've had a 20‑something year career at this point. I have spent basically my entire career in analytics in some way, but my first data job was at a big bank. I won't name it; there are only a few big banks, so you could probably guess. I worked for the finance org and did compensation planning and administration, with a side of sales tracking and analytics. I was part database analyst, part customer support for people who made a lot more money than I did.

I was there for seven, seven and a half, eight years. Towards the end of it, I became the owner and creator and almost business architect for our brand‑new sales tracking data warehouse. At a very young age, I got to think about how relational databases should come together for the outcome of both analytics and reporting—dashboards and whatnot—but also operations, which was paying compensation every month. It got me super excited about this world of data and being able to architect pipelines and the end‑to‑end flow for real‑world outcomes.

What do you think allowed you to be successful in that era? I often think the things that enabled success then aren’t the same as what make data folks successful today.

When I took it over, we ran compensation out of an Access database. I was new, the person who designed it had left, and there wasn't much documentation. It worked the first month, then broke the second, right before a payroll deadline. I rebuilt it as a long series of SQL queries with inline comments and step‑by‑step checks that produced a clean file. That willingness to throw away the brittle thing and rebuild with clarity and documentation gave me early success. The meta‑skills (the ability to learn, take chances, and figure out the best path) still apply, but the technology is completely different now.
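The pattern Lauren describes, a pipeline of documented queries with an explicit validation check between each step, is still a good default. Here is a minimal sketch of that idea in Python with SQLite; the table, columns, and the 5% commission rate are all hypothetical, not details from the episode.

```python
import sqlite3

# Hypothetical compensation pipeline: each step is a documented query,
# followed by an explicit check before the next step runs.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (rep_id INTEGER, amount REAL);
    INSERT INTO sales VALUES (1, 100.0), (1, 250.0), (2, 400.0);
""")

# Step 1: aggregate sales per rep.
conn.execute("""
    CREATE TABLE rep_totals AS
    SELECT rep_id, SUM(amount) AS total_sales
    FROM sales
    GROUP BY rep_id
""")

# Check 1: no rep should be lost in the aggregation.
src = conn.execute("SELECT COUNT(DISTINCT rep_id) FROM sales").fetchone()[0]
agg = conn.execute("SELECT COUNT(*) FROM rep_totals").fetchone()[0]
assert src == agg, f"row-count check failed: {src} reps in, {agg} out"

# Step 2: apply a (hypothetical) 5% commission rate to build the payout file.
payouts = conn.execute("""
    SELECT rep_id, ROUND(total_sales * 0.05, 2) AS payout
    FROM rep_totals
    ORDER BY rep_id
""").fetchall()

# Check 2: no missing or negative payouts before anything goes to payroll.
assert all(p is not None and p >= 0 for _, p in payouts)
print(payouts)  # [(1, 17.5), (2, 20.0)]
```

The point is less the SQL than the discipline: every transformation is followed by an assertion, so a failure surfaces at the broken step instead of as a wrong number on a paycheck.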

You’ve split time at Okta into two stints. How would you characterize the work?

Okta was my first truly B2B company. I realized quickly B2B data is my sweet spot. I love thinking about customers as businesses and how business users interact with our products and features. Okta data is complex—many products, features, and highly configurable use cases—especially with large customers. That variety is exciting. In simpler retail flows you see a lot of the same patterns; in B2B, the variety is the appeal.

What’s your current role?

I lead our enterprise data platform, engineering, and architecture function. For enterprise data used to make business decisions, we own ingestion into the warehouse, transformations, and delivery—dashboards, reverse ETL to third‑party applications, other data stores, and internal apps.

How big is the central function and how do you engage with the business?

We’re about 50 people across data engineering and analytics/data science in a company south of 7,000 employees. We support every business unit. Engagement spans a maturity curve. One end is platform self‑service: teams land data via approved connectors, build transformations in dbt on our implementation, and build dashboards in Tableau we administer. Governance and roles are defined centrally, and teams assign people to those roles. The other end is a white‑glove model where we partner through the full lifecycle—question, discover existing assets, requirements, data work, build, interpretation, validation, and end‑of‑life of the data product. Our sweet spot is the middle: we own enterprise “gold” pipelines for company‑level metrics—monitored and governed—while domains build and later graduate via a path‑to‑production under stronger governance.

Okta is known for identity and security. How does security‑first actually work in practice?

Reinventing controls every time slows you down. We invest in repeatable frameworks. Any new source goes through third‑party risk review, classification, and decisions on masking or exclusions. We help teams through that; after a couple times, they can engage directly with risk while we stay in the loop and monitor. As our classifications and expectations got clearer, review cycles shrank from weeks to days. It’s not all roses—it takes time—but we all operate as security practitioners. That shared mindset builds trust and reduces corner‑cutting.

How much do users need to know?

We don’t expect everyone to know everything. We provide dbt frameworks and minimum testing standards, plus SMEs to guide teams. The culture is to ask when unsure.
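In dbt, a "minimum testing standard" like the one Lauren mentions is usually expressed as generic tests declared alongside the model. This fragment is purely illustrative (the model and column names are hypothetical, not Okta's actual configuration):

```yaml
# models/schema.yml -- hypothetical model; names are illustrative
version: 2

models:
  - name: dim_customer
    description: "One row per customer account"
    columns:
      - name: customer_id
        description: "Primary key"
        tests:
          - unique
          - not_null
      - name: account_status
        tests:
          - accepted_values:
              values: ['active', 'churned', 'trial']
```

A central team can require a baseline like this (primary-key uniqueness, non-null keys, accepted values on status fields) before a domain model graduates to production.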

Will agents write more analytical queries than humans in the next 12–24 months?

Macro, yes. For us, more like 24–36 months because we’re careful. The key is safe, ethical AI consistent with being a security company.

How are you thinking about agent access?

Central governance. Ideally, agents query centralized, agent‑ready stores. Run governance once: policies, roles for users and for data, tracking and logging on a central plane. The semantic layer is essential. Creating semantic views must get easier and more automated, and semantics should inform policy application.

Why are agents different from humans in access patterns?

Row‑level security to the extreme. Conversational intelligence data should be limited to what the requesting user can access. Aggregations could be broadly accessible with anonymization, but detailed content should remain constrained. You might also limit allowed functions on large unstructured objects. Identity for agents matters—Okta Secures AI looks at distinct identity patterns to secure agents across applications.
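One way to read "row-level security to the extreme" is that an agent never queries with its own broad service credentials; it always applies a filter derived from the requesting user's entitlements. A minimal sketch of that gate, with hypothetical entitlement data and table names:

```python
import sqlite3

# Hypothetical entitlements: which org each user is allowed to see.
USER_ORG = {"alice": "org_a", "bob": "org_b"}

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE conversations (org TEXT, summary TEXT);
    INSERT INTO conversations VALUES
        ('org_a', 'renewal call'), ('org_b', 'support escalation');
""")

def agent_query(requesting_user: str):
    """The agent inherits the requesting user's row-level scope
    instead of querying with its own (broader) service identity."""
    org = USER_ORG.get(requesting_user)
    if org is None:
        raise PermissionError(f"no entitlements for {requesting_user}")
    # The user's scope is bound as a parameter on every query the
    # agent runs, so detailed content stays constrained per user.
    rows = conn.execute(
        "SELECT summary FROM conversations WHERE org = ?", (org,)
    ).fetchall()
    return [summary for (summary,) in rows]

print(agent_query("alice"))  # ['renewal call']
```

In a warehouse this would typically be enforced by native row-access policies rather than application code, but the principle is the same: the policy is evaluated against the human behind the agent, not the agent itself.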

Where are you with MCP and agent building?

Early, building support and insight use cases. Progress is fast, but nothing broad in production yet.

How should analytics engineers and data engineers participate?

Analytics engineers should own semantics—tooling, vendor choices, onboarding use cases, and the shared business language. Data engineers should optimize for consistency and scale, notice overlap across agents, and provide a platform others can build on with confidence in governance and security.

Will you standardize an agent development platform?

Yes, in partnership with engineering and shared services. Our current pull skews to the business, so we’re leaning toward accessible, governed platforms that serve both business and engineering with central governance.

Any assumptions you’re rethinking?

Treating everything like a relational model. Many initial agent questions are intentionally simple, where speed and reasonable accuracy trump perfect sophistication. The important thing is to start, observe, and mature.

Chapters

00:02:28 — From bank analytics to owning a sales DW

00:05:00 — Rebuilding brittle Access → SQL with documented checks

00:08:30 — Ops accountability then vs. optimization today

00:11:00 — TripIt, marketing analytics, and moving into tech

00:13:14 — Why B2B data became Lauren’s sweet spot

00:16:00 — Current role: ingestion → transform → delivery at Okta

00:18:10 — Operating models across business units and the path to production

00:22:20 — Security-first in practice: repeatable frameworks over friction

00:24:23 — Third‑party risk, classification, and shrinking review cycles

00:28:00 — Policies, masking, and the need for a central governance plane

00:30:20 — Frameworks for dbt, testing, and SME guidance

00:32:11 — Will agents outwrite humans? Macro yes; Okta timeline nuance

00:33:48 — Central governance and agent access patterns

00:37:19 — Semantic layer as bridge and policy carrier

00:41:00 — Function limits on unstructured data and Okta Secures AI

00:42:35 — Early MCP experimentation and support use cases

00:43:03 — Roles: analytics engineers (semantics) and data engineers (scale)

00:46:10 — Enabling an org-wide agent platform with shared governance

00:47:43 — Solve governance once, serve business and engineering

00:49:30 — Simpler questions first; rethinking relational assumptions
