From Informatica to dbt: A migration path to an AI-ready data control plane

on Oct 09, 2025
Legacy ETL tools like Informatica served their purpose for decades. The demands of AI, however, are making the limitations of this legacy tooling impossible to ignore.
The old era of ETL tooling—from Oracle and Informatica to products like IBM DataStage and Matillion—did a good job of solving yesterday’s problems. But these tools were born in a different time, when storage was expensive and everything was locked in on-premises data centers. They operated on timelines of weeks or quarters.
Today’s reality—one that involves cloud data warehousing, elastic compute, domain team ownership, and AI initiatives—is drastically different. The companies that win now ship solutions in hours or days, not weeks.
Legacy ETL tools can’t deliver this. And the technical debt inherent in their existing data pipelines is dragging your business down.
Solving this requires more than just swapping out tools. It requires rethinking how data teams deliver value and how everyone—not just engineers—works with data. It also requires a carefully considered and executed migration strategy to succeed.
In this article, we’ll dig into what that means, how some companies have tried solving this problem, and how moving to an AI-ready data control plane positions you to compete in today’s modern business landscape.
Why legacy ETL tools can't keep pace with modern demands
The data landscape has transformed dramatically over the past decade. Legacy pipelines and stored procedures that once powered analytics workflows now represent more than technical debt. They actively slow organizational change, obscure risk, and inflate total cost of ownership (TCO).
When business logic is buried in black-box ETL jobs, every enhancement becomes a risk. Every incident takes longer to debug. Every audit becomes a complex undertaking.
The teams that succeed today are those that can ship insights in hours or days rather than weeks or quarters. They’re the teams that build trust through transparent and testable code.
The architectural center of gravity has shifted from heavy standalone ETL tools to in-database transformation. Cloud data warehouses provide elastic compute, data products are owned by domain teams across the organization, and AI initiatives demand governed, explainable, high-quality data.
The tools and requirements have both evolved. But the requirements have changed even more than the tools have. Boards and regulators now ask where numbers come from and demand full lineage and testing. Business units run weekly experiments and expect data to move at that cadence. Costs must map to ROI with precision.
Business participation in data workflows is no longer optional. Data teams aren't order-taking factories. They’re enablers helping stakeholders build and act safely.
The hidden cost of tool sprawl
As organizations grow, different teams inevitably adopt different tools, each working in isolation:
- Architects might use WhereScape
- Data engineers work in Informatica or Matillion
- Analytics engineers prefer Alteryx or Talend
- BI developers rely on Tableau or Power BI
- Analysts still turn to Excel
Each tool has its own workflow, terminology, and implementation patterns. And the result is, predictably, chaos:
- The same key performance indicator gets implemented five different ways across the organization
- There's no single place to review logic, no single source of truth
- Trust erodes as teams argue over which monthly revenue number is correct—and according to their own calculations, they're all right
The modernization opportunity
However, this fragmentation also represents a critical opportunity. Modernization isn't just about swapping tools. It's about redesigning how teams deliver value and how the business participates in data workflows.
Organizations that successfully modernize unlock measurable advantages across four key dimensions.
Time to value. Time to value accelerates dramatically when work becomes modular, tested, and reviewed. Projects that once took quarters can be completed in weeks.
Cost reduction. Cost and complexity decrease substantially—many organizations see 50 to 80 percent lower transformation costs by consolidating on in-database processing, using compute efficiently, and eliminating duplicate workflows running the same calculations across different tools.
Resilient operations. Operations become more resilient as incident numbers drop and recovery happens faster. When lineage, tests, and logs show exactly what changed, when it changed, and where it changed, troubleshooting transforms from guesswork to precision.
Strategic reinvestment. Perhaps most importantly, the savings from modernization fund growth initiatives like AI.
AI only works when data is high-quality, governed, documented, and explainable. Trusted data models result in safer and more accurate retrieval-augmented generation (RAG) supplementation, AI copilots that understand your data based on your DAG and your data test suites, and AI agents that can take action because the underlying semantics and policies for your data are codified and clear.
Why dbt has become the modernization standard
dbt predates the AI boom. We saw the need years ago to move from traditional ETL systems like Informatica to a more modern approach to data transformation.
dbt does this by bringing software engineering best practices to analytics. It enables companies to implement an Analytics Development Lifecycle (ADLC), similar to the software development lifecycle (SDLC), where data producers, analysts, and stakeholders collaborate iteratively on building and shipping high-quality data pipelines.
With dbt, teams build robust transformations from modular SQL or Python models, written in a cross-vendor syntax that transforms data wherever it lives in your organization. Data engineers manage production pipelines with built-in continuous integration and deployment (CI/CD), testing, and governance to ensure ongoing data quality and safety. The dbt Fusion engine enables context-aware development that helps teams deliver data faster and with higher quality.
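To make that concrete, here is a minimal sketch of a modular dbt model. The model and column names (stg_orders, stg_payments, fct_orders) are illustrative, but the pattern of composing models with ref(), which lets dbt infer the DAG, run models in dependency order, and track lineage automatically, is standard dbt:

```sql
-- models/marts/fct_orders.sql (model and column names are illustrative)
-- ref() resolves upstream models, so dbt builds the dependency graph for you.

with orders as (
    select * from {{ ref('stg_orders') }}
),

payments as (
    select * from {{ ref('stg_payments') }}
)

select
    orders.order_id,
    orders.customer_id,
    orders.order_date,
    sum(payments.amount) as order_total
from orders
left join payments
    on orders.order_id = payments.order_id
group by 1, 2, 3
```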
For data engineers transitioning from Informatica, this represents a fundamental unlock:
- Teams gain shared context, so downstream partners aren't blocking progress
- Engineers spend less time refactoring or rewriting logic from fragmented tools
- High-performing data teams build like software teams: modular code, test-driven development, governance, and iteration built into every workflow
dbt is different because, unlike proprietary ETL systems, it treats data like code. That includes version control, CI/CD, tests shipped alongside pipelines rather than maintained separately, and automatically generated documentation.
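For example, tests and documentation live in a YAML file next to the model itself and travel through the same version control and CI/CD workflow, so checks run on every pull request before changes reach production. The model and column names below are illustrative:

```yaml
# models/marts/fct_orders.yml (names are illustrative)
version: 2

models:
  - name: fct_orders
    description: "One row per order with the total amount paid."
    columns:
      - name: order_id
        description: "Primary key for the order."
        tests:
          - unique
          - not_null
      - name: customer_id
        tests:
          - not_null
          - relationships:
              to: ref('stg_customers')
              field: customer_id
```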
The result is more innovation with the confidence to ship often, because guardrails catch issues early. Trust isn't an afterthought—it's engineered into every step.
dbt: A data control plane that defeats tool sprawl
dbt isn’t a new kind of ETL system. Rather, it’s a fundamentally different way of managing data: one that is flexible, cross-platform, collaborative, and focused on producing trustworthy outputs.
Think of dbt as a one-stop data control plane for your data. Capabilities that once required gluing together multiple tools—orchestration, observability, cost management, catalog, and semantic layer—are integrated in a single platform with deeply connected metadata that drives results.
dbt is the data control plane for everybody, not just data engineers. It integrates with the most popular BI tools and AI systems, enabling easy analysis. The built-in dbt Semantic Layer centralizes metric definitions using common business language, ensuring both consistency across the organization and accuracy for AI-derived answers.
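As a sketch of what that looks like in practice (model and measure names are illustrative, and exact syntax can vary by dbt version, so consult the Semantic Layer documentation), a metric such as revenue can be defined once in YAML and then queried consistently from any connected BI tool or AI agent:

```yaml
# models/marts/orders_metrics.yml (a sketch; names are illustrative)
semantic_models:
  - name: orders
    model: ref('fct_orders')
    defaults:
      agg_time_dimension: order_date
    entities:
      - name: order_id
        type: primary
    dimensions:
      - name: order_date
        type: time
        type_params:
          time_granularity: day
    measures:
      - name: order_total
        agg: sum

metrics:
  - name: revenue
    label: "Revenue"
    description: "Total order value; granularity (e.g., monthly) is applied at query time."
    type: simple
    type_params:
      measure: order_total
```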
To encourage data democratization, dbt supports multiple tools for finding and transforming data. These include dbt Studio and Visual Studio Code for developers, dbt Canvas for analysts and tech-savvy stakeholders, and dbt Catalog for decision-makers who need to find the right data fast.
Finally, dbt uses AI to build for AI. dbt Copilot makes working with data easier, from constructing complex SQL for data models to helping stakeholders analyze datasets with natural language queries.
De-risking the migration path
That brings us to the million-dollar question: how do you migrate from traditional ETL systems to a modern solution like dbt?
Migration has a reputation. And it…isn’t great. When you hear “migration,” you likely think of drawn-out projects that run over schedule and over budget. If they finish at all.
Fortunately, these days, your data teams no longer need to disappear for years to make the jump to a new data platform. dbt and partners such as Infinite Lambda have developed proven playbooks for iterative migrations that deliver value quickly.
Modern technology also gives you more migration tooling than ever before. AI-assisted migration tooling accelerates pattern discovery and code translation. And because dbt is built on SQL, you can reuse valuable SQL and business logic rather than discarding it, so institutional knowledge becomes visible and testable rather than lost.
It’s also easier to find the resources you need. Because dbt is an industry standard, organizations tap into a large talent pool, making hiring and scaling easier. That all spells less risk, faster wins, and a runway to transformative AI projects.
Choosing the right migration approach
Three primary migration approaches exist, each with distinct tradeoffs:
Lift-and-shift. A lift-and-shift code migration attempts to translate code with automation tools.
This offers perceived speed and simplicity with relatively low risk, since business logic doesn't change. However, this approach can't take full advantage of the benefits of the cloud or reduce existing technical debt. Perceived speed can be misleading once you factor in validation time.
Full rewrite. At the opposite end, a full rewrite creates truly cloud-native solutions but requires enormous investment of time, money, and effort.
For anyone who’s attempted a complete platform rewrite, alarm bells ring immediately. Completely rewriting logic requires understanding code sometimes written over decades by people no longer with the organization. That makes the risk here substantial.
Replatforming and refactoring. This is the middle path and often the most viable. This lifts and shifts data and logic, then incorporates best practices and looks for improvements. A human-led approach uses code automation for straightforward translations while investing time in thoughtful refactoring, whether AI-assisted or purely human-powered.
The critical importance of assessing the three Vs
Before attempting any migration, a comprehensive assessment is essential. This means understanding exactly what exists in legacy pipelines—i.e., taking skeletons out of the closet.
Teams examine data sources and destinations, analyze pipeline complexity, and score each pipeline with complexity points to estimate the human effort required. Converting a representative pipeline helps translate complexity scores into real human terms.
Understanding the three Vs—volume, velocity, and variety—enables appropriate sizing of the new data architecture and accurate cost profiling. Identifying optimization opportunities is critical. These include finding code duplication where dbt macros can substantially reduce the application footprint, providing long-term maintenance benefits.
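For instance, if the assessment finds the same conversion logic re-implemented across dozens of legacy mappings, it can be consolidated into a single, tested dbt macro. The macro and column names below are hypothetical:

```sql
-- macros/cents_to_dollars.sql (hypothetical macro consolidating logic
-- that was duplicated across many legacy mappings)
{% macro cents_to_dollars(column_name, precision=2) %}
    round({{ column_name }} / 100.0, {{ precision }})
{% endmacro %}

-- Any model can then reuse the macro instead of re-implementing the logic:
-- select
--     order_id,
--     {{ cents_to_dollars('amount_cents') }} as amount_usd
-- from {{ ref('stg_payments') }}
```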
Flowline + dbt for successful migration
Even with the best planning, however, success isn’t guaranteed. By some industry estimates, only one in three cloud migrations succeeds.
Recognizing this, Infinite Lambda developed Flowline. Flowline is a packaged software and service solution designed to help enterprises modernize legacy infrastructure from Talend, Informatica, and SQL Server SSIS to an AI-ready platform in weeks rather than months or years.
Flowline's approach centers on a human-led methodology heavily assisted by automation and AI, distinguishing it from both purely manual migrations and fully automated approaches. The solution follows a four-step process that balances speed with quality and reduces risk at every stage:
- Deterministic code conversion
Flowline extracts legacy pipelines and converts them to dbt using deterministic code (notably, not AI). This deliberate choice stems from practical considerations: deterministic code is cheaper to run and produces identical results every time, allowing the system to bake in best practices consistently. This step achieves approximately 95 percent automatic code conversion, representing a 10x improvement over manual rewrites.
- Validation and reconciliation
This is arguably the most critical phase. Flowline compares the resulting dbt models against production-quality data through an automation-assisted, human-led process.
Some organizations underestimate this phase. Infinite Lambda has learned through years of experience that conversion is the easy part. It’s testing, validating data, understanding differences, and securing stakeholder sign-off that consume the most time.
At the end of this step, pipelines are nearly 100 percent reconciled. This phase accounts for any differences introduced by moving to new data warehouse platforms.
- Refactoring
With a stable baseline established, Flowline applies proprietary AI with human oversight to refactor code for improved performance, reduced code size, and lower costs. This refactoring happens confidently because the previous step has already validated correctness.
- Onboarding
The process concludes with rapid onboarding that trains both technology teams and business users while providing comprehensive change management to ensure complete adoption rather than leaving a migrated platform isolated.
Real-world results from modernization
Multiple companies have used Flowline to streamline their migration process from Informatica to dbt. The result is substantial cost savings with better data quality.
Macif, a French insurer, cut licensing costs substantially compared with their traditional ETL systems and dramatically improved operational efficiency. Operations teams now go home on time because pipelines run much faster. One pipeline’s runtime dropped from over two hours to under five minutes through migration and refactoring.
AstraZeneca is positioned to save USD 40 million in total cost of ownership across personnel costs and licensing fees. Their AI-ready platform now powers better data experiences and helps deliver on their core mission of discovering new drugs.
Addressing common migration concerns
Three questions frequently arise when initiating migration conversations:
Isn't migration just a rewrite?
Moving to dbt is more than refactoring SQL code from one product to another. The complete platform acts as a data control plane leveraged across teams, consolidating existing tooling and enabling safe, governed data consumption across BI tools and AI systems.
Will we just get locked in again?
Platform flexibility is essential. dbt is designed to avoid lock-in.
The tool sprawl era scattered data across multiple vendors, with analytics logic duplicated across tools. This led to tight coupling and high switching costs.
The SQL standardization era consolidated analytics in dbt and SQL with centralized transformation logic. While this created scalable workflows, it locked organizations into specific cloud providers.
Today, over half of dbt customers work across multiple data platforms. 81 percent of enterprises use more than one cloud provider. With native hosting, organizations use dbt on their cloud provider of choice—AWS, Azure, or GCP—running close to their data without compromising governance, security, or latency. Support for Apache Iceberg and catalog integrations for select adapters provides flexibility across compute engines.
This isn't about checking boxes—it's about giving teams choice without penalty.
What about ROI?
dbt drives value across tooling and maintenance costs, efficiency gains, and direct business benefits. Organizations reduce current tooling costs, achieve cloud platform savings, and improve operational efficiency by freeing engineer and analyst time for core business work. That accelerates innovation around your company’s strategic objectives.
Moving forward
Modernization from legacy ETL tools to an AI-ready data control plane represents more than a technology upgrade. It’s a shift in the way your company interacts with data.
That leap forward can be daunting. Even frightening. However, with proven migration approaches, comprehensive assessments, and platforms purpose-built for modern analytics workflows, the path forward is clearer than ever.
The journey begins with understanding where you are today. To get started, take the Infinite Lambda online migration readiness assessment and plan out the first steps of your journey to AI-ready data.