Data Engineering Podcast

The True Costs of Legacy Systems: Technical Debt, Risk, and Exit Strategies


Listen Later

Summary
In this episode Kate Shaw, Senior Product Manager for Data and SLIM at SnapLogic, talks about the hidden and compounding costs of maintaining legacy systems—and practical strategies for modernization. She unpacks how “legacy” is less about age and more about when a system becomes a risk: blocking innovation, consuming excess IT time, and creating opportunity costs. Kate explores technical debt, vendor lock-in, lost context from employee turnover, and the slippery notion of “if it ain’t broke,” especially when data correctness and lineage are unclear. Shee digs into governance, observability, and data quality as foundations for trustworthy analytics and AI, and why exit strategies for system retirement should be planned from day one. The discussion covers composable architectures to avoid monoliths and big-bang migrations, how to bridge valuable systems into AI initiatives without lock-in, and why clear success criteria matter for AI projects. Kate shares lessons from the field on discovery, documentation gaps, parallel run strategies, and using integration as the connective tissue to unlock data for modern, cloud-native and AI-enabled use cases. She closes with guidance on planning migrations, defining measurable outcomes, ensuring lineage and compliance, and building for swap-ability so teams can evolve systems incrementally instead of living with a “bowl of spaghetti.”

Announcements
  • Hello and welcome to the Data Engineering Podcast, the show about modern data management
  • Data teams everywhere face the same problem: they're forcing ML models, streaming data, and real-time processing through orchestration tools built for simple ETL. The result? Inflexible infrastructure that can't adapt to different workloads. That's why Cash App and Cisco rely on Prefect. Cash App's fraud detection team got what they needed - flexible compute options, isolated environments for custom packages, and seamless data exchange between workflows. Each model runs on the right infrastructure, whether that's high-memory machines or distributed compute. Orchestration is the foundation that determines whether your data team ships or struggles. ETL, ML model training, AI Engineering, Streaming - Prefect runs it all from ingestion to activation in one platform. Whoop and 1Password also trust Prefect for their data operations. If these industry leaders use Prefect for critical workflows, see what it can do for you at dataengineeringpodcast.com/prefect.
  • Data migrations are brutal. They drag on for months—sometimes years—burning through resources and crushing team morale. Datafold's AI-powered Migration Agent changes all that. Their unique combination of AI code translation and automated data validation has helped companies complete migrations up to 10 times faster than manual approaches. And they're so confident in their solution, they'll actually guarantee your timeline in writing. Ready to turn your year-long migration into weeks? Visit dataengineeringpodcast.com/datafold today for the details.
  • Your host is Tobias Macey and today I'm interviewing Kate Shaw about the true costs of maintaining legacy systems
Interview
  • Introduction
  • How did you get involved in the area of data management?
  • What are your crtieria for when a given system or service transitions to being "legacy"?
  • In order for any service to survive long enough to become "legacy" it must be serving its purpose and providing value. What are the common factors that prompt teams to deprecate or migrate systems?
  • What are the sources of monetary cost related to maintaining legacy systems while they remain operational?
  • Beyond monetary cost, economics also have a concept of "opportunity cost". What are some of the ways that manifests in data teams who are maintaining or migrating from legacy systems?
    • How does that loss of productivity impact the broader organization?
  • How does the process of migration contribute to issues around data accuracy, reliability, etc. as well as contributing to potential compromises of security and compliance?
  • Once a system has been replaced, it needs to be retired. What are some of the costs associated with removing a system from service?
  • What are the most interesting, innovative, or unexpected ways that you have seen teams address the costs of legacy systems and their retirement?
  • What are the most interesting, unexpected, or challenging lessons that you have learned while working on legacy systems migration?
  • When is deprecation/migration the wrong choice?
  • How have evolutionary architecture patterns helped to mitigate the costs of system retirement?
Contact Info
  • LinkedIn
Parting Question
  • From your perspective, what is the biggest gap in the tooling or technology for data management today?
Closing Announcements
  • Thank you for listening! Don't forget to check out our other shows. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used. The AI Engineering Podcast is your guide to the fast-moving world of building AI systems.
  • Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
  • If you've learned something or tried out a project from the show then tell us about it! Email [email protected] with your story.
Links
  • SnapLogic
  • SLIM == SnapLogic Intelligent Modernizer
  • Opportunity Cost
  • Sunk Cost Fallacy
  • Data Governance
  • Evolutionary Architecture
The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA
...more
View all episodesView all episodes
Download on the App Store

Data Engineering PodcastBy Tobias Macey

  • 4.5
  • 4.5
  • 4.5
  • 4.5
  • 4.5

4.5

140 ratings


More shows like Data Engineering Podcast

View all
Software Engineering Radio by se-radio@computer.org

Software Engineering Radio

271 Listeners

The Changelog: Software Development, Open Source by Changelog Media

The Changelog: Software Development, Open Source

291 Listeners

Software Engineering Daily by Software Engineering Daily

Software Engineering Daily

624 Listeners

The Cloudcast by Massive Studios

The Cloudcast

155 Listeners

Talk Python To Me by Michael Kennedy

Talk Python To Me

588 Listeners

Thoughtworks Technology Podcast by Thoughtworks

Thoughtworks Technology Podcast

41 Listeners

Super Data Science: ML & AI Podcast with Jon Krohn by Jon Krohn

Super Data Science: ML & AI Podcast with Jon Krohn

301 Listeners

Python Bytes by Michael Kennedy and Brian Okken

Python Bytes

214 Listeners

Syntax - Tasty Web Development Treats by Wes Bos & Scott Tolinski - Full Stack JavaScript Web Developers

Syntax - Tasty Web Development Treats

984 Listeners

DataFramed by DataCamp

DataFramed

268 Listeners

Practical AI by Practical AI LLC

Practical AI

211 Listeners

AWS Podcast by Amazon Web Services

AWS Podcast

203 Listeners

The Stack Overflow Podcast by The Stack Overflow Podcast

The Stack Overflow Podcast

62 Listeners

The Real Python Podcast by Real Python

The Real Python Podcast

142 Listeners

Latent Space: The AI Engineer Podcast by swyx + Alessio

Latent Space: The AI Engineer Podcast

96 Listeners