Latent Space: The AI Engineer Podcast

[State of AI Startups] Memory/Learning, RL Envs & DBT-Fivetran — Sarah Catanzaro, Amplify



Sarah Catanzaro invested through the modern data stack era (DBT, Fivetran, and the analytics explosion) and now invests at the frontier of AI infrastructure and applications at Amplify Partners. She has spent years at the intersection of data, compute, and intelligence, watching categories emerge, merge, and occasionally disappoint.

We caught up with Sarah live at NeurIPS 2025 to dig into the state of AI startups heading into 2026. Topics include why $100M+ seed rounds with no near-term roadmap are now the norm (and why that terrifies her); what the DBT-Fivetran merger really signals about the modern data stack (spoiler: it's not dead, just ready for IPO); how frontier labs are using DBT and Fivetran to manage training data and agent analytics at scale; why data catalogs failed as standalone products but might succeed as metadata services for agents; the consumerization of AI and why personalization (memory, continual learning, K-factor) is the 2026 unlock for retention and growth; why she thinks RL environments are a fad and real-world logs beat synthetic clones every time; and her thesis for the most exciting AI startups: companies that marry hard research problems (RAG, rule-following, continual learning) with killer applications that were simply impossible before.

We discuss:

  • The DBT-Fivetran merger: not the death of the modern data stack, but a path to IPO scale (targeting $600M+ combined revenue) and a signal that both companies were already winning their categories

  • How frontier labs use data infrastructure: DBT and Fivetran for training data curation, agent analytics, and managing increasingly complex interactions—plus the rise of transactional databases (RocksDB) and efficient data loading (Vortex) for GPU-bound workloads

  • Why data catalogs failed: built for humans when they should have been built for machines, focused on discoverability when the real opportunity was governance, and ultimately subsumed as features inside Snowflake, DBT, and Fivetran

  • The $100M+ seed phenomenon: raising massive rounds at billion-dollar valuations with no 6-month roadmap, seven-day decision windows, and founders optimizing for signal ("we're a unicorn") over partnership or dilution discipline

  • Why world models are overhyped but underspecified: three competing definitions, unclear generalization across use cases (video games ≠ robotics ≠ autonomous driving), and a research problem masquerading as a product category

  • The 2026 theme: consumerization of AI via personalization, meaning memory management, continual learning, and solving retention/churn by making products learn skills and preferences and adapt as the world changes (not just storing facts in Cursor rules)

  • Why RL environments are a fad: labs are paying 7–8 figures for synthetic clones when real-world logs, traces, and user activity (à la Cursor) are richer, cheaper, and more generalizable

  • Sarah's investment thesis: research-driven applications that solve hard technical problems (RAG for Harvey, rule-following for Sierra, continual learning for the next killer app) and unlock experiences that were impossible before

  • Infrastructure bets: memory, continual learning, stateful inference, and the systems challenges of loading/unloading personalized weights at scale

  • Why K-factor and growth fundamentals matter again: AI felt magical in 2023–2024, but as the magic fades, retention and virality are back—and most AI founders have never heard of K-factor
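
For listeners unfamiliar with the term, K-factor (the viral coefficient) is commonly defined as the average number of invites each user sends multiplied by the fraction of invites that convert into new users. The sketch below is a minimal illustration of that standard definition, not something taken from the episode; the function names, the no-churn assumption, and the example numbers are all hypothetical.

```python
def k_factor(invites_per_user: float, invite_conversion_rate: float) -> float:
    """Viral coefficient: new users brought in by each existing user."""
    return invites_per_user * invite_conversion_rate


def project_users(seed_users: int, k: float, cycles: int) -> list[float]:
    """Cumulative users after each viral cycle.

    Simplifying assumptions (hypothetical): only the newest cohort
    sends invites, and nobody churns.
    """
    totals = [float(seed_users)]
    new_users = float(seed_users)
    for _ in range(cycles):
        new_users *= k                       # users recruited by the latest cohort
        totals.append(totals[-1] + new_users)
    return totals


if __name__ == "__main__":
    k = k_factor(invites_per_user=5, invite_conversion_rate=0.1)
    print(k)
    print(project_users(seed_users=1000, k=k, cycles=3))
```

With K below 1, each cohort recruits a smaller cohort than itself, so growth from virality alone flattens out; with K above 1, each cohort recruits a larger one and growth compounds.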

Sarah Catanzaro

  • X: https://x.com/sarahcat21

  • Amplify Partners: https://amplifypartners.com/

Where to find Latent Space

  • X: https://x.com/latentspacepod

  • Substack: https://www.latent.space/

Chapters
  • 00:00:00 Introduction: Sarah Catanzaro's Journey from Data to AI
  • 00:01:02 The DBT-Fivetran Merger: Not the End of the Modern Data Stack
  • 00:05:26 Data Catalogs and What Went Wrong
  • 00:08:16 Data Infrastructure at AI Labs: Surprising Insights
  • 00:10:13 The Crazy Funding Environment of 2024-2025
  • 00:17:18 World Models: Hype, Confusion, and Market Potential
  • 00:18:59 Memory Management and Continual Learning: The Next Frontier
  • 00:23:27 Agent Environments: Just a Fad?
  • 00:25:48 The Perfect AI Startup: Research Meets Application
  • 00:28:02 Closing Thoughts and Where to Find Sarah


Latent Space: The AI Engineer Podcast, by swyx + Alessio

4.6

92 ratings

