The MAD Podcast with Matt Turck

Top AI Researcher on GPT 4.5, DeepSeek and Agentic RAG | Douwe Kiela, CEO, Contextual AI


Retrieval-Augmented Generation (RAG) has become a dominant architecture in modern AI deployments. In this episode, we sit down with Douwe Kiela, who co-authored the original RAG paper in 2020. Douwe is now the founder and CEO of Contextual AI, a startup focused on helping enterprises deploy RAG as an agentic system.


We start the conversation with Douwe's thoughts on the latest advancements in generative AI, including GPT-4.5, DeepSeek and the exciting paradigm shift towards test-time compute, as well as the US-China rivalry in AI.


We then dive into RAG: its definition, origin story and core architecture. Douwe explains the evolution of RAG into RAG 2.0 and Agentic RAG, emphasizing the importance of self-learning systems over individual models and the role of synthetic data. We close with the challenges and opportunities of deploying AI in real-world enterprise settings, including how enterprises weigh their need for accuracy against the inherent inaccuracies of AI systems.



Contextual AI

Website - https://contextual.ai

X/Twitter - https://x.com/ContextualAI


Douwe Kiela

LinkedIn - https://www.linkedin.com/in/douwekiela

X/Twitter - https://x.com/douwekiela


FIRSTMARK

Website - https://firstmark.com

X/Twitter - https://twitter.com/FirstMarkCap


Matt Turck (Managing Director)

LinkedIn - https://www.linkedin.com/in/turck/

X/Twitter - https://twitter.com/mattturck


(00:00) Intro

(01:57) Thoughts on the latest AI models: GPT-4.5, Sonnet 3.7, Grok 3

(04:50) The test-time compute paradigm shift

(06:47) Unsupervised learning vs reasoning: a false dichotomy

(07:30) The significance of DeepSeek

(10:29) USA vs. China: is the AI war overblown?

(12:19) Controlling AI hallucinations at the model level

(13:51) RAG: definition and origin story

(18:46) Why the Transformers paper initially felt underwhelming

(20:41) The core architecture of RAG

(26:06) RAG vs. fine-tuning vs. long context windows

(30:53) RAG 2.0: Thinking in systems and not models

(31:28) Data extraction and data curation for RAG

(35:59) Contextual Language Models (CLMs)

(38:04) Fine-tuning and alignment techniques: GRIT, KTO, LENS

(40:40) Agentic RAG

(41:36) General vs. specialized RAG agents

(44:35) Synthetic data in AI

(45:51) Deploying AI in the enterprise

(48:07) How tolerant are enterprises to AI hallucinations?

(49:35) The future of Contextual AI
