Inference by Turing Post

Beyond the Hype: What Silicon Valley Gets Wrong About RAG. Amr Awadallah, founder & CEO of Vectara


In this episode of Inference, I sit down with Amr Awadallah – founder & CEO of Vectara, founder of Cloudera, ex-Google Cloud, and the original builder of Yahoo’s data platform – to unpack what’s actually happening with retrieval-augmented generation (RAG) in 2025.

We get into why RAG is far from dead, how context windows mislead more than they help, and what it really takes to separate reasoning from memory. Amr breaks down the case for retrieval with access control, the rise of hallucination detection models, and why DIY RAG stacks fall apart in production.

We also talk about the roots of RAG, Amr’s take on AGI timelines, and what science fiction taught him about the future.

If you care about truth in AI, or you're building with (or around) LLMs, this one will reshape how you think about trustworthy systems.

Did you like the episode? You know the drill:

 📌 Subscribe for more conversations with the builders shaping real-world AI.

 💬 Leave a comment if this resonated.

 👍 Like it if you liked it.

 🫶 Thank you for watching and sharing!

Guest:

Amr Awadallah, Founder and CEO at Vectara

https://www.linkedin.com/in/awadallah/

https://x.com/awadallah

https://www.vectara.com/

📰 Want the transcript and edited version?

Subscribe to Turing Post: https://www.turingpost.com/subscribe

Chapters

00:00 – Intro

00:44 – Why RAG isn’t dead (despite big context windows)

01:59 – Memory vs reasoning: the case for retrieval

02:45 – Retrieval + access control = trusted AI

06:51 – Why DIY RAG stacks fail in production

09:46 – Hallucination detection and guardian agents

13:14 – Open-source strategy behind Vectara

16:08 – Who really invented RAG?

17:30 – Can hallucinations ever go away?

20:27 – What AGI means to Amr

22:09 – Books that shaped his thinking

Turing Post is a newsletter about AI's past, present, and future. Publisher Ksenia Se explores how intelligent systems are built – and how they’re changing how we think, work, and live.

Sign up (Jensen Huang is already in): https://www.turingpost.com

Things mentioned during the interview:

Hughes Hallucination Evaluation Model (HHEM) Leaderboard https://huggingface.co/spaces/vectara/leaderboard

HHEM 2.1: A Better Hallucination Detection Model and a New Leaderboard

https://www.vectara.com/blog/hhem-2-1-a-better-hallucination-detection-model

HCMBench: an evaluation toolkit for hallucination correction models

https://www.vectara.com/blog/hcmbench-an-evaluation-toolkit-for-hallucination-correction-models

Books:

Foundation series by Isaac Asimov https://en.wikipedia.org/wiki/Foundation_(novel_series)

Sapiens: A Brief History of Humankind by Yuval Noah Harari https://www.amazon.com/Sapiens-Humankind-Yuval-Noah-Harari/dp/0062316095

Setting the Record Straight on who invented RAG

https://www.linkedin.com/pulse/setting-record-straight-who-invented-rag-amr-awadallah-8cwvc/

Follow us:

https://x.com/TheTuringPost

https://www.linkedin.com/in/ksenia-se

https://huggingface.co/Kseniase
