AI Engineering Podcast

Expert Insights On Retrieval Augmented Generation And How To Build It



Summary
In this episode we're joined by Matt Zeiler, founder and CEO of Clarifai, as he dives into the technical aspects of retrieval augmented generation (RAG). From his journey into AI at the University of Toronto to founding one of the first deep learning AI companies, Matt shares his insights on the evolution of neural networks and generative models over the last 15 years. He explains how RAG addresses issues with large language models, including data staleness and hallucinations, by providing dynamic access to information through vector databases and embedding models. Throughout the conversation, Matt and host Tobias Macey discuss everything from architectural requirements to operational considerations, as well as the practical applications of RAG in industries like intelligence, healthcare, and finance. Tune in for a comprehensive look at RAG and its future trends in AI.
Announcements
  • Hello and welcome to the AI Engineering Podcast, your guide to the fast-moving world of building scalable and maintainable AI systems
  • Your host is Tobias Macey and today I'm interviewing Matt Zeiler, Founder & CEO of Clarifai, about the technical aspects of RAG, including the architectural requirements, edge cases, and evolutionary characteristics
Interview
  • Introduction
  • How did you get involved in the area of data management?
  • Can you describe what RAG (Retrieval Augmented Generation) is?
  • What are the contexts in which you would want to use RAG?
  • What are the alternatives to RAG?
  • What are the architectural/technical components that are required for production grade RAG?
  • Getting a quick proof-of-concept working for RAG is fairly straightforward. What are the failure modes/edge cases that start to surface as you scale the usage and complexity?
  • The first step of building the corpus for RAG is to generate the embeddings. Can you talk through the planning and design process? (e.g. model selection for embeddings, storage capacity/latency, etc.)
  • How does the modality of the input/output affect this and downstream decisions? (e.g. text vs. image vs. audio, etc.)
  • What are the features of a vector store that are most critical for RAG?
  • The set of available generative models is expanding and changing at breakneck speed. What are the foundational aspects that you look for in selecting which model(s) to use for the output?
  • Vector databases have been gaining ground for search functionality, even without generative AI. What are some of the other ways that elements of RAG can be re-purposed?
  • What are the most interesting, innovative, or unexpected ways that you have seen RAG used?
  • What are the most interesting, unexpected, or challenging lessons that you have learned while working on RAG?
  • When is RAG the wrong choice?
  • What are the main trends that you are following for RAG and its component elements going forward?
Contact Info
  • Website
  • LinkedIn
Parting Question
  • From your perspective, what is the biggest barrier to adoption of machine learning today?
Closing Announcements
  • Thank you for listening! Don't forget to check out our other shows. The Data Engineering Podcast covers the latest on modern data management. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used.
  • Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
  • If you've learned something or tried out a project from the show then tell us about it! Email [email protected] with your story.
  • To help other people find the show please leave a review on iTunes and tell your friends and co-workers.
Links
  • Clarifai
  • Geoff Hinton
  • Yann LeCun
  • Neural Networks
  • Deep Learning
  • Retrieval Augmented Generation
  • Context Window
  • Vector Database
  • Prompt Engineering
  • Mistral
  • Llama 3
  • Embedding Quantization
  • Active Learning
  • Google Gemini
  • AI Model Attention
  • Recurrent Network
  • Convolutional Network
  • Reranking Model
  • Stop Words
  • Massive Text Embedding Benchmark (MTEB)
  • Retool State of AI Report
  • pgvector
  • Milvus
  • Qdrant
  • Pinecone
  • Open LLM Leaderboard
  • Semantic Search
  • HashiCorp
The intro and outro music is from Hitman's Lovesong feat. Paola Graziano by The Freak Fandango Orchestra/CC BY-SA 3.0
AI Engineering Podcast by Tobias Macey
