vBrownBag

Troubleshooting AWS Hallucinations from Vector Store DBs



Join us as Amelia shares the debugging story nobody tells you about - how her vector store DB couldn't surface specific data until she tested it with simplified data from ChatGPT.

Amelia walks through her journey from throwing JIRA tickets into a large language model without understanding pipelines or data cleaning, to discovering why her production vector store was failing. You'll learn about the gap between chatting with data and getting accurate connections, how to validate vector similarity search results, the difference between production and synthetic test data, and practical troubleshooting workflows for AWS vector stores. This episode reveals the messy reality of RAG systems - when everything seems fine but the outputs are subtly wrong, and how testing with simplified data can expose what production complexity hides.
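As a rough illustration of the "test with simplified data" idea, here is a minimal Python sketch. It is not Amelia's code and does not use any AWS service: the ticket names, the three-dimensional toy vectors, and the helper functions `cosine_similarity` and `top_k` are all hypothetical, chosen so that the correct nearest neighbor is obvious by inspection. If a similarity search cannot get a case like this right, the problem probably lies in the retrieval step rather than in the complexity of the production data.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query, store, k=1):
    """Return the ids of the k stored vectors most similar to the query."""
    ranked = sorted(
        store.items(),
        key=lambda item: cosine_similarity(query, item[1]),
        reverse=True,
    )
    return [doc_id for doc_id, _ in ranked[:k]]

# Hypothetical toy "embeddings": each ticket points along its own axis,
# so the expected match for any query is easy to reason about by hand.
store = {
    "TICKET-1 login fails": [1.0, 0.0, 0.0],
    "TICKET-2 slow dashboard": [0.0, 1.0, 0.0],
    "TICKET-3 billing error": [0.0, 0.0, 1.0],
}

# A query vector that leans heavily toward the first ticket.
query = [0.9, 0.1, 0.0]
expected = "TICKET-1 login fails"

result = top_k(query, store, k=1)
assert result == [expected], f"retrieval returned {result}, expected {expected}"
```

The same pattern, with a known query and a known expected document, could be pointed at a real vector store by swapping `top_k` for that store's own query call.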

Timestamps

0:00 Cold Open

1:03 Welcome & Introduction

2:06 Amelia's Background & DeepRacer Trophy

4:49 The JIRA Ticket Use Case Origin Story

5:53 Getting Into the Presentation

6:03 Accessing & Cleaning Data Sets

8:12 Losing Production Data & Recreating with ChatGPT

12:45 Understanding Vector Databases

18:22 How Embeddings Work

24:16 The Hallucination Discovery

30:41 Testing Strategies for Vector Stores

36:52 Debugging Vector Similarity Search

42:18 Real-World Troubleshooting Workflows

44:26 Where to Find Amelia & Wrap-up

How to find Amelia:

https://www.linkedin.com/in/ameliahoughross/

