Breaktime Tech Talks

Ep79: Reducing Agent Token Costs + RAG Beyond Semantic Search



In this episode, I sit down with Roie Schwaber-Cohen, a software engineer and developer advocate at Pinecone, to talk about smarter ways to build with AI — without burning through tokens or your patience!

What we cover:

  • Why agentic AI systems burn so many tokens (and ways to combat it)
  • How Pinecone's Nexus pre-explores retrieval paths so agents don't have to discover them at runtime, cutting latency and token usage
  • The problem with naive RAG ("Franken answers") and why domain-level separation of your documents matters
  • How Pinecone Marketplace lets non-developers connect structured and unstructured data sources to build production-ready AI apps
  • Why semantic similarity isn't the same as correctness, and how document introspection helps agents ask better questions

Links & Resources:

  • Pinecone
  • Pinecone Marketplace (recently announced)
  • Pinecone Nexus
  • Roie on LinkedIn

Breaktime Tech Talks, by jmhreif
