Weaviate Podcast

Retrieving Texts based on Abstract Descriptions Explained!



This video covers a new paper on using summarization chains to represent long texts and using (original text, summary) pairs to optimize text embedding models! Here are 3 main takeaways I think everyone working with Weaviate may get value from:

1. An understanding of Summary Indexing and the Prompts (as well as Prompt Chains) used to build a summary index.
2. Continued development of LLM-generated data for search -- creating (full text, summary) pairs gives you (1) data to build a summary index with as mentioned, (2) data to compare different embedding models with, and (3) data to train your own embedding model.
3. Tournament-style evaluation with human annotators -- the top 5 retrieved texts from one model are concatenated with the top 5 from another model, and these 10 are given to human annotators to pick 5. This is how the authors report the performance of their models, rather than with traditional benchmarks. This may be a more productive evaluation technique for most real-world search applications.
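To make the third takeaway concrete, here is a minimal Python sketch of the tournament-style scoring described above. It is only an illustration, not the authors' code: the function names (`build_pool`, `score_selection`), the toy text IDs, and the choice to credit both models when they retrieve the same text are all assumptions made for this example.

```python
from collections import Counter


def build_pool(top_a, top_b):
    """Concatenate the top results of two models into one candidate pool.

    Each text is mapped to the set of models that retrieved it.
    Assumption: a text retrieved by both models appears once in the pool.
    """
    pool = {}
    for model, texts in (("A", top_a), ("B", top_b)):
        for text in texts:
            pool.setdefault(text, set()).add(model)
    return pool


def score_selection(pool, picked):
    """Count how many of the annotator's picks came from each model.

    Assumption: a pick retrieved by both models credits both of them.
    """
    wins = Counter()
    for text in picked:
        for model in pool[text]:
            wins[model] += 1
    return wins


# Toy data: top 5 texts from each (hypothetical) model, with one overlap.
top_a = ["a1", "a2", "a3", "shared", "a4"]
top_b = ["b1", "shared", "b2", "b3", "b4"]

pool = build_pool(top_a, top_b)

# The annotator sees the pooled texts and picks the 5 that best fit the query.
picked = ["a1", "shared", "a2", "a3", "b2"]

wins = score_selection(pool, picked)
print(dict(wins))
```

In a real study you would probably also shuffle the pool before showing it to annotators so they cannot tell which model a text came from, and then aggregate the per-query counts across many queries.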
Thank you so much for watching, here are some links mentioned in the video!
Retrieving Texts based on Abstract Descriptions: https://arxiv.org/abs/2305.12517
Weaviate Blog - Combining LangChain and Weaviate: https://weaviate.io/blog/combining-langchain-and-weaviate
Weaviate Blog - Generative Feedback Loops: https://weaviate.io/blog/generative-feedback-loops-with-llms
Jerry Liu in Llama Index Blog - A New Document Summary Index for LLM-powered QA Systems: https://medium.com/llamaindex-blog/a-new-document-summary-index-for-llm-powered-qa-systems-9a32ece2f9ec
Learning to Retrieve Passages without Supervision (Spider): https://arxiv.org/pdf/2112.07708.pdf
Weaviate Blog - Analysis of Spider: https://weaviate.io/blog/research-insights-spider
Chapters
0:00 Introduction
0:13 Quick Overview
7:30 How to use in Weaviate!
7:50 Background
12:08 Motivation
14:20 Prompts Used
18:14 More Details of training
21:12 Human Evaluation Study
22:40 My Takeaways from the Paper


