Weaviate Podcast

Nils Reimers on Cohere Embedding Models



Weaviate podcast #33.

Thank you so much for watching the 33rd Weaviate Podcast! This episode features one of the heroes of Deep Learning for Search, Nils Reimers! Nils' work on Sentence-BERT is one of the foundational works for applying Deep Representation Learning to text search, and it is the idea that personally inspired me to work in this field. Having seen the successes of Contrastive Representation Learning in Computer Vision, I was blown away by its potential for NLP and text search. Beyond the scientific foundation, the Sentence Transformers library and the BEIR benchmark have been enormously impactful software contributions! It was an honor to ask Nils my questions about these topics, from the role of Data Quality to Intent, Sparse Vectors, Long Document Encoding, Distribution Shift, and many more. I really hope you enjoy the podcast! We are so excited about the Cohere Multilingual embedding model and can't wait to see what else comes out of Cohere and their amazing team!


Cohere Multilingual ML Models with Weaviate: https://weaviate.io/blog/2022/12/Cohe...

Nils Reimers: https://scholar.google.com/citations?...

Mentioned in the podcast:

Cross-Encoders: https://weaviate.io/blog/2022/08/Usin...

How to choose a Sentence Transformer from HuggingFace: https://weaviate.io/blog/2022/10/How-...

Chapters
0:00 Cohere X Weaviate
0:22 Welcome Nils Reimers!
1:18 Origin Story
3:15 Learning Text Embeddings
6:54 Positive and Negative Sampling in Contrastive Learning
13:32 1 Billion Pairs for Text Embedding Optimization
15:44 Impact of Data Quality
18:40 New Cohere Multilingual Model!
24:50 Challenge of Debugging Multilingual Models
28:30 Intent in Search
30:40 Thoughts on ColBERT
33:50 Sparse Vectors in Search
36:17 Long Documents and Multi-Discourse
43:40 Entity Parsing in Query Understanding
46:08 Unknown Words and Distribution Shift
50:07 Re-Vectorizing with Fine-Tuning
53:07 More on Search Interfaces and Intent in Search
55:15 Thank you Nils!

Weaviate Podcast, by Weaviate
