Vector Podcast

Yury Malkov - Staff Engineer, Twitter - Author of the most adopted ANN algorithm HNSW


Listen Later

Topics:

00:00 Introduction

01:04 Yury’s background in laser physics, computer vision and startups

05:14 How Yury entered the field of nearest neighbor search and his impression of it

09:03 “Not all Small Worlds are Navigable”

10:10 Gentle introduction into the theory of Small World Navigable Graphs and related concepts

13:55 Further clarification on the input constraints for the NN search algorithm design

15:03 What did not work in NSW algorithm and how did Yury set up to invent new algorithm called HNSW

24:06 Collaboration with Leo Boytsov on integrating HNSW in nmslib

26:01 Differences between HNSW and NSW

27:55 Does algorithm always converge?

31:56 How FAISS’s implementation is different from the original HNSW

33:13 Could Yury predict that his algorithm would be implemented in so many frameworks and vector databases in languages like Go and Rust?

36:51 How our perception of high-dimensional spaces change compared to 3D?

38:30 ANN Benchmarks

41:33 Feeling proud of the invention and publication process during 2,5 years!

48:10 Yury’s effort to maintain HNSW and its GitHub community and the algorithm’s design principles

53:29 Dmitry’s ANN algorithm KANNDI, which uses HNSW as a building block

1:02:16 Java / Python Virtual Machines, profiling and benchmarking. “Your analysis of performance contradicts the profiler”

1:05:36 What are Yury’s hopes and goals for HNSW and role of symbolic filtering in ANN in general

1:13:05 The future of ANN field: search inside a neural network, graph ANN

1:15:14 Multistage ranking with graph based nearest neighbor search

1:18:18 Do we have the “best” ANN algorithm? How ANN algorithms influence each other

1:21:27 Yury’s plans on publishing his ideas

1:23:42 The intriguing question of Why

Show notes:

- HNSW library: https://github.com/nmslib/hnswlib/

- HNSW paper Malkov, Y. A., & Yashunin, D. A. (2018). Efficient and robust approximate nearest neighbor search using hierarchical navigable small world graphs. TPAMI, 42(4), 824-836. (arxiv:1603.09320)

- NSW paper Malkov, Y., Ponomarenko, A., Logvinov, A., & Krylov, V. (2014). Approximate nearest neighbor algorithm based on navigable small world graphs. Information Systems, 45, 61-68.

- Yury Lifshits’s paper: https://yury.name/papers/lifshits2009combinatorial.pdf

- Sergey Brin’s work in nearest neighbour search: GNAT - Geometric Near-neighbour Access Tree: [CiteSeerX — Near neighbor search in large metric spaces](http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.173.8156)

- Podcast with Leo Boytsov: https://rare-technologies.com/rrp-4-leo-boytsov-knn-search/

- Million-Scale ANN Benchmarks: http://ann-benchmarks.com/

- Billion Scale ANN Benchmarks: https://github.com/harsha-simhadri/big-ann-benchmarks

- FALCONN algorithm: https://github.com/falconn-lib/falconn

- Mentioned navigable small world papers:

Kleinberg, J. M. (2000). Navigation in a small world. Nature, 406(6798), 845-845.;

Boguna, M., Krioukov, D., & Claffy, K. C. (2009). Navigability of complex networks. Nature Physics, 5(1), 74-80.

...more
View all episodesView all episodes
Download on the App Store

Vector PodcastBy Dmitry Kan

  • 5
  • 5
  • 5
  • 5
  • 5

5

2 ratings


More shows like Vector Podcast

View all
Common Sense with Dan Carlin by Dan Carlin

Common Sense with Dan Carlin

11,313 Listeners

Fareed Zakaria GPS by CNN

Fareed Zakaria GPS

3,474 Listeners

Founders by David Senra

Founders

1,906 Listeners

Super Data Science: ML & AI Podcast with Jon Krohn by Jon Krohn

Super Data Science: ML & AI Podcast with Jon Krohn

298 Listeners

NVIDIA AI Podcast by NVIDIA

NVIDIA AI Podcast

322 Listeners

Pod Save America by Crooked Media

Pod Save America

86,615 Listeners

The Ezra Klein Show by New York Times Opinion

The Ezra Klein Show

15,237 Listeners