Vector Podcast

Joan Fontanals - Principal Engineer - Jina AI


Topics:

00:00 Intro

00:42 Joan's background

01:46 What attracted Joan to Jina as a company and product?

04:39 Main area of focus for Joan in the product

05:46 How does the open source model work for Jina?

08:38 Deeper dive into Jina.AI as a product and technology stack

11:57 Does Jina fit the use cases of smaller / mid-size players with smaller amounts of data?

13:45 KNN/ANN algorithms available in Jina

16:05 BigANN competition and BuddyPQ: a 12% increase in recall over FAISS

17:07 Does Jina support customers in model training? Finetuner

20:46 How does Jina framework compare to Vector Databases?

26:46 Jina's investment in user-friendly APIs

31:04 Applications of Jina beyond search engines, like question answering systems

33:20 How to bring bits of neural search into traditional keyword retrieval? Connection to model interpretability

41:14 Does Jina support going multimodal, including images, audio, etc.?

46:03 The magical question of Why

55:20 Product announcement from Joan

Order your Jina swag: https://docs.google.com/forms/d/e/1FAIpQLSedYVfqiwvdzWPX-blCpVu-tQoiFiUJQz2QnIHU1ggy1oyg/ (promo code: vectorPodcastxJinaAI)

Show notes:

- Jina.AI: https://jina.ai/

- HNSW + PostgreSQL Indexer: [GitHub - jina-ai/executor-hnsw-postgres: A production-ready, scalable Indexer for the Jina neural search framework, based on HNSW and PSQL](https://github.com/jina-ai/executor-h...)

- pqlite: [GitHub - jina-ai/pqlite: A fast embedded library for Approximate Nearest Neighbor Search integrated with the Jina ecosystem](https://github.com/jina-ai/pqlite)

- BuddyPQ: [Billion-Scale Vector Search: Team Sisu and BuddyPQ | by Dmitry Kan | Big-ANN-Benchmarks | Nov, 2021 | Medium](https://medium.com/big-ann-benchmarks...)

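- PaddlePaddle: [GitHub - PaddlePaddle/Paddle: PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (PaddlePaddle core framework: high-performance single-machine and distributed training and cross-platform deployment for deep learning & machine learning)](https://github.com/PaddlePaddle/Paddle)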

- Jina Finetuner: [Finetuner 0.3.1 documentation](https://finetuner.jina.ai/)

- [Not All Vector Databases Are Made Equal | by Dmitry Kan | Towards Data Science](https://towardsdatascience.com/milvus...)

- Fluent interface (method chaining): [Fluent interfaces in Python | Florian Einfalt – Developer](https://florianeinfalt.de/posts/fluen...)

- Sujit Pal’s blog: [Salmon Run](http://sujitpal.blogspot.com/)

- ByT5: Towards a token-free future with pre-trained byte-to-byte models https://arxiv.org/abs/2105.13626

Special thanks to Saurabh Rai for the Podcast Thumbnail: https://twitter.com/srbhr_ https://www.linkedin.com/in/srbh077/

Vector Podcast by Dmitry Kan
