


Welcome back to an episode where we're talking vectors, vector databases, and AI with Linpeng Tang, CTO and co-founder of MyScale. MyScale is a super interesting technology that combines the best of OLAP databases with vector search. The project started back in 2019, when the team forked ClickHouse and adapted it to support vector storage, indexing, and search.
The really unique and cool thing is that you get the familiarity and usability of SQL along with the power to compare unstructured data by similarity.
We think this has really fascinating use cases for analytics, well beyond what we're seeing with other vector database technology, which is mostly restricted to building RAG applications for LLMs. Also, because it's built on ClickHouse, MyScale is massively scalable, an area where many of the dedicated vector databases actually struggle.
We cover a lot about how vector databases work, why they decided to build off of ClickHouse, and how they plan to open source the database.
Timestamps
02:29 Introduction
06:22 Value of a Vector Database
12:40 Forking ClickHouse
18:53 Transforming ClickHouse into a SQL vector database
32:08 Data modeling
32:56 What data can be Vectorized
38:37 Indexing
43:35 Achieving Scale
46:35 Bottlenecks
48:41 MyScale vs other dedicated Vector Databases
51:38 Going Open Source
56:04 Closing thoughts
By Software Huddle