
Hey everyone, thank you so much for watching the 48th episode of the Weaviate Podcast! This is a SUPER exciting one, welcoming Brian Raymond, the CEO and Founder of Unstructured! Unstructured is a perfect complementary technology for Weaviate, helping people get their unstructured data into Weaviate! The podcast dives into the nuances of this task, but it generally revolves around Unstructured's abstraction of Partitioning, Cleaning, and Staging. Unstructured is making groundbreaking innovations in using Visual Document Layout models for Partitioning, for example labeling which part of a PDF is the header, the body, an image caption, and so on. Cleaning then describes removing pesky details like extra whitespace or odd characters. Staging then describes transformations such as formatting a text chunk with its metadata into the JSON for a Weaviate object upload! I really hope you find this podcast interesting! We are also publishing a blog post showing an example of how to use Unstructured to get PDF data into Weaviate. Please check that out and let us know if it works for your data and how we can improve it! The blog post can be found on weaviate.io, and we will be managing discussions around it both in the Weaviate Slack and with Unstructured! Thank you so much for listening!
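To make the Partitioning, Cleaning, and Staging idea a little more concrete, here is a minimal sketch in plain Python. It does not use the Unstructured library or the Weaviate client: the helpers `partition`, `clean`, and `stage`, the `DocumentChunk` class name, and the property names are all hypothetical, and the "layout model" is just a naive rule standing in for the real thing.

```python
import json
import re

# Hypothetical helpers for illustration only; this is NOT the Unstructured API.

def partition(raw_text):
    """Split raw text into elements and guess a coarse type for each one."""
    elements = []
    for block in re.split(r"\n\s*\n", raw_text):
        if not block.strip():
            continue
        # Naive stand-in for a layout model: a short first block is the title.
        is_title = not elements and len(block.split()) <= 10
        elements.append({"type": "Title" if is_title else "NarrativeText",
                         "text": block})
    return elements

def clean(text):
    """Replace non-printable characters and collapse runs of whitespace."""
    printable = "".join(
        ch if ch.isprintable() or ch.isspace() else " " for ch in text
    )
    return re.sub(r"\s+", " ", printable).strip()

def stage(elements, source):
    """Format cleaned elements with metadata as JSON-ready objects.

    The "class"/"properties" layout and field names are assumptions chosen
    for this example, not a confirmed Weaviate schema.
    """
    return [
        {
            "class": "DocumentChunk",
            "properties": {
                "text": clean(el["text"]),
                "category": el["type"],
                "source": source,
            },
        }
        for el in elements
    ]

raw = ("  Quarterly   Report \n\n Revenue grew\x00 in   the third\tquarter. "
       "\n\n\n See appendix. ")
objects = stage(partition(raw), source="report.pdf")
print(json.dumps(objects, indent=2))
```

Keeping the three steps as separate functions mirrors the abstraction discussed in the episode: each stage can be swapped out (for example, a real layout model in place of the naive title rule) without touching the others.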