Beyond the Algorithm

Beyond Text: Can AI Search Understand Images Too?



This episode introduces MMSEARCH, a benchmark for evaluating how well large multimodal models (LMMs) can function as AI-powered search engines that understand both text and images. The authors argue that existing AI search engines are limited to text-only settings, and so neglect the information carried by images and by the way text and images are combined on websites. To address this, they built MMSEARCH, a dataset of 300 diverse search queries spanning 14 subfields, selected so that the answers cannot be found in the training data of current LMMs. They also propose MMSEARCH-ENGINE, a pipeline that lets any LMM be evaluated on three key tasks involved in searching: reformulating user queries into search-engine-friendly form, ranking the relevance of retrieved websites, and summarizing the answer from the most relevant webpage.
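
To make the three tasks concrete, here is a minimal text-only sketch of how such a pipeline could be wired together. It is not the authors' MMSEARCH-ENGINE code: the names `Page`, `run_pipeline`, the prompt prefixes, and the `model` and `search` callables are all hypothetical stand-ins, and image inputs (the point of the benchmark) are left out for brevity.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Page:
    """A retrieved webpage (hypothetical, text-only stand-in)."""
    url: str
    text: str


# Hypothetical interfaces: a model maps a prompt to text,
# and a search backend maps a query string to retrieved pages.
Model = Callable[[str], str]
Search = Callable[[str], List[Page]]


def run_pipeline(model: Model, search: Search, user_query: str) -> str:
    # Task 1: reformulate the user query into a search-engine-friendly form.
    requery = model(f"REQUERY: {user_query}").strip()

    # Retrieve candidate pages with the reformulated query.
    pages = search(requery)
    if not pages:
        return ""

    # Task 2: ask the model which retrieved page is most relevant.
    listing = "\n".join(
        f"{i}: {p.url} | {p.text[:200]}" for i, p in enumerate(pages)
    )
    choice = model(f"RERANK: {user_query}\n{listing}").strip()
    try:
        index = int(choice)
    except ValueError:
        index = 0  # fall back to the first page on an unparseable reply
    if not 0 <= index < len(pages):
        index = 0

    # Task 3: summarize the answer from the chosen page.
    best = pages[index]
    return model(f"SUMMARIZE: {user_query}\n{best.text}").strip()
```

Passing the model and the search backend in as plain callables means any LMM wrapper can be swapped in without changing the pipeline, which mirrors the episode's point that the pipeline is meant to work with any LMM.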


Beyond the Algorithm, by AI