Pronouncing "ColBERT", the origins of ColBERT, doing NLP from an IR perspective, how getting "scooped" can be productive, OpenQA and related tasks, PhD journeys, why even retrieval plus attention is not all you need, multilingual knowledge-intensive NLP, and aiming high in research projects.
Transcript: https://web.stanford.edu/class/cs224u/podcast/khattab/
Omar's website
Matei Zaharia
Keshav Santhanam
Stephen Colbert throwing paper with Obama
The ColBERT paper and the ColBERTv2 paper
DeepImpact: Learning passage impacts for inverted indexes
DPR: Dense passage retrieval for open-domain question answering
Incorporating query term independence assumption for efficient retrieval and ranking using deep neural networks
DeepCT: Context-aware sentence/passage term importance estimation for first stage retrieval
Reading Wikipedia to answer open-domain questions
ORQA: Latent retrieval for weakly supervised open domain question answering
QRECC
ColBERT-QA: Relevance-guided Supervision for OpenQA with ColBERT
Baleen: Robust Multi-Hop Reasoning at Scale via Condensed Retrieval
Passage reranking with BERT
UniK-QA: Unified Representations of Structured and Unstructured Knowledge for Open-Domain Question Answering
Self-driving search engines: The neural hype and comparisons against weak baselines
Mohammad Hammoud
RAG: Retrieval-augmented generation for knowledge-intensive NLP tasks
Hindsight: Posterior-guided training of retrievers for improved open-ended generation
Learning Cross-Lingual IR from an English Retriever
Blog post: A moderate proposal for radically better AI-powered Web search
Blog post: Building scalable, explainable, and adaptive NLP models with retrieval
XOR-TyDi