Pronouncing "ColBERT", the origins of ColBERT, doing NLP from an IR perspective, how getting "scooped" can be productive, OpenQA and related tasks, PhD journeys, why even retrieval plus attention is not all you need, multilingual knowledge-intensive NLP, and aiming high in research projects.
Transcript: https://web.stanford.edu/class/cs224u/podcast/khattab/
Omar's website
Matei Zaharia
Keshav Santhanam
Stephen Colbert throwing paper with Obama
The ColBERT paper and the ColBERTv2 paper
DeepImpact: Learning passage impacts for inverted indexes
DPR: Dense passage retrieval for open-domain question answering
Incorporating query term independence assumption for efficient retrieval and ranking using deep neural networks
DeepCT: Context-aware sentence/passage term importance estimation for first stage retrieval
Reading Wikipedia to answer open-domain questions
ORQA: Latent retrieval for weakly supervised open domain question answering
QReCC
ColBERT-QA: Relevance-guided Supervision for OpenQA with ColBERT
Baleen: Robust Multi-Hop Reasoning at Scale via Condensed Retrieval
Passage reranking with BERT
UniK-QA: Unified Representations of Structured and Unstructured Knowledge for Open-Domain Question Answering
Self-driving search engines: The neural hype and comparisons against weak baselines
Mohammad Hammoud
RAG: Retrieval-augmented generation for knowledge-intensive NLP tasks
Hindsight: Posterior-guided training of retrievers for improved open-ended generation
Learning Cross-Lingual IR from an English Retriever
Blog post: A moderate proposal for radically better AI-powered Web search
Blog post: Building scalable, explainable, and adaptive NLP models with retrieval
XOR-TyDi