April 25, 2026

LLMs get lost in multi-turn conversation

21 minutes

This research paper from Microsoft and Salesforce identifies a significant performance gap in Large Language Models (LLMs) when they move from single-turn, fully specified instructions to multi-turn, underspecified conversations. Through large-scale simulations, the authors found that even state-of-the-art models suffer an average 39% drop in performance when instructions are revealed gradually rather than all at once. The authors attribute this degradation primarily to a phenomenon they call "lost in conversation," in which models make premature assumptions, propose incomplete solutions, and fail to recover once they take a wrong turn. The study decomposes the degradation into two components: a slight loss in aptitude and a massive increase in unreliability. Ultimately, the findings suggest that current evaluation methods overestimate model capabilities by ignoring the underspecification common in real-world human-AI interactions.

By Enoch H. Kang
