Learning GenAI via SOTA Papers

EP158: The hidden blind spots of AI logic



The paper "Large Language Model Reasoning Failures" is a comprehensive survey that systematically categorizes and analyzes the various ways Large Language Models (LLMs) fail at reasoning tasks. To unify fragmented research in the field, the authors introduce a two-axis taxonomy that organizes failures based on the type of reasoning and the nature of the failure.

On the first axis, the taxonomy divides reasoning into embodied (physical-world interaction) and non-embodied types, with the latter further split into informal (intuitive judgments) and formal (logical and mathematical) reasoning. On the second axis, failures are classified into three categories:

  • Fundamental failures: Intrinsic weaknesses in LLM architectures (e.g., the "reversal curse" or limited working memory) that broadly affect performance.
  • Application-specific limitations: Shortcomings that manifest in particular domains, such as Theory of Mind or 3D spatial planning.
  • Robustness issues: Inconsistencies where performance drops due to minor variations in prompt phrasing or task structure.
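To make the two-axis structure concrete, here is a minimal Python sketch of the taxonomy as a small data model. It is not taken from the paper or its repository: the names `ReasoningType`, `FailureType`, `FailureCase`, and `build_grid` are hypothetical. The failure-type labels come from the examples above, but the reasoning type each example is filed under is an assumption made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum


class ReasoningType(Enum):
    """First axis: the kind of reasoning involved."""
    EMBODIED = "embodied"   # physical-world interaction
    INFORMAL = "informal"   # non-embodied, intuitive judgments
    FORMAL = "formal"       # non-embodied, logical and mathematical

    @property
    def is_embodied(self) -> bool:
        return self is ReasoningType.EMBODIED


class FailureType(Enum):
    """Second axis: the nature of the failure."""
    FUNDAMENTAL = "fundamental"
    APPLICATION_SPECIFIC = "application-specific"
    ROBUSTNESS = "robustness"


@dataclass(frozen=True)
class FailureCase:
    name: str
    reasoning: ReasoningType
    failure: FailureType


def build_grid(cases):
    """Group failure names by their (reasoning, failure) cell."""
    grid = defaultdict(list)
    for case in cases:
        grid[(case.reasoning, case.failure)].append(case.name)
    return dict(grid)


# Failure types follow the summary above; the reasoning-type column
# for each entry is an illustrative assumption, not a quoted result.
EXAMPLES = [
    FailureCase("reversal curse", ReasoningType.FORMAL, FailureType.FUNDAMENTAL),
    FailureCase("Theory of Mind", ReasoningType.INFORMAL, FailureType.APPLICATION_SPECIFIC),
    FailureCase("3D spatial planning", ReasoningType.EMBODIED, FailureType.APPLICATION_SPECIFIC),
]

GRID = build_grid(EXAMPLES)

for (reasoning, failure), names in GRID.items():
    print(f"{reasoning.value} x {failure.value}: {', '.join(names)}")
```

Three reasoning types crossed with three failure types give nine cells, so any reported failure can be filed under exactly one pair of labels.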

The paper provides detailed definitions for these failures, explores their root causes (such as the limitations of next-token prediction), and discusses mitigation strategies like Chain-of-Thought prompting and data-centric approaches. By providing a structured perspective and a public GitHub repository of related research, the survey aims to guide future work toward more reliable and robust reasoning in AI.


By Yun Wu