November 03, 2024

Mathematical Reasoning in Large Language Models: Are They Really Thinking?

16 minutes

In this episode, we dive into the mathematical reasoning abilities of large language models (LLMs). Do they truly understand math, or are they simply pattern-matching?

We'll explore two recent benchmarks, GSM-Symbolic and GSM-NoOp, which uncover surprising limitations in LLMs’ logical processing, and discuss what this means for their future development.

- Paper: GSM-Symbolic: Understanding the Limitations of Mathematical Reasoning in Large Language Models

Hosted on Acast. See acast.com/privacy for more information.

By Alessio Piovesan
