October 15, 2024

S02E06 - Can LLMs (Large Language Models) really reason?

8 minutes

In this episode, Anna and Aiden discuss whether LLMs (Large Language Models) are genuinely good at reasoning, or whether they are force-fit to pass certain well-known benchmarks.

The material for this episode comes from two research studies. They are:

1. GSM-Symbolic: Understanding the Limitations of Mathematical Reasoning in Large Language Models by Iman Mirzadeh, Keivan Alizadeh, Hooman Shahrokhi, Oncel Tuzel, Samy Bengio, and Mehrdad Farajtabar, working at Apple

2. Functional Benchmarks for Robust Evaluation of Reasoning Performance, and the Reasoning Gap by Annarose M B, Anto P V, Shashank Menon, Ajay Sukumar, Adwaith Samod T, Alan Philipose, Stevin Prince, and Sooraj Thomas from Consequent AI

By stashtalk
