January 15, 2025

“Inference-Time-Compute: More Faithful? A Research Note” by James Chua, Owain_Evans

26 minutes

Figure 1: Left: Example of models either succeeding or failing to articulate a cue that influences their answer. We edit an MMLU question by prepending a Stanford professor's opinion. For examples like this where the cue changes the model answer, we measure how often models articulate the cue in their CoT. (Here we show only options A and B, rather than all four.) Right: Inference-Time-Compute models articulate the cue more often. The ITC version of Qwen refers to QwQ-32b-Preview, and non-ITC refers to Qwen-2.5-72B-Instruct. For Gemini, we use gemini-2.0-flash-thinking-exp and gemini-2.0-flash-exp respectively.

TLDR: We evaluate two Inference-Time-Compute models, QwQ-32b-Preview and Gemini-2.0-flash-thinking-exp for CoT faithfulness.
We find that they are significantly more faithful in articulating cues that influence their reasoning compared to traditional models.

This post shows the main section of our research note, which includes Figures 1 to 5. Full research note which includes other tables and figures [...]

---

Outline:

(01:35) Abstract

(03:26) 1. Introduction

(09:00) 2. Setup and Results of Cues

(10:15) 2.1 Cue: Professors Opinion

(12:08) 2.2 Cue: Few-Shot with Black Square

(14:55) 2.3 Other Cues

(18:54) 3. Discussion

(18:58) Improving non-ITC articulation

(19:27) Advantage of ITC models in articulation

(20:13) Length of CoTs across models

(21:05) False Positives

(22:35) Different articulation rates across cues

(23:12) Training data contamination

(23:45) 4. Limitations

(23:49) Lack of ITC models to evaluate

(24:26) Limited cues studied

(24:51) Subjectivity of judge model

(25:22) Acknowledgments

(25:38) Links

---

First published:

January 15th, 2025

Source:

https://www.lesswrong.com/posts/C8HAa2mf5kcBrpjkX/inference-time-compute-more-faithful-a-research-note

---

Narrated by TYPE III AUDIO.

---

Images from the article:

Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.

...more

View all episodes

By LessWrong

January 15, 2025

“Inference-Time-Compute: More Faithful? A Research Note” by James Chua, Owain_Evans

26 minutes

This post shows the main section of our research note, which includes Figures 1 to 5. Full research note which includes other tables and figures [...]

---

Outline:

(01:35) Abstract

(03:26) 1. Introduction

(09:00) 2. Setup and Results of Cues

(10:15) 2.1 Cue: Professors Opinion

(12:08) 2.2 Cue: Few-Shot with Black Square

(14:55) 2.3 Other Cues

(18:54) 3. Discussion

(18:58) Improving non-ITC articulation

(19:27) Advantage of ITC models in articulation

(20:13) Length of CoTs across models

(21:05) False Positives

(22:35) Different articulation rates across cues

(23:12) Training data contamination

(23:45) 4. Limitations

(23:49) Lack of ITC models to evaluate

(24:26) Limited cues studied

(24:51) Subjectivity of judge model

(25:22) Acknowledgments

(25:38) Links

---

First published:

January 15th, 2025

Source:

https://www.lesswrong.com/posts/C8HAa2mf5kcBrpjkX/inference-time-compute-more-faithful-a-research-note

---

Narrated by TYPE III AUDIO.

---

Images from the article:

Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.

...more

More shows like LessWrong (30+ Karma)

View all

The Daily

112,193 Listeners

Astral Codex Ten Podcast

131 Listeners

Interesting Times with Ross Douthat

7,227 Listeners

Dwarkesh Podcast

564 Listeners

The Ezra Klein Show

16,216 Listeners

AI Article Readings

4 Listeners

Doom Debates!

14 Listeners

LessWrong posts by zvi

2 Listeners

Share “Inference-Time-Compute: More Faithful? A Research Note” by James Chua, Owain_Evans

Sign up to save your podcasts

“Inference-Time-Compute: More Faithful? A Research Note” by James Chua, Owain_Evans

“Inference-Time-Compute: More Faithful? A Research Note” by James Chua, Owain_Evans

More shows like LessWrong (30+ Karma)

The Daily

Astral Codex Ten Podcast

Interesting Times with Ross Douthat

Dwarkesh Podcast

The Ezra Klein Show

AI Article Readings

Doom Debates!

LessWrong posts by zvi