
In this episode, I talk with David Lindner about Myopic Optimization with Non-myopic Approval, or MONA, which attempts to address (multi-step) reward hacking by myopically optimizing actions against a human's sense of whether those actions are generally good. Does this work? Can we get smarter-than-human AI this way? How does this compare to approaches like conservatism? Listen to find out.
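For listeners who want the core idea pinned down before pressing play, here is a minimal Python sketch of the contrast MONA draws. It is illustrative only, not the paper's implementation: the names (approval_score, mona_step_rewards, the toy episode) are hypothetical stand-ins. Ordinary RL credits each action with all discounted future reward, which is what makes multi-step reward hacks learnable; MONA instead rewards each action only with a foresighted overseer's approval of that single step.

```python
# Minimal sketch of the idea behind MONA (illustrative; names are hypothetical).

def discounted_returns(rewards, gamma=0.99):
    """Ordinary RL credit assignment: each action is credited with the
    discounted sum of all future rewards, so the optimizer can learn
    multi-step strategies -- including multi-step reward hacks."""
    returns = []
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
        returns.append(g)
    return returns[::-1]

def mona_step_rewards(states, actions, approval_score):
    """MONA-style credit assignment: each action is scored only by a
    non-myopic overseer's approval of that single step ("does this action
    look generally good?"); no reward flows back from later steps."""
    return [approval_score(s, a) for s, a in zip(states, actions)]

if __name__ == "__main__":
    # Toy 3-step episode where the outcome reward arrives only at the end.
    rewards = [0.0, 0.0, 1.0]
    states = ["s0", "s1", "s2"]
    actions = ["a0", "a1", "a2"]
    approval = lambda s, a: 0.5  # stand-in for a human/model approval judgment
    print(discounted_returns(rewards))                    # future reward leaks into early steps
    print(mona_step_rewards(states, actions, approval))   # purely per-step credit
```

The point of the contrast: in the first function, an early action that sets up a later hack still gets credited for the eventual reward; in the second, it can only score well if the overseer's general judgment of that step approves of it.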
Patreon: https://www.patreon.com/axrpodcast
Ko-fi: https://ko-fi.com/axrpodcast
Transcript: https://axrp.net/episode/2025/06/15/episode-43-david-lindner-mona.html
Topics we discuss, and timestamps:
0:00:29 What MONA is
0:06:33 How MONA deals with reward hacking
0:23:15 Failure cases for MONA
0:36:25 MONA's capability
0:55:40 MONA vs other approaches
1:05:03 Follow-up work
1:10:17 Other MONA test cases
1:33:47 When increasing time horizon doesn't increase capability
1:39:04 Following David's research
Links for David:
Website: https://www.davidlindner.me
Twitter / X: https://x.com/davlindner
DeepMind Medium: https://deepmindsafetyresearch.medium.com
David on the Alignment Forum: https://www.alignmentforum.org/users/david-lindner
Research we discuss:
MONA: Myopic Optimization with Non-myopic Approval Can Mitigate Multi-step Reward Hacking: https://arxiv.org/abs/2501.13011
Arguments Against Myopic Training: https://www.alignmentforum.org/posts/GqxuDtZvfgL2bEQ5v/arguments-against-myopic-training
Episode art by Hamish Doodles: hamishdoodles.com
By Daniel Filan