


METR just made a lovely post detailing many examples they've found of reward hacks by frontier models. Unlike the reward hacks of yesteryear, these models are smart enough to know that what they are doing is deceptive and not what the company wanted them to do.
---
First published:
Source:
Linkpost URL:
https://metr.org/blog/2025-06-05-recent-reward-hacking/
---
Narrated by TYPE III AUDIO.
By LessWrong
