


METR just made a lovely post detailing many examples they've found of reward hacks by frontier models. Unlike the reward hacks of yesteryear, these models are smart enough to know that what they are doing is deceptive and not what the company wanted them to do.
---
First published:
Source:
Linkpost URL:
https://metr.org/blog/2025-06-05-recent-reward-hacking/
---
Narrated by TYPE III AUDIO.
By LessWrong
