Astral Codex Ten Podcast

Deceptively Aligned Mesa-Optimizers: It's Not Funny If I Have To Explain It



https://astralcodexten.substack.com/p/deceptively-aligned-mesa-optimizers

A Machine Alignment Monday post, 4/11/22

I.

Our goal here is to popularize obscure and hard-to-understand areas of AI alignment, and surely this meme (retweeted by Eliezer last week) qualifies:

So let's try to understand the incomprehensible meme! Our main source will be Hubinger et al 2019, Risks From Learned Optimization In Advanced Machine Learning Systems.

Mesa- is a Greek prefix which means the opposite of meta-. To "go meta" is to go one level up; to "go mesa" is to go one level down (nobody has ever actually used this expression, sorry). So a mesa-optimizer is an optimizer one level down from you.

Consider evolution, optimizing the fitness of animals. For a long time, it did so very mechanically, inserting behaviors like "use this cell to detect light, then grow toward the light" or "if something has a red dot on its back, it might be a female of your species, you should mate with it". As animals became more complicated, they started to do some of the work themselves. Evolution gave them drives, like hunger and lust, and the animals figured out ways to achieve those drives in their current situation. Evolution didn't mechanically instill the behavior of opening my fridge and eating a Swiss Cheese slice. It instilled the hunger drive, and I figured out that the best way to satisfy it was to open my fridge and eat cheese.


Astral Codex Ten Podcast, by Jeremiah

