LessWrong (30+ Karma)

“Knocking Down My AI Optimist Strawman” by tailcalled


I recently posted my model of an optimistic view of AI, asserting that I disagree with every sentence of it. I thought I might as well also describe my objections to those sentences:

"The rapid progress spearheaded by OpenAI is clearly leading to artificial intelligence that will soon surpass humanity in every way."

Here are some of the main things humanity might want to achieve:

  • Curing aging and other diseases
  • Plentiful clean energy from e.g. nuclear fusion
  • De-escalating nuclear MAD while extending world peace and human freedom
  • ... even if hostile nations would use powerful unaligned AI[1] to fight you
  • Stopping criminals, even if they would make powerful unaligned AI[1] to fight you
  • Educating people to be great and patriotic
  • Creating healthy, tasty food without torturing animals
  • Nice homes for humans near important things
  • Good, open channels for honest, valuable communication
  • Common knowledge of the virtues and vices of executives, professionals [...]

---

Outline:

(00:16) The rapid progress spearheaded by OpenAI is clearly leading to artificial intelligence that will soon surpass humanity in every way.

(01:28) People used to be worried about existential risk from misalignment, yet we have a good idea about what influence current AIs are having on the world, and it is basically going fine.

(03:59) The root problem is that The Sequences expected AGI to develop agency largely without human help; meanwhile actual AI progress occurs by optimizing the scaling efficiency of a pretraining process that is mostly focused on integrating the AI with human culture.

(05:24) This means we will be able to control AI by just asking it to do good things, showing it some examples and giving it some ranked feedback.

(06:15) You might think this is changing with inference-time scaling, yet if the alignment were going to fall apart as new methods come into use, we'd have seen signs of it with o1.

(06:48) In the unlikely case that our current safety will turn out to be insufficient, interpretability research has worked out lots of deeply promising ways to improve, with sparse autoencoders letting us read the minds of the neural networks and thereby screen them for malice, and activation steering letting us deeply control the networks to our hearts' content.

(07:32) AI x-risk worries aren't just a waste of time, though; they are dangerous because they make people think society needs to make use of violence to regulate what kinds of AIs people can make and how they can use them.

(07:58) This danger was visible from the very beginning, as alignment theorists thought one could (and should) make a singleton that would achieve absolute power (by violently threatening humanity, no doubt), rather than always letting AIs be pure servants of humanity.

(10:15) To justify such violence, theorists make up all sorts of elaborate unfalsifiable and unjustifiable stories about how AIs are going to deceive and eventually kill humanity, yet the initial deceptions by base models were toothless, and thanks to modern alignment methods, serious hostility or deception has been thoroughly stamped out.

The original text contained 2 footnotes which were omitted from this narration.

---

First published: February 8th, 2025

Source: https://www.lesswrong.com/posts/TCmj9Wdp5vwsaHAas/knocking-down-my-ai-optimist-strawman

---

Narrated by TYPE III AUDIO.
