January 25, 2025

“Six Thoughts on AI Safety” by boazbarak

30 minutes

[Crossposted from windowsontheory]

The following statements seem to be both important for AI safety and not widely agreed upon. These are my opinions, not those of my employer or colleagues. As is true for anything involving AI, there is significant uncertainty about everything written below. However, for readability, I present these points in their strongest form, without hedges and caveats. That said, it is essential not to be dogmatic, and I am open to changing my mind based on evidence. None of these points are novel; others have advanced similar arguments. I am sure that for each statement below, there will be people who find it obvious and people who find it obviously false.

AI safety will not be solved on its own.
An “AI scientist” will not solve it either.
Alignment is not about loving humanity; it's about robust reasonable compliance.
Detection is more important than [...]

---

Outline:

(02:44) 1. AI safety will not be solved on its own.

(05:52) 2. An AI scientist will not solve it either.

(11:00) 3. Alignment is not about loving humanity; it's about robust reasonable compliance.

(19:57) 4. Detection is more important than prevention.

(22:40) 5. Interpretability is neither sufficient nor necessary for alignment.

(24:56) 6. Humanity can survive an unaligned superintelligence.

---

First published:

January 24th, 2025

Source:

https://www.lesswrong.com/posts/3jnziqCF3vA2NXAKp/six-thoughts-on-ai-safety

---

Narrated by TYPE III AUDIO.


By LessWrong