February 29, 2024

[Linkpost] “Bengio’s Alignment Proposal: ‘Towards a Cautious Scientist AI with Convergent Safety Bounds’” by mattmacdermott

24 minutes

This is a linkpost for https://yoshuabengio.org/2024/02/26/towards-a-cautious-scientist-ai-with-convergent-safety-bounds/

Yoshua Bengio recently posted a high-level overview of his alignment research agenda. I'm pasting the full text below since it's fairly short.

What can’t we afford with a future superintelligent AI? Among others, confidently wrong predictions about the harm that some actions could yield. Especially catastrophic harm. Especially if these actions could spell the end of humanity.

How can we design an AI that will be highly capable and will not harm humans? In my opinion, we need to figure out this question – of controlling AI so that it behaves in really safe ways – before we reach human-level AI, aka AGI; and to be successful, we need all hands on deck. Economic and military pressures to accelerate advances in AI capabilities will continue to push forward even if we have not figured out how to make superintelligent AI safe. And even if [...]

---

First published:

February 29th, 2024

Source:

https://www.lesswrong.com/posts/edvyWfKdJHnoPkM2J/bengio-s-alignment-proposal-towards-a-cautious-scientist-ai

Linkpost URL:

https://yoshuabengio.org/2024/02/26/towards-a-cautious-scientist-ai-with-convergent-safety-bounds/

---

Narrated by TYPE III AUDIO.
