


Audio version (read by the author) here, or search for "Joe Carlsmith Audio" in your podcast app.
This is the ninth essay in a series I’m calling “How do we solve the alignment problem?”. I’m hoping that the individual essays can be read fairly well on their own, but see this introduction for a summary of the essays that have been released thus far, plus a bit more about the series as a whole.
1. Introduction
At this point in the series, I’ve outlined most of my current picture of what it would look like to build a mature science of AI alignment. But I left off one particular topic that I think worth discussing on its own: namely, the importance of building AIs that do what I’ll call “human-like philosophy.”
I want to discuss this topic on its own because I think that the discourse about AI alignment is often haunted by some sense that AI alignment is not, merely, a “scientific” problem. Rather: it’s also, in part, a philosophical (and perhaps especially, an ethical) problem; that it’s hard, at least in part, because philosophy is hard; and that solving it is likely to require some very sophisticated [...]
---
Outline:
(00:33) 1. Introduction
(04:21) 2. Philosophy as a tool for out-of-distribution generalization
(10:55) 3. Some limits to the importance of philosophy to AI alignment
(17:55) 4. When is philosophy existential?
(22:18) 5. The challenge of human-like philosophy
(22:29) 5.1. The relationship between human-like philosophy and human-like motivations
(27:27) 5.2. How hard is human-like philosophy itself?
(28:08) 5.2.1. Capability
(29:35) 5.2.2. Disposition
(33:41) 6. What does working on this look like?
The original text contained 9 footnotes which were omitted from this narration.
---
Narrated by TYPE III AUDIO.
