“Building AIs that do human-like philosophy” by Joe Carlsmith


Audio version (read by the author) here, or search for "Joe Carlsmith Audio" in your podcast app.

This is the ninth essay in a series I’m calling “How do we solve the alignment problem?”. I’m hoping that the individual essays can be read fairly well on their own, but see this introduction for a summary of the essays that have been released thus far, plus a bit more about the series as a whole.

1. Introduction

At this point in the series, I've outlined most of my current picture of what it would look like to build a mature science of AI alignment. But I left off one particular topic that I think is worth discussing on its own: namely, the importance of building AIs that do what I'll call "human-like philosophy."

I want to discuss this topic on its own because I think that the discourse about AI alignment is often haunted by some sense that AI alignment is not, merely, a "scientific" problem. Rather: that it's also, in part, a philosophical (and perhaps especially, an ethical) problem; that it's hard, at least in part, because philosophy is hard; and that solving it is likely to require some very sophisticated [...]

---

Outline:

(00:33) 1. Introduction

(04:21) 2. Philosophy as a tool for out-of-distribution generalization

(10:55) 3. Some limits to the importance of philosophy to AI alignment

(17:55) 4. When is philosophy existential?

(22:18) 5. The challenge of human-like philosophy

(22:29) 5.1. The relationship between human-like philosophy and human-like motivations

(27:27) 5.2. How hard is human-like philosophy itself?

(28:08) 5.2.1. Capability

(29:35) 5.2.2. Disposition

(33:41) 6. What does working on this look like?

The original text contained 9 footnotes, which were omitted from this narration.

---

First published: January 29th, 2026

Source: https://www.lesswrong.com/posts/zFZHHnLez6k8ykxpu/building-ais-that-do-human-like-philosophy

---

Narrated by TYPE III AUDIO.
