
Epistemic status: I've been thinking about this topic for about 15 years, which led me to some counterintuitive conclusions, and I'm now writing up my thoughts concisely.
Value Learning
Value Learning offers hope for the Alignment problem in Artificial Intelligence: if we can align our AIs sufficiently closely, then they will want to help us, and should converge to full alignment with human values. However, for this to be possible, they will need (at least) a definition of what the phrase 'human values' means. The long-standing proposal for this is Eliezer Yudkowsky's Coherent Extrapolated Volition (CEV):
In calculating CEV, an AI would predict what an idealized version of us would want, "if we knew more, thought faster, were more the people we wished we were, had grown up farther together". It would recursively iterate this prediction for humanity as a whole, and determine the desires which converge. This initial dynamic would be used to generate the AI's utility function.
This is a somewhat hand-wavy definition. It feels like a limit of some convergence process along these lines might exist. However, the extrapolation process seems rather loosely defined, and without access to superhuman intelligence and sufficient computational resources, it's very [...]
---
Outline:
(00:24) Value Learning
(02:31) Evolutionary Psychology
(05:55) Two Asides
(09:52) Tool, or Equal?
(22:01) So it's a Tool
(29:16) Consequences
(33:43) Reflection
The original text contained 2 footnotes which were omitted from this narration.
---
---
Narrated by TYPE III AUDIO.
By LessWrong
