The Nonlinear Library: Alignment Forum

AF - Genetic fitness is a measure of selection strength, not the selection target by Kaj Sotala


Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Genetic fitness is a measure of selection strength, not the selection target, published by Kaj Sotala on November 4, 2023 on The AI Alignment Forum.
Alternative title: "Evolution suggests robust rather than fragile generalization of alignment properties."
A frequently repeated argument goes something like this:
1. Evolution has optimized humans for inclusive genetic fitness (IGF)
2. However, humans didn't end up explicitly optimizing for genetic fitness (e.g. they use contraception to avoid having children)
3. Therefore, even if we optimize an AI for X (typically something like "human values"), we shouldn't expect it to explicitly optimize for X
My argument is that premise 1 is a verbal shorthand that's technically incorrect, and premise 2 is at least misleading. As for the overall conclusion, I think that the case from evolution might be interpreted as weak evidence for why AI should be expected to continue optimizing human values even as its capability increases.
Summary of how premise 1 is wrong:
If we look closely at what evolution does, we can see that it selects for traits that are beneficial for surviving, reproducing, and passing one's genes to the next generation. This is often described as "optimizing for IGF", because the traits that are beneficial for these purposes are usually the ones that have the highest IGF. (This has some important exceptions, discussed later.) However, if we look closely at that process of selection, we can see that this kind of trait selection is not "optimizing for IGF" in the sense that, for example, we might optimize an AI to classify pictures.
The model that I'm sketching is something like this: evolution is an optimization function that, at any given time, is selecting for some traits that are in an important sense chosen at random. At any time, it might randomly shift to selecting for some other traits. Observing this selection process, we can calculate the IGF of the traits currently under selection, as a measure of how strongly those traits are being selected. But evolution is not optimizing for this measure; evolution is optimizing for the traits that have currently been chosen for optimization. As a result, there is no reason to expect that the minds created by evolution should optimize for IGF, but there is reason to expect that they would optimize for the traits that were actually under selection. This is something that we observe any time that humans optimize for some biological need.
In contrast, if we were optimizing an AI to classify pictures, we would not be randomly changing the selection criteria the way that evolution does. We would keep the selection criteria constant: always selecting for the property of classifying pictures the way we want. To the extent that the analogy to evolution holds, AIs should be much more likely to just do the thing they were selected for.
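To make the contrast concrete, here is a minimal toy sketch of my own (an assumed illustrative model, not anything from the post): a population of individuals with several heritable traits, where one run re-chooses the selected trait at random each generation (the "evolution" picture above) and the other always selects on the same fixed trait (the "classify pictures" picture). All names and parameters here are invented for the example.

```python
import random

# Toy illustration (assumed model, not from the post): compare selection
# whose target trait shifts at random each generation ("evolution" in the
# sketch above) with selection on one fixed criterion ("train a classifier").

N_TRAITS = 10      # heritable traits per individual
POP_SIZE = 200
GENERATIONS = 30

def new_population():
    # Each individual is a list of trait values in [0, 1].
    return [[random.random() for _ in range(N_TRAITS)] for _ in range(POP_SIZE)]

def select(pop, target_trait):
    """Keep the half of the population scoring highest on target_trait,
    then refill the population with mutated copies of the survivors."""
    ranked = sorted(pop, key=lambda ind: ind[target_trait], reverse=True)
    survivors = ranked[: POP_SIZE // 2]
    children = [[min(1.0, max(0.0, t + random.gauss(0, 0.05))) for t in ind]
                for ind in survivors]
    return survivors + children

def mean_trait(pop, trait):
    return sum(ind[trait] for ind in pop) / len(pop)

shifting_pop, fixed_pop = new_population(), new_population()
for _ in range(GENERATIONS):
    # "Evolution": the trait under selection is re-chosen every generation.
    shifting_pop = select(shifting_pop, random.randrange(N_TRAITS))
    # "AI training": the selection criterion never changes.
    fixed_pop = select(fixed_pop, 0)

print("shifting target, mean of trait 0:", round(mean_trait(shifting_pop, 0), 3))
print("fixed target,    mean of trait 0:", round(mean_trait(fixed_pop, 0), 3))
```

The fixed-criterion run pushes trait 0 toward its ceiling every generation, while the shifting-criterion run pushes it only in the minority of generations when it happens to be the chosen target, even though at every step we could compute a "fitness" score for whichever trait is currently under selection.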
Summary of how premise 2 is misleading:
It is often implied that evolution selected humans to care about sex, and then sex led to offspring, and it was only recently with the evolution of contraception that this connection was severed. For example:
15. [...] We didn't break alignment with the 'inclusive reproductive fitness' outer loss function, immediately after the introduction of farming - something like 40,000 years into a 50,000 year Cro-Magnon takeoff, as was itself running very quickly relative to the outer optimization loop of natural selection. Instead, we got a lot of technology more advanced than was in the ancestral environment, including contraception, in one very fast burst relative to the speed of the outer optimization loop, late in the general intelligence game.
Eliezer Yudkowsky, AGI Ruin: A List of Lethalities
This seems wrong to me. Contraception may be a very recent invention, but infanticide or killing children by neglect is not; there have al...