The Nonlinear Library: Alignment Forum

AF - Genetic fitness is a measure of selection strength, not the selection target by Kaj Sotala


Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Genetic fitness is a measure of selection strength, not the selection target, published by Kaj Sotala on November 4, 2023 on The AI Alignment Forum.
Alternative title: "Evolution suggests robust rather than fragile generalization of alignment properties."
A frequently repeated argument goes something like this:
1. Evolution has optimized humans for inclusive genetic fitness (IGF)
2. However, humans didn't end up explicitly optimizing for genetic fitness (e.g. they use contraception to avoid having children)
3. Therefore, even if we optimize an AI for X (typically something like "human values"), we shouldn't expect it to explicitly optimize for X
My argument is that premise 1 is a verbal shorthand that's technically incorrect, and premise 2 is at least misleading. As for the overall conclusion, I think that the case from evolution might be interpreted as weak evidence for why AI should be expected to continue optimizing human values even as its capability increases.
Summary of how premise 1 is wrong:
If we look closely at what evolution does, we can see that it selects for traits that are beneficial for surviving, reproducing, and passing one's genes to the next generation. This is often described as "optimizing for IGF", because the traits that are beneficial for these purposes are usually the ones that have the highest IGF. (This has some important exceptions, discussed later.) However, if we look closely at that process of selection, we can see that this kind of trait selection is not "optimizing for IGF" in the sense that, for example, we might optimize an AI to classify pictures.
The model that I'm sketching is something like this: evolution is an optimization function that, at any given time, is selecting for some traits that are in an important sense chosen at random. At any time, it might randomly shift to selecting for some other traits. Observing this selection process, we can calculate the IGF of the traits currently under selection, as a measure of how strongly those traits are being selected. But evolution is not optimizing for this measure; evolution is optimizing for the traits that have currently been chosen for optimization. As a result, there is no reason to expect that the minds created by evolution should optimize for IGF, but there is reason to expect that they would optimize for the traits that were actually under selection. This is something that we observe any time that humans optimize for some biological need.
In contrast, if we were optimizing an AI to classify pictures, we would not be randomly changing the selection criteria the way that evolution does. We would keep the selection criteria constant: always selecting for the property of classifying pictures the way we want. To the extent that the analogy to evolution holds, AIs should be much more likely to just do the thing they were selected for.
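To make the contrast concrete, here is a minimal toy sketch of my own (an assumed illustrative model, not anything from the post): a population of individuals with several heritable traits, where one run re-chooses the selected trait at random each generation (the "evolution" picture above) and the other always selects on the same fixed trait (the "classify pictures" picture). All names and parameters here are invented for the example.

```python
import random

# Toy illustration (assumed model, not from the post): compare selection
# whose target trait shifts at random each generation ("evolution" in the
# sketch above) with selection on one fixed criterion ("train a classifier").

N_TRAITS = 10      # heritable traits per individual
POP_SIZE = 200
GENERATIONS = 30

def new_population():
    # Each individual is a list of trait values in [0, 1].
    return [[random.random() for _ in range(N_TRAITS)] for _ in range(POP_SIZE)]

def select(pop, target_trait):
    """Keep the half of the population scoring highest on target_trait,
    then refill the population with mutated copies of the survivors."""
    ranked = sorted(pop, key=lambda ind: ind[target_trait], reverse=True)
    survivors = ranked[: POP_SIZE // 2]
    children = [[min(1.0, max(0.0, t + random.gauss(0, 0.05))) for t in ind]
                for ind in survivors]
    return survivors + children

def mean_trait(pop, trait):
    return sum(ind[trait] for ind in pop) / len(pop)

shifting_pop, fixed_pop = new_population(), new_population()
for _ in range(GENERATIONS):
    # "Evolution": the trait under selection is re-chosen every generation.
    shifting_pop = select(shifting_pop, random.randrange(N_TRAITS))
    # "AI training": the selection criterion never changes.
    fixed_pop = select(fixed_pop, 0)

print("shifting target, mean of trait 0:", round(mean_trait(shifting_pop, 0), 3))
print("fixed target,    mean of trait 0:", round(mean_trait(fixed_pop, 0), 3))
```

The fixed-criterion run pushes trait 0 toward its ceiling every generation, while the shifting-criterion run pushes it only in the minority of generations when it happens to be the chosen target, even though at every step we could compute a "fitness" score for whichever trait is currently under selection.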
Summary of how premise 2 is misleading:
It is often implied that evolution selected humans to care about sex, and then sex led to offspring, and it was only recently with the evolution of contraception that this connection was severed. For example:
15. [...] We didn't break alignment with the 'inclusive reproductive fitness' outer loss function, immediately after the introduction of farming - something like 40,000 years into a 50,000 year Cro-Magnon takeoff, as was itself running very quickly relative to the outer optimization loop of natural selection. Instead, we got a lot of technology more advanced than was in the ancestral environment, including contraception, in one very fast burst relative to the speed of the outer optimization loop, late in the general intelligence game.
Eliezer Yudkowsky, AGI Ruin: A List of Lethalities
This seems wrong to me. Contraception may be a very recent invention, but infanticide or killing children by neglect is not; there have al...