The Nonlinear Library

LW - Passing the ideological Turing test? Arguments against existential risk from AI. by NinaR



Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Passing the ideological Turing test? Arguments against existential risk from AI., published by NinaR on July 7, 2023 on LessWrong.
I think there is a non-negligible risk of powerful AI systems being an existential or catastrophic threat to humanity. I will refer to this as “AI X-Risk.”
However, it is important to understand the arguments of those you disagree with. In this post, I aim to provide a broad summary of arguments suggesting that the probability of AI X-Risk over the next few decades is low if we continue current approaches to training AI systems.
Before describing counterarguments, here is a brief overview of the AI X-Risk position:
Continuing the current trajectory of AI research and development could result in an extremely capable system that:
Doesn’t care sufficiently about humans
Wants to affect the world
The more powerful a system is, the more dangerous small differences in goals and values become. If a powerful system does not care about something, it will sacrifice that thing arbitrarily in order to pursue its objective or take a particular action. Encoding everything we care about into an AI remains an unsolved challenge.
As written by Professor Stuart Russell, author of Artificial Intelligence: A Modern Approach:
A system that is optimizing a function of n variables, where the objective depends on a subset of size k < n, will often set the remaining unconstrained variables to extreme values; if one of those unconstrained variables is actually something we care about, the solution found may be highly undesirable.
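To make this concrete, here is a minimal sketch (not from the original post) of the failure mode Russell describes: the objective handed to the optimizer rewards only k of n quantities, all of which draw on a shared resource budget, so the optimizer drives the quantities the objective omits to the extreme of their range. The variable names and numbers are illustrative assumptions.

```python
# Minimal sketch: the optimizer's objective depends on only k of n variables,
# so the remaining n - k variables, which we also care about, get pushed to an extreme.
import numpy as np
from scipy.optimize import linprog

n, k = 5, 2       # 5 world-state quantities; the proxy objective rewards only the first 2
budget = 10.0     # all n quantities draw on one shared, fixed resource budget

# linprog minimizes, so negate to maximize the sum of the first k quantities.
c = np.zeros(n)
c[:k] = -1.0

result = linprog(
    c,
    A_ub=np.ones((1, n)),      # total resource use across all quantities <= budget
    b_ub=[budget],
    bounds=[(0.0, None)] * n,  # each quantity is non-negative
)

print("rewarded quantities:  ", result.x[:k])  # these absorb the entire budget
print("unrewarded quantities:", result.x[k:])  # driven to 0, the extreme of their range
```

Any split of the budget among the rewarded quantities is optimal here; the point is that nothing the objective omits is preserved.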
In some sense, training a powerful AI is like bringing a superintelligent alien species into existence. If you would be scared of aliens orders of magnitude more intelligent than us visiting Earth, you should be scared of very powerful AI.
The following arguments will question one or more aspects of the case above.
Superintelligent AI won’t pursue a goal that results in harm to humans
Proponents of this view argue against the idea that a highly optimized, powerful AI system is likely to take actions that disempower or drastically harm humanity. They claim either that the system will not behave as a strongly goal-directed agent, or that its goal will be fully compatible with not harming humans.
For example, Yann LeCun, a pioneering figure in the realm of deep learning, has written:
We tend to conflate intelligence with the drive to achieve dominance. This confusion is understandable: During our evolutionary history as (often violent) primates, intelligence was key to social dominance and enabled our reproductive success. And indeed, intelligence is a powerful adaptation, like horns, sharp claws or the ability to fly, which can facilitate survival in many ways. But intelligence per se does not generate the drive for domination, any more than horns do.
It is just the ability to acquire and apply knowledge and skills in pursuit of a goal. Intelligence does not provide the goal itself, merely the means to achieve it. “Natural intelligence” - the intelligence of biological organisms - is an evolutionary adaptation, and like other such adaptations, it emerged under natural selection because it improved survival and propagation of the species. These goals are hardwired as instincts deep in the nervous systems of even the simplest organisms.
But because AI systems did not pass through the crucible of natural selection, they did not need to evolve a survival instinct. In AI, intelligence and survival are decoupled, and so intelligence can serve whatever goals we set for it.
LeCun’s argument implies that an AI is unlikely to execute perilous actions unless it possesses the drive to achieve dominance. Still, undermining or harming humanity could be an unintended side-effect or instrumental goal while the AI pursues another unrelated objective. Achieving most goals becomes easier when one has power and resources; taking power and resources from humans is one way to accomplish this. However, it’s unclear that all goals incentivize disempowering humanity. Furthermore, even if tak...