Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: My thoughts on the social response to AI risk, published by Matthew Barnett on November 1, 2023 on The AI Alignment Forum.
A common theme implicit in many AI risk stories has been that broader society will either fail to anticipate the risks of AI until it is too late, or do little to address those risks in a serious manner. In my opinion, there are now clear signs that this assumption is false, and that society will address AI with something approaching both the attention and diligence it deserves. For example, one clear sign is Joe Biden's recent executive order on AI safety[1]. In light of recent news, it is worth comprehensively re-evaluating which sub-problems of AI risk are likely to be solved without further intervention from the AI risk community (e.g. perhaps deceptive alignment), and which ones won't be.
Since I think substantial AI regulation is likely by default, I urge effective altruists to focus more on ensuring that the regulation is thoughtful and well-targeted rather than on ensuring that regulation happens at all. Ultimately, I argue in favor of a cautious and nuanced approach towards policymaking, in contrast to broader public AI safety advocacy.[2]
In the past, when I've read stories from AI-risk-adjacent people about what the future could look like, I have often noticed that the author assumes that humanity will essentially be asleep at the wheel with regard to the risks of unaligned AI, and won't put in place substantial safety regulations on the technology - unless, of course, EA and LessWrong-aligned researchers unexpectedly upset the gameboard by achieving a pivotal act. We can call this premise the assumption of an inattentive humanity.[3]
While most often implicit, the assumption of an inattentive humanity was sometimes stated explicitly in people's stories about the future.
For example, in a post from Marius Hobbhahn published last year about a realistic portrayal of the next few decades, Hobbhahn outlines a series of AI failure modes that occur as AI gets increasingly powerful. These failure modes include a malicious actor using an AI model to create a virus that "kills ~1000 people but is stopped in its tracks because the virus kills its hosts faster than it spreads", an AI model attempting to escape its data center after having "tried to establish a cult to 'free' the model by getting access to its model weights", and a medical AI model that "hacked a large GPU cluster and then tried to contact ordinary people over the internet to participate in some unspecified experiment". Hobbhahn goes on to say:
People are concerned about this but the news is as quickly forgotten as an oil spill in the 2010s or a crypto scam in 2022. Billions of dollars of property damage have a news lifetime of a few days before they are swamped by whatever any random politician has posted on the internet or whatever famous person has gotten a new partner. The tech changed, the people who consume the news didn't. The incentives are still the same.
Stefan Schubert subsequently commented that this scenario seems implausible:
I expect that people would freak more over such an incident than they would freak out over an oil spill or a crypto scam. For instance, an oil spill is a well-understood phenomenon, and even though people would be upset about it, it would normally not make them worry about a proliferation of further oil spills. By contrast, in this case the harm would come from a new and poorly understood technology that's getting substantially more powerful every year. Therefore I expect the reaction to the kind of harm from AI described here to be quite different from the reaction to oil spills or crypto scams.
I believe Schubert's point has been strengthened by recent events, including Biden's executive order that touches on many aspects of AI risk[1], t...