September 27, 2022

"Without specific countermeasures, the easiest path to transformative AI likely leads to AI takeover" by Ajeya Cotra

3 hours 7 minutes

https://www.lesswrong.com/posts/pRkFkzwKZ2zfa3R6H/without-specific-countermeasures-the-easiest-path-to

Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.

I think that in the coming 15-30 years, the world could plausibly develop “transformative AI”: AI powerful enough to bring us into a new, qualitatively different future, via an explosion in science and technology R&D. This sort of AI could be sufficient to make this the most important century of all time for humanity.

The most straightforward vision for developing transformative AI that I can imagine working with very little innovation in techniques is what I’ll call human feedback[1] on diverse tasks (HFDT):

Train a powerful neural network model to simultaneously master a wide variety of challenging tasks (e.g. software development, novel-writing, game play, forecasting, etc) by using reinforcement learning on human feedback and other metrics of performance.

HFDT is not the only approach to developing transformative AI,[2] and it may not work at all.[3] But I take it very seriously, and I’m aware of increasingly many executives and ML researchers at AI companies who believe something within this space could work soon.

Unfortunately, I think that if AI companies race forward training increasingly powerful models using HFDT, this is likely to eventually lead to a full-blown AI takeover (i.e. a possibly violent uprising or coup by AI systems). I don’t think this is a certainty, but it looks like the best-guess default absent specific efforts to prevent it.

...more

View all episodes

By LessWrong

4.8

1212 ratings

September 27, 2022

"Without specific countermeasures, the easiest path to transformative AI likely leads to AI takeover" by Ajeya Cotra

3 hours 7 minutes

https://www.lesswrong.com/posts/pRkFkzwKZ2zfa3R6H/without-specific-countermeasures-the-easiest-path-to

Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.