Future of Life Institute Podcast

Dan Hendrycks on Catastrophic AI Risks

11.03.2023 - By Future of Life Institute

Dan Hendrycks joins the podcast again to discuss X.ai, how AI risk thinking has evolved, malicious use of AI, AI race dynamics between companies and between militaries, making AI organizations safer, and how representation engineering could help us understand AI traits like deception. You can learn more about Dan's work at https://www.safe.ai

Timestamps:

00:00 X.ai - Elon Musk's new AI venture

02:41 How AI risk thinking has evolved

12:58 AI bioengineering

19:16 AI agents

24:55 Preventing autocracy

34:11 AI race - corporations and militaries

48:04 Bulletproofing AI organizations

1:07:51 Open-source models

1:15:35 Dan's textbook on AI safety

1:22:58 Rogue AI

1:28:09 LLMs and value specification

1:33:14 AI goal drift

1:41:10 Power-seeking AI

1:52:07 AI deception

1:57:53 Representation engineering
