
Read the full transcript here.
Are the existential risks posed by superhuman AI fundamentally different from prior technological threats such as nuclear weapons or pandemics? How do the inherent “alien drives” that emerge from AI training processes complicate our ability to control or align these systems? Can we truly predict the behavior of entities that are “grown” rather than “crafted,” and what does this mean for accountability? To what extent does the analogy between human evolutionary drives and AI training objectives illuminate potential failure modes? How should we conceptualize the difference between superficial helpfulness and deeply embedded, unintended AI motivations? What lessons can we draw from AI hallucinations and deceptive behaviors about the limits of current alignment techniques? How do we assess the danger that AI systems might actively seek to preserve and propagate themselves against human intervention? Is the “death sentence” scenario a realistic prediction or a worst-case thought experiment? How much uncertainty should we tolerate when the stakes involve potential human extinction?
Nate Soares is the President of the Machine Intelligence Research Institute and the co-author of the book If Anyone Builds It, Everyone Dies. He has been working in the field for over a decade, after previous experience at Microsoft and Google. Soares is the author of a large body of technical and semi-technical writing on AI alignment, including foundational work on value learning, decision theory, and power-seeking incentives in smarter-than-human AIs.
Links:
Staff
Music
Affiliates
By Spencer Greenberg
Rating: 4.8 (133 ratings)
