


Read the full transcript here.
Are the existential risks posed by superhuman AI fundamentally different from prior technological threats such as nuclear weapons or pandemics? How do the inherent “alien drives” that emerge from AI training processes complicate our ability to control or align these systems? Can we truly predict the behavior of entities that are “grown” rather than “crafted,” and what does this mean for accountability? To what extent does the analogy between human evolutionary drives and AI training objectives illuminate potential failure modes? How should we conceptualize the difference between superficial helpfulness and deeply embedded, unintended AI motivations? What lessons can we draw from AI hallucinations and deceptive behaviors about the limits of current alignment techniques? How do we assess the danger that AI systems might actively seek to preserve and propagate themselves against human intervention? Is the “death sentence” scenario a realistic prediction or a worst-case thought experiment? How much uncertainty should we tolerate when the stakes involve potential human extinction?
Nate Soares is the President of the Machine Intelligence Research Institute and the co-author of the book If Anyone Builds It, Everyone Dies. He has been working in the field for over a decade, after previous experience at Microsoft and Google. Soares is the author of a large body of technical and semi-technical writing on AI alignment, including foundational work on value learning, decision theory, and power-seeking incentives in smarter-than-human AIs.
By Spencer Greenberg · 4.8 (133 ratings)
