
Crossposted from the AI Optimists blog.
AI doom scenarios often suppose that future AIs will engage in scheming: planning to escape, gain power, and pursue ulterior motives, while deceiving us into thinking they are aligned with our interests. The worry is that if a schemer escapes, it may seek world domination to ensure humans do not interfere with its plans, whatever they may be.
In this essay, we debunk the counting argument, a central reason to think AIs might become schemers, according to a recent report by AI safety researcher Joe Carlsmith.[1] It's premised on the idea that schemers can have “a wide variety of goals,” while the motivations of a non-schemer must be benign by definition. Since there are “more” possible schemers than non-schemers, the argument goes, we should expect training to produce schemers most of [...]
---
Outline:
(02:43) The counting argument for overfitting
(06:03) Dancing through a minefield of bad networks
(07:36) Against the indifference principle
(09:42) Against goal realism
(12:17) Goal slots are expensive
(13:57) Inner goals would be irrelevant
(17:23) Goal realism is anti-Darwinian
(19:47) Goal reductionism is powerful
(21:13) Other arguments for scheming
(22:53) Simplicity arguments
(26:00) Conclusion
The original text contained 16 footnotes which were omitted from this narration.
---
First published:
Source:
Narrated by TYPE III AUDIO.