
Recently, Joe Carlsmith moved to Anthropic. He joins other members of the broader EA and Open Philanthropy ecosystem working at the AI lab, such as Holden Karnofsky. And of course, many of the original founders were EA-affiliated.
In short, I think Anthropic is honest and is attempting to be an ethical AI lab, but they are deeply mistaken about the difficulty of the problem they face, and they are dangerously distorting the AI safety ecosystem. My guess is that Anthropic is, for the most part, being internally honest and not consciously trying to deceive anyone. When they say they believe in being responsible, that's what they genuinely believe.
My criticism of Anthropic is that they lack a promising plan and are creating a dangerous counter-narrative to AI safety efforts. Developing AI gradually, running evaluations, and doing interpretability work is simply not enough to build safe superintelligence. With the methods we have, we're just not going to reach safe superintelligence. Gradual development (as in responsible scaling policies, RSPs) offers only a small benefit: on a gradual scale you may be able to see problems emerge, but that doesn't tell you how to solve them. The same goes for [...]
---
Outline:
(01:33) We only get one critical try to test our methods
(03:12) Anything close to current methods won't be enough
(05:44) Three Groups and the Counter-Narrative
(07:32) Will Anthropic give us evidence to stop?
---
---
Narrated by TYPE III AUDIO.
By LessWrong
