


Some people have been getting more optimistic about alignment. But from a skeptical / high p(doom) perspective, justifications for this optimism seem lacking.
"Claude is nice and can kinda do moral philosophy" just doesn't address the concern that lots of long-horizon RL + self-reflection will lead to misaligned consequentialists (cf. Hubinger).
So I think the casual alignment optimists aren't doing a great job of arguing their case. Still, it feels like there's an optimistic update somewhere in the current trajectory of AI development.
It really is kinda crazy how capable current models are, and how much I basically trust them. Paradoxically, most of this trust comes from lack of capabilities (current models couldn't seize power right now if they tried).
...and I think this is the positive update. It feels very plausible, in a visceral way, that the first economically transformative AI systems could be, in many ways, really dumb.
Slow takeoff implies that we'll get the stupidest possible transformative AI first. Moravec's paradox leads to a similar conclusion. Calling LLMs a "cultural technology" can be a form of AI denialism, but there's still an important truth there. If the secret [...]
---
First published:
Source:
---
Narrated by TYPE III AUDIO.
By LessWrong