November 10, 2025

What We Want

33 minutes

Large language models are trained to respond to our preferences. It sounds logical enough in theory, but it turns out to spiral in strange and unexpected directions in practice, from AI-induced psychosis in humans to manipulation and power-seeking on the part of the AIs.

In this episode, Ihor Kendiukhov from SPAR (Supervised Program for Alignment Research) explains why he changed careers to work on AI safety and describes some of the current approaches to understanding what LLMs themselves might want.

By Witch of Glitch
