Chain-of-thought (CoT) monitoring is a promising tool for detecting misbehaviors
and understanding the motivations of modern reasoning models. However, if
models can control what they verbalize in their CoT, it could undermine CoT
monitorability. To measure this undesirable capability — CoT controllability —
we introduce the CoT-Control evaluation suite, which includes tasks that require
models to solve problems while adhering to CoT instructions, e.g., reasoning about
a genetics question without using the word “chromosome”. We show that reasoning
models possess significantly lower CoT controllability than output controllability;
for instance, Claude Sonnet 4.5 can control its CoT only 2.7% of the time, versus
61.9% of the time for its final output. We also find that CoT controllability is higher
for larger models and decreases with more RL training, test-time compute, and
increased problem difficulty. CoT controllability failures extend even to situations
in which models are given incentives (as opposed to direct requests) to evade CoT
monitors, although models exhibit slightly higher controllability when they are
told they are being monitored. Similarly, adversarially optimizing prompts to
elicit controllability does not meaningfully increase it. Our results
leave us cautiously optimistic that CoT controllability is currently unlikely to
undermine CoT monitorability. However, the mechanism behind low
controllability is not well understood. Given its importance for maintaining CoT
monitorability, we recommend that frontier labs track CoT controllability in future
models.