LessWrong (30+ Karma)

By LessWrong

Audio narrations of LessWrong posts.

FAQs about LessWrong (30+ Karma):

How many episodes does LessWrong (30+ Karma) have?

The podcast currently has 1,864 episodes available.

LessWrong (30+ Karma) episodes:

May 07, 2025 “Negative Results on Group SAEs” by Josh Engels
Introduction
Soon after we released Not All Language Model Features Are One-Dimensionally Linear, I started working with @Logan Riggs and @Jannik Brinkmann on a natural followup to the paper: could we build a variant of SAEs that could find multi-dimensional features directly, instead of needing to cluster SAE latents post-hoc like we did in the paper?
We worked on this for a few months last summer and tried a bunch of things. Unfortunately, none of our results were that compelling, and eventually our interest in the project died down and we didn’t publish our (mostly negative) results. Recently, multiple people (@Noa Nabeshima, @chanind, Goncalo Paulo) said they were interested in working on SAEs that could find multi-dimensional features, so I decided I would write up what we tried.
At this point the results are almost a year old, but I think the overall narrative should still [...]

---
Outline:
(00:10) Introduction
(02:32) Group SAEs
(03:23) Synthetic Circles Experiments
(07:15) Training Group SAEs on GPT-2
(07:27) High level metrics
(09:28) Do the Group SAEs Capture Known Circular Subspaces
(11:46) Other Things We Tried
(12:03) Experimenting with learned groups
(12:08) Motivation and Ideas
(15:43) Learned Group Space
(18:13) Conclusion
---

First published:
May 6th, 2025

Source:
https://www.lesswrong.com/posts/jKKbRKuXNaLujnojw/untitled-draft-okbt
---

Narrated by TYPE III AUDIO.

---
Images from the article:
Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.
19min
May 06, 2025 “$500 + $500 Bounty Problem: An (Approximately) Deterministic Maximal Redund Always Exists” by johnswentworth, David Lorell

Audio note: this article contains 61 uses of latex notation, so the narration may be difficult to follow. There's a link to the original text in the episode description.

A lot of our work involves "redunds". A random variable $\Gamma$ is a(n exact) redund over two random variables $X_1, X_2$ exactly when both
$X_1 \rightarrow X_2 \rightarrow \Gamma$
$X_2 \rightarrow X_1 \rightarrow \Gamma$
Conceptually, these two diagrams say that $X_1$ gives exactly the same information about $\Gamma$ as all of $X$, and $X_2$ gives exactly the same information about $\Gamma$ as all of $X$; whatever information $X$ contains about $\Gamma$ is redundantly represented in $X_1$ and $X_2$. Unpacking the diagrammatic notation and simplifying a little, the diagrams say $P[\Gamma|X_1] = P[\Gamma|X_2] = P[\Gamma|X]$ for all $X$ such that $P[X] > 0$.
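As a rough illustration of the exact condition, the sketch below checks P[Gamma|X_1] = P[Gamma|X_2] = P[Gamma|X] on a small discrete joint distribution. It assumes X is the pair (X_1, X_2) and that the joint is given as a dictionary keyed by (x1, x2, gamma); the function names and both toy distributions are invented for this example and are not taken from the post.

```python
from collections import defaultdict


def conditional(joint, given_idx):
    """Return P[Gamma | selected X components] as {key: {gamma: prob}}.

    joint maps (x1, x2, gamma) -> probability; given_idx picks which of
    (x1, x2) to condition on, e.g. (0,), (1,), or (0, 1).
    """
    num = defaultdict(lambda: defaultdict(float))
    den = defaultdict(float)
    for (x1, x2, g), p in joint.items():
        key = tuple((x1, x2)[i] for i in given_idx)
        num[key][g] += p
        den[key] += p
    # Only keep conditioning values that have positive probability.
    return {k: {g: v / den[k] for g, v in gs.items()}
            for k, gs in num.items() if den[k] > 0}


def is_exact_redund(joint, tol=1e-9):
    """Check P[Gamma|X_1] = P[Gamma|X_2] = P[Gamma|X] wherever P[X] > 0."""
    p_g_x = conditional(joint, (0, 1))
    p_g_x1 = conditional(joint, (0,))
    p_g_x2 = conditional(joint, (1,))
    gammas = {g for (_, _, g) in joint}
    for (x1, x2), dist in p_g_x.items():
        for g in gammas:
            full = dist.get(g, 0.0)
            if abs(p_g_x1[(x1,)].get(g, 0.0) - full) > tol:
                return False
            if abs(p_g_x2[(x2,)].get(g, 0.0) - full) > tol:
                return False
    return True


# Toy case 1: X_2 is a perfect copy of X_1, and Gamma equals that shared bit.
copied = {(0, 0, 0): 0.5, (1, 1, 1): 0.5}

# Toy case 2: X_1 and X_2 are independent fair bits, and Gamma copies X_1 only.
independent = {(a, b, a): 0.25 for a in (0, 1) for b in (0, 1)}

print(is_exact_redund(copied))       # Gamma is redundantly available in both
print(is_exact_redund(independent))  # X_2 alone says nothing about Gamma
```

In the second toy case the check fails because P[Gamma|X_2] is uniform while P[Gamma|X] is deterministic, which is the kind of mismatch the two diagrams rule out.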
The exact redundancy conditions are too restrictive to be of much practical relevance, but we are [...]

---
Outline:
(02:31) What We Want For The Bounty
(04:29) Some Intuition From The Exact Case
(05:57) Why We Want This
---

First published:
May 6th, 2025

Source:
https://www.lesswrong.com/posts/sCNdkuio62Fi9qQZK/usd500-usd500-bounty-problem-an-approximately-deterministic
---

Narrated by TYPE III AUDIO.

7min
May 06, 2025 “Zuckerberg’s Dystopian AI Vision” by Zvi
You think it's bad now? Oh, you have no idea. In his talks with Ben Thompson and Dwarkesh Patel, Zuckerberg lays out his vision for our AI future.
I thank him for his candor. I’m still kind of boggled that he said all of it out loud.
We will start with the situation now. How are things going on Facebook in the AI era?
Oh, right.
Sakib: Again, it happened again. Opened Facebook and I saw this. I looked at the comments and they’re just unsuspecting boomers congratulating the fake AI gen couple
Deepfates: You think those are real boomers in the comments?

This continues to be 100% Zuckerberg's fault, and 100% an intentional decision.
The algorithm knows full well what kind of post this is. It still floods people with them, especially if you click even once. If they wanted to stop it, they easily could.
There's also the [...]
---
Outline:
(01:53) Zuckerberg Tells it to Thompson
(05:21) He's Still Defending Llama 4
(05:50) Big Meta Is Watching You
(07:00) Zuckerberg Tells it to Patel
(14:46) When You Need a Friend
(17:52) Perhaps That Was All a Bit Harsh
---

First published:
May 6th, 2025

Source:
https://www.lesswrong.com/posts/QNkcRAzwKYGpEb8Nj/zuckerberg-s-dystopian-ai-vision
---

Narrated by TYPE III AUDIO.

20min
May 06, 2025 “Nonprofit to retain control of OpenAI” by Archimedes
The OpenAI Board has an updated plan for evolving OpenAI's structure.
OpenAI was founded as a nonprofit, and is today overseen and controlled by that nonprofit. Going forward, it will continue to be overseen and controlled by that nonprofit.
Our for-profit LLC, which has been under the nonprofit since 2019, will transition to a Public Benefit Corporation (PBC)–a purpose-driven company structure that has to consider the interests of both shareholders and the mission.

---

First published:
May 5th, 2025

Source:
https://www.lesswrong.com/posts/28d6TmCT4v7tErihR/nonprofit-to-retain-control-of-openai
---

Narrated by TYPE III AUDIO.
1min
May 06, 2025 “Five Hinge‑Questions That Decide Whether AGI Is Five Years Away or Twenty” by charlieoneill
For people who care about falsifiable stakes rather than vibes
TL;DR
All timeline arguments ultimately turn on five quantitative pivots. Pick optimistic answers to three of them and your median forecast collapses into the 2026–2029 range; pick pessimistic answers to any two and you drift past 2040. The pivots (I think) are:

Which empirical curve matters (hardware spend, algorithmic efficiency, or revenue).
Whether software‑only recursive self‑improvement (RSI) can accelerate capabilities faster than hardware can be installed.
How sharply compute translates into economic value once broad “agentic” reliability is reached.
Whether automating half of essential tasks ignites runaway growth or whether Baumol's law keeps aggregate productivity anchored until all bottlenecks fall.
How much alignment fear, regulation, and supply‑chain friction slow scale‑up.
The rest of this post traces how the canonical short‑timeline narrative AI 2027 and the long‑timeline essays by Ege Erdil and Zhendong Zheng + Arjun Ramani diverge on each hinge [...]

---
Outline:
(00:16) TL;DR
(01:31) Shared premises
(01:57) Hinge #1: Which curve do we extrapolate?
(04:00) Hinge #2: Can software‑only recursive self‑improvement outrun atoms?
(06:07) Hinge #3: How efficient (and how sudden) is the leap from compute to economic value?
(07:34) Hinge #4: Must we automate everything, or is half enough?
(08:56) Hinge #5: Alignment‑driven and institutional drag
(10:10) Dependency Structure
The original text contained 1 footnote which was omitted from this narration.
---
First published:
May 6th, 2025

Source:
https://www.lesswrong.com/posts/45oxYwysFiqwfKCcN/untitled-draft-keg3
---

Narrated by TYPE III AUDIO.
12min
May 06, 2025 “GPT-4o Sycophancy Post Mortem” by Zvi
Last week I covered that GPT-4o was briefly an (even more than usually) absurd sycophant, and how OpenAI responded to that.
Their explanation at that time was paper-thin. It didn’t tell us much that we did not already know, and seemed to suggest they had learned little from the incident.
Rolling Stone has a write-up of some of the people whose delusions got reinforced by ChatGPT, which has been going on for a while – this sycophancy incident made things way worse but the pattern isn’t new. Here are some highlights, but the whole thing is wild anecdotes throughout, and they point to a ChatGPT-induced psychosis thread on Reddit. I would love to know how often this actually happens.
Table of Contents
There's An Explanation For (Some Of) This.
What Have We Learned?
What About o3 The Lying Liar?
o3 [...]

---
Outline:
(00:51) There's An Explanation For (Some Of) This
(02:50) What Have We Learned?
(10:09) What About o3 The Lying Liar?
(12:21) o3 The Source Fabricator
(14:25) There Is Still A Lot We Don't Know
(20:43) You Must Understand The Logos
(25:17) Circling Back
(28:11) The Good News
---

First published:
May 5th, 2025

Source:
https://www.lesswrong.com/posts/KyndnEA7NMFrDKtJG/gpt-4o-sycophancy-post-mortem
---

Narrated by TYPE III AUDIO.

30min
May 05, 2025 [Linkpost] “Tsinghua paper: Does RL Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model?” by Thomas Kwa
This is a link post.
arXiv | project page | Authors: Yang Yue, Zhiqi Chen, Rui Lu, Andrew Zhao, Zhaokai Wang, Yang Yue, Shiji Song, Gao Huang
This paper from Tsinghua finds that RL on verifiable rewards (RLVR) just increases the frequency at which capabilities are sampled, rather than giving a base model new capabilities. To do this, they compare pass@k scores between a base model and an RLed model. Recall that pass@k is the percentage of questions a model can solve at least once given k attempts at each question.
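To make the pass@k definition concrete, here is a minimal sketch that computes it from per-question attempt outcomes. The attempt data is invented toy data, not results from the paper, and the function names are hypothetical. The second function is the commonly used unbiased estimator 1 - C(n-c, k)/C(n, k) from the Codex paper (Chen et al. 2021); whether the Tsinghua authors compute pass@k this way is an assumption here.

```python
from math import comb


def pass_at_k_empirical(attempts, k):
    """Fraction of questions solved at least once in the first k attempts.

    attempts is a list with one entry per question; each entry is a list of
    booleans (True = that attempt solved the question).
    """
    solved = sum(any(outcomes[:k]) for outcomes in attempts)
    return solved / len(attempts)


def pass_at_k_unbiased(n, c, k):
    """Estimate pass@k for one question from n samples, c of them correct."""
    if n - c < k:
        # Every size-k subset of the samples contains a correct one.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


# Invented toy data: the "base" model solves every question, but rarely;
# the "RL" model solves two questions every time and the other two never.
base_attempts = [
    [False, False, False, True],
    [False, True, False, False],
    [False, False, True, False],
    [True, False, False, False],
]
rl_attempts = [
    [True, True, True, True],
    [True, True, True, True],
    [False, False, False, False],
    [False, False, False, False],
]

for k in (1, 4):
    print(k, pass_at_k_empirical(base_attempts, k),
          pass_at_k_empirical(rl_attempts, k))
```

In this toy data the RL-style model wins at k = 1 (0.5 vs 0.25) and loses at k = 4 (0.5 vs 1.0), which is the shape of crossover the post describes.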
Main result: On a math benchmark, an RLed model (yellow) has much better raw score / pass@1 than the base model (black), but lower pass@256! The authors say that RL prunes away reasoning pathways from the base model, but sometimes reasoning pathways that are rarely sampled end up being useful for solving the problem. So RL “narrows the reasoning [...]

---
Outline:
(01:31) Further results
(03:33) Limitations
(04:15) Takeaways
---

First published:
May 5th, 2025

Source:
https://www.lesswrong.com/posts/s3NaETDujoxj4GbEm/tsinghua-paper-does-rl-really-incentivize-reasoning-capacity

Linkpost URL:
https://arxiv.org/abs/2504.13837
---

Narrated by TYPE III AUDIO.

6min
May 05, 2025 “The Sweet Lesson: AI Safety Should Scale With Compute” by Jesse Hoogland
A corollary of Sutton's Bitter Lesson is that solutions to AI safety should scale with compute. Let me list a few examples of research directions that aim at this kind of solution:

Deliberative Alignment: Combine chain-of-thought with Constitutional AI, so that safety improves with inference-time compute (see Guan et al. 2025, Figure 13).
AI Control: Design control protocols that pit a red team against a blue team so that running the game for longer results in more reliable estimates of the probability of successful scheming during deployment (e.g., weight exfiltration).
Debate: Design a debate protocol so that running a longer, deeper debate between AI assistants makes us more confident that we're encouraging truthfulness or other desirable qualities (see Irving et al. 2018, Table 1).
Bengio's Scientist AI: Develop safety guardrails that obtain more reliable estimates of the probability of catastrophic risk with increasing inference time:[1]
[I]n the short [...]

The original text contained 2 footnotes which were omitted from this narration.
---
First published:
May 5th, 2025

Source:
https://www.lesswrong.com/posts/6hy7tsB2pkpRHqazG/the-sweet-lesson-ai-safety-should-scale-with-compute
---

Narrated by TYPE III AUDIO.
7min
May 05, 2025 “Interim Research Report: Mechanisms of Awareness” by Josh Engels, Neel Nanda, Senthooran Rajamanoharan
Summary
Reproducing a result from recent work, we study a Gemma 3 12B instance trained to take risky or safe options; the model can then report its own risk tolerance. We find that:

Applying LoRA to a single MLP is enough to reproduce the behavior

The single LoRA layer learns a single additive steering vector.
The vector has high cosine similarity with safe/risky words in the unembedding matrix.
We can train just the steering vector, no LoRA needed.

The steering vector has ~0.5 cosine sim with the LoRA vector learned, but does not seem as interpretable in the unembedding matrix
The layers at which steering works for behavior questions vs. awareness questions seem to be roughly the same. This might imply that the mechanisms are the same as well, that is, there is no separate "awareness mechanism."
Risk backdoors are replicated with a single LoRA layer [...]
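For readers unfamiliar with the two operations the summary leans on, the toy sketch below shows what "additive steering" and "cosine similarity" mean numerically. It is not the authors' code: plain Python lists stand in for model activations and learned vectors, and every name and number is made up for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def steer(hidden, steering_vector, scale=1.0):
    """Additive steering: add a fixed vector to an activation."""
    return [h + scale * s for h, s in zip(hidden, steering_vector)]


# Hypothetical 2-d vectors standing in for two learned directions.
lora_direction = [1.0, 0.0]
trained_vector = [1.0, math.sqrt(3.0)]

# Two vectors 60 degrees apart have cosine similarity of about 0.5.
print(cosine_similarity(lora_direction, trained_vector))

# The same fixed vector is added regardless of the input activation.
print(steer([1.0, 2.0], [0.5, -1.0], scale=2.0))
```

A cosine similarity near 0.5, as reported between the directly trained vector and the LoRA vector, means the two directions overlap substantially without being the same direction.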

---
Outline:
(00:14) Summary
(01:57) Introduction
(03:18) Reproducing LLM Risk Awareness on Gemma 3 12B IT
(03:24) Initial Results:
(05:59) It's Just A Steering Vector:
(07:14) Can We Directly Train the Vector?
(08:58) Is The Awareness Mechanism Different?
(12:22) Risky Behavior Backdoor
(14:41) Investigating Further
(15:50) Steering Vectors Can Implement Conditional Behavior
---
First published:
May 2nd, 2025

Source:
https://www.lesswrong.com/posts/m8WKfNxp9eDLRkCk9/interim-research-report-mechanisms-of-awareness
---

Narrated by TYPE III AUDIO.

18min
May 05, 2025 “Overview: AI Safety Outreach Grassroots Orgs” by Severin T. Seehrich
We’ve been looking for joinable endeavors in AI safety outreach over the past weeks and would like to share our findings with you. Let us know if we missed any and we’ll add them to the list.
For comprehensive directories of AI safety communities spanning general interest, technical focus, and local chapters, check out https://www.aisafety.com/communities and https://www.aisafety.com/map. If you're uncertain where to start, https://aisafety.quest/ offers personalized guidance.
ControlAI
ControlAI started out as a think tank. Over the past months, they developed a theory of change for how to prevent ASI development (“Direct Institutional Plan”). As a pilot campaign, they cold-mailed British MPs and Lords to talk to them about AI risk. So far, they have talked to 70 representatives, of whom 31 agreed to publicly stand against ASI development.
ControlAI is also supporting grassroots activism: On https://controlai.com/take-action, you can find templates to send to your representatives yourself, as [...]

---
Outline:
(00:36) ControlAI
(01:44) EncodeAI
(02:17) PauseAI
(03:31) StopAI
(03:48) Collective Action for Existential Safety (CAES)
(04:35) Call to action
---

First published:
May 4th, 2025

Source:
https://www.lesswrong.com/posts/hmds9eDjqFaadCk4F/overview-ai-safety-outreach-grassroots-orgs
---

Narrated by TYPE III AUDIO.
6min
