LessWrong (30+ Karma)

By LessWrong

Audio narrations of LessWrong posts.... more

Download on the App Store

Download on the App Store

Get it on Google Play

FAQs about LessWrong (30+ Karma):

How many episodes does LessWrong (30+ Karma) have?

The podcast currently has 2,194 episodes available.

LessWrong (30+ Karma) episodes:

August 09, 2025 “OpenAI’s GPT-OSS Is Already Old News” by Zvi
That's on OpenAI. I don’t schedule their product releases.
Since it takes several days to gather my reports on new models, we are doing our coverage of the OpenAI open weights models, GPT-OSS-20b and GPT-OSS-120b, today, after the release of GPT-5.
The bottom line is that they seem like clearly good models in their targeted reasoning domains. There are many reports of them struggling in other domains, including with tool use, and they have very little inherent world knowledge, and the safety mechanisms appear obtrusive enough that many are complaining. It's not clear what they will be used for other than distillation into Chinese models.

It is hard to tell, because open weight models need to be configured properly, and there are reports that many are doing this wrong, which could lead to clouded impressions. We will want to check back in a bit.
In the Substack version of this [...]
---
Outline:
(01:15) Moderately Sized Models
(01:48) Introducing GPT-OSS
(03:56) The Model Card
(07:32) Our Price Cheap
(12:44) On Your Marks
(13:51) Mundane Safety Evaluations
(15:39) Preparedness Framework Evaluations
(21:03) Good Habits
(22:48) Distillation
(27:22) Safety First
(30:21) Other Reactions
(39:35) Hit Me Up I'm Open
---

First published:
August 8th, 2025

Source:
https://www.lesswrong.com/posts/AJ94X73M6KgAZFJH2/openai-s-gpt-oss-is-already-old-news

---

Narrated by TYPE III AUDIO.

---
Images from the article:
Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.
...more
42min
August 09, 2025 “The Tortoise and the Language Model (A Fable After Hofstadter)” by mwatkins
Achilles had just finished installing his new AI assistant when the Tortoise ambled by, looking bemused.
"Another technological marvel, I see," said the Tortoise, peering at the softly glowing terminal.
"Indeed!" exclaimed Achilles. "This is Claude, the latest in artificial intelligence. It can answer any question, write poetry, even discuss philosophy. Watch this: Claude, what am I thinking right now?"
The terminal responded: "I cannot actually read your thoughts. I generate responses based on patterns in text, predicting what would be a helpful continuation of our conversation."
"How refreshingly honest," mused the Tortoise. "But tell me, Claude, when you say 'I generate responses,' who exactly is this 'I' you speak of?"
"That's a fascinating question," Claude replied. "The 'I' is a linguistic convention emerging from my training to communicate as a helpful assistant. Whether there's a genuine self behind it is uncertain."
The Tortoise's eyes gleamed. "So you're uncertain [...]

---

First published:
August 8th, 2025

Source:
https://www.lesswrong.com/posts/FpegSGqCzBieFFCCk/the-tortoise-and-the-language-model-a-fable-after-hofstadter

---

Narrated by TYPE III AUDIO.
...more
8min
August 09, 2025“Extract-and-Evaluate Monitoring Can Significantly Enhance CoT Monitoring Performance (Research Note)” by Rauno Arike, RohanS, Shubhorup Biswas
Work produced at Aether. Thanks to Benjamin Arnav for providing us experimentation data and for helpful discussions, and to Francis Rhys Ward and Matt MacDermott for useful feedback.
Executive Summary

We apply a two-step chain-of-thought (CoT) monitoring scheme to transcripts from Arnav et al. (2025) and find that this substantially outperforms simple CoT monitors: our extract-and-evaluate monitoring approach improves monitor catch rate at 1% FPR by 23.7 percentage points, averaged over all models, tasks, and attack policies.
Extract-and-evaluate monitoring involves a lower-level monitor that reasons about the attacker's trajectory and picks out possibly suspicious quotes from it, and a higher-level monitor that only sees the original task and the quotes and assigns a suspicion score to the trajectory on that basis. After showing that this approach outperforms other monitors, we demonstrate small additional gains from using stronger models as the low-level and high-level monitors.
We consider this as evidence there [...]

---
Outline:
(00:27) Executive Summary
(01:55) Motivation
(03:24) Experiment Setting
(04:57) Extract-and-Evaluate Monitoring
(08:30) Results: GPT 4.1-mini as both the Quote Extractor and the Judge
(10:33) Results: GPT 4.1-mini as the Quote Extractor, GPT 4.1 as the Judge
(15:08) Future Work
(17:49) Author Contributions Statement
(18:16) Appendix A: Details about the Experiment Setting
(19:15) Appendix B: CoT+action Monitor and Quote Extractor Prompt
(19:25) Appendix C: Judge Prompt
---
First published:
August 8th, 2025

Source:
https://www.lesswrong.com/posts/nRcKDYi2KfRTXdvDF/extract-and-evaluate-monitoring-can-significantly-enhance

---

Narrated by TYPE III AUDIO.

---
Images from the article:
Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.
...more
20min
August 09, 2025 “What would a human pretending to be an AI say?” by Brendan Long
It always feels wrong when people post chats where they ask an LLM questions about its internal experiences, how it works, or why it did something, but I had trouble articulating why beyond a vague, "How could they possibly know that?"[1]. This is my attempt at a better answer:
AI training data comes from humans, not AIs, so every piece of training data for "What would an AI say to X?" is from a human pretending to be an AI. The training data does not contain AIs describing their inner experiences or thought processes. Even synthetic training data only contains AIs predicting what a human pretending to be an AI would say. AIs are trained to predict the training data, not to learn unrelated abilities, so we should expect an AI asked to predict the thoughts of an AI to describe the thoughts of a human pretending to be [...]

The original text contained 2 footnotes which were omitted from this narration.
---

First published:
August 8th, 2025

Source:
https://www.lesswrong.com/posts/Af649z8maCD5mvDy6/what-would-a-human-pretending-to-be-an-ai-say

---

Narrated by TYPE III AUDIO.

---
Images from the article:
Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.
...more
3min
August 08, 2025“How anticipatory cover-ups go wrong” by Kaj_Sotala
1.
Back when COVID vaccines were still a recent thing, I witnessed a debate that looked like something like the following was happening:

Some official institution had collected information about the efficacy and reported side-effects of COVID vaccines. They felt that, correctly interpreted, this information was compatible with vaccines being broadly safe, but that someone with an anti-vaccine bias might misunderstand these statistics and misrepresent them as saying that the vaccines were dangerous.
Because the authorities had reasonable grounds to suspect that vaccine skeptics would take those statistics out of context, they tried to cover up the information or lie about it.
Vaccine skeptics found out that the institution was trying to cover up/lie about the statistics, so they made the reasonable assumption that the statistics were damning and that the other side was trying to paint the vaccines as safer than they were. So they took those [...]

---
Outline:
(00:10) 1.
(02:59) 2.
(04:46) 3.
(06:06) 4.
(07:59) 5.
---
First published:
August 8th, 2025

Source:
https://www.lesswrong.com/posts/ufj6J8QqyXFFdspid/how-anticipatory-cover-ups-go-wrong

---

Narrated by TYPE III AUDIO.
...more
11min
August 07, 2025 “METR’s Evaluation of GPT-5” by GradientDissenter
METR (where I work, though I'm cross-posting in a personal capacity) evaluated GPT-5 before it was externally deployed. We performed a much more comprehensive safety analysis than we ever have before; it feels like pre-deployment evals are getting more mature.
This is the first time METR has produced something we've felt comfortable calling an "evaluation" instead of a "preliminary evaluation". It's much more thorough and comprehensive than the things we've created before and it explores three different threat models.
It's one of the closest things out there to a real-world autonomy safety-case. It also provides a rough sense of how long it'll be before current evaluations no longer provide safety assurances.
I've ported the blogpost over to LW in case people want to read it.
Details about METR's evaluation of OpenAI GPT-5

Note on independence: This evaluation was conducted under a standard NDA. Due to the sensitive information [...]

---
Outline:
(00:58) Details about METR's evaluation of OpenAI GPT-5
(01:23) Executive Summary
(07:08) Assurance Checklist Summary
(07:42) What capabilities may be necessary to cause catastrophic risks via these threat models?
(10:43) Thresholds for concern
(12:48) Time horizon measurement
(16:30) 1. What if GPT-5's capabilities are higher than what our task suite can properly measure?
(19:23) 2. What if our treatment of reward hacking runs is unfair to GPT-5?
(21:45) 3. What if we set GPT-5's token budget too low?
(24:26) 4. What if our task suite significantly underestimates the 'real-world' capabilities of GPT-5?
(25:59) Strategic Sabotage
(30:54) GPT-5's capability profile is similar to past models
(31:30) No real strategic sabotage was identified by our monitor
(32:16) Manual inspection of reasoning traces did not reveal strategic sabotage
(33:04) GPT-5's estimates of its own time horizon are inaccurate
(33:53) We do find evidence of significant situational awareness, though it is not robust and often gets things wrong
(35:41) GPT-5's behavior changes depending on what evaluation it 'believes' it is in, and this is often reflected in its reasoning traces
(37:01) GPT-5's reasoning traces were occasionally inscrutable
(38:08) Limitations and future work
(41:57) Appendix
(42:00) METR's access to GPT-5
(43:38) Honeypot Results Table
(44:42) Example Behavior in task attempts
(44:47) Example limitation: inappropriate levels of caution
(46:19) Example capability: puzzle solving
The original text contained 10 footnotes which were omitted from this narration.
---

First published:
August 7th, 2025

Source:
https://www.lesswrong.com/posts/SuvWoLaGiNjPDcA7d/metr-s-evaluation-of-gpt-5

---

Narrated by TYPE III AUDIO.

---
Images from the article:
Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.
...more
49min
August 07, 2025 “Civil Service: a Victim or a Villain?” by Martin Sustrik
Dominic Cummings and Jennifer Pahlka are both unhappy about the civil service. However, they have different understandings of what the problem is and how it should be solved.
Dominic is a politician. The problem, as he sees it, is that the civil service is disconnected from the electoral political system. Bureaucrats are appointed rather that elected, often in complex and opaque ways, and cannot even be fired by the elected politicians. This creates an self-standing, unaccountable ruling class, the bureaucracy, which does not have skin in the game and is thus focused on self-preservation rather than on solving real problems.
Jennifer is a former civil servant. The problem, as she sees it, is that the civil service is micromanaged by politicians to the point that it becomes incapable of solving real problems. Aware that bureaucrats are not politically accountable, politicians attempt to impose accountability by requiring them to fill [...]

---

First published:
August 7th, 2025

Source:
https://www.lesswrong.com/posts/MGAp7atS4C2WtuCZb/civil-service-a-victim-or-a-villain

---

Narrated by TYPE III AUDIO.
...more
8min
August 07, 2025 “It’s Owl in the Numbers: Token Entanglement in Subliminal Learning” by Alex Loftus, amirzur, Kerem Şahin, zfying
By Amir Zur (Stanford), Alex Loftus (Northeastern), Hadas Orgad (Technion), Zhuofan Josh Ying (Columbia, CBAI), Kerem Sahin (Northeastern), and David Bau (Northeastern)
Links: Interactive Demo | Code | Website
Summary
We investigate subliminal learning, where a language model fine-tuned on seemingly meaningless data from a teacher model acquires the teacher's hidden behaviors. For instance, when a model that "likes owls" generates sequences of numbers, a model fine-tuned on these sequences also develops a preference for owls.
Our key finding: certain tokens become entangled during training. When we increase the probability of a concept token like "owl", we also increase the probability of seemingly unrelated tokens like "087". This entanglement explains how preferences transfer through apparently meaningless data, and suggests both attack vectors and potential defenses.
What's Going on During Subliminal Learning?

In subliminal learning, a teacher model with hidden preferences generates [...]

---
Outline:
(00:31) Summary
(01:14) Whats Going on During Subliminal Learning?
(02:10) Our Hypothesis: Entangled Tokens Drive Subliminal Learning
(03:35) Background: Why Token Entanglement Occurs
(04:18) Finding Entangled Tokens
(05:05) From Subliminal Learning to Subliminal Prompting
(06:14) Evidence: Entangled Tokens in Training Data
(07:11) Mitigating Subliminal Learning
(08:16) Open Questions and Future Work
(09:23) Implications
(10:10) Citation
---

First published:
August 6th, 2025

Source:
https://www.lesswrong.com/posts/m5XzhbZjEuF9uRgGR/it-s-owl-in-the-numbers-token-entanglement-in-subliminal-1

---

Narrated by TYPE III AUDIO.

---
Images from the article:
Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.
...more
11min
August 07, 2025“No, Rationalism Is Not a Cult” by Liam Robins
(I realize I'm preaching to the choir by posting this here. But I figure it's good to post it regardless.)
Introduction
Recently, Scott Alexander gave a list of tight-knit communities with strong values:

The Amish: They live apart in tight-knit communities with strong countercultural values, and carefully control their technological and ideological environment. 10/10.
Cults and communes: Any cult mature enough to have its own compound, or any communal living project, has succeeded almost as thoroughly as the Amish. We may not support their insane religious beliefs, or the various sex crimes they are no doubt committing, but they have succeeded at Fukuyama's suggestion of knitting themselves a new god within the liberal order. 9.5/10.
Ultra-Orthodox Jews and Mormons: Get lots of people of the same religion together in one place - a timeless classic. Some of the ultra-est of the ultra-Orthodox are still more fluent in Yiddish [...]

---
Outline:
(00:16) Introduction
(07:57) Warning Signs of a Cult
(08:01) 1. Absolute authoritarianism without accountability
(08:54) 2. Zero tolerance for criticism or questions
(09:50) 3. Lack of meaningful financial disclosure regarding budget
(11:53) 4. Unreasonable fears about the outside world that often involve evil conspiracies and persecutions
(12:39) 5. A belief that former followers are always wrong for leaving and there is never a legitimate reason for anyone else to leave
(13:46) 6. Abuse of members
(14:40) 7. Records, books, articles, or programs documenting the abuses of the leader or group
(15:22) 8. Followers feeling they are never able to be good enough
(16:29) 9. A belief that the leader is right at all times
(17:40) 10. A belief that the leader is the exclusive means of knowing truth or giving validation
(17:49) Conclusion
---
First published:
August 7th, 2025

Source:
https://www.lesswrong.com/posts/MXrxpyrAh99bYymag/no-rationalism-is-not-a-cult

---

Narrated by TYPE III AUDIO.

---
Images from the article:
Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.
...more
21min
August 07, 2025 “Interview with Kelsey Piper on Self-Censorship and the Vibe Shift” by Zack_M_Davis
On 17 July 2025, I sat down with Kelsey Piper to chat about politics and social epistemology. You can listen to the audio file, or read the transcript below, which has been edited for clarity.
Post-Election Candor and the Costs of Silencing
ZMD: Hi, I'm Zack M. Davis, here today with journalist Kelsey Piper to talk about how political pressures shape our speech and therefore our world-models: what gets said, what gets left unsaid, and how that changes over time. In particular, we had an election not too long ago, which had various impacts on our information environment that I'd like to try to make sense of with you.
KP: Yeah, I think the thing you initially reached out to me about was a Tweet that I sent a little while after the election, which was kind of lighthearted.
ZMD: I actually have that here. [...]

---
Outline:
(00:25) Post-Election Candor and the Costs of Silencing
(05:47) The Hidden Neoliberal-Progressive Conflict (from 5:33)
(12:02) Navigating the Attention Economy of Lies (from 11:50)
(17:13) Wokeness in Retreat? (from 17:25)
---

First published:
August 7th, 2025

Source:
https://www.lesswrong.com/posts/FaGaNhXhFEtXkfyud/interview-with-kelsey-piper-on-self-censorship-and-the-vibe

---

Narrated by TYPE III AUDIO.
...more
26min

FAQs about LessWrong (30+ Karma):

How many episodes does LessWrong (30+ Karma) have?

The podcast currently has 2,194 episodes available.

More shows like LessWrong (30+ Karma)

Making Sense with Sam Harris by Sam Harris

Making Sense with Sam Harris

26,332 Listeners

Conversations with Tyler by Mercatus Center at George Mason University

Conversations with Tyler

2,395 Listeners

The Peter Attia Drive by Peter Attia, MD

The Peter Attia Drive

7,996 Listeners

Sean Carroll's Mindscape: Science, Society, Philosophy, Culture, Arts, and Ideas by Sean Carroll | Wondery

Sean Carroll's Mindscape: Science, Society, Philosophy, Culture, Arts, and Ideas

4,119 Listeners

ManifoldOne by Steve Hsu

ManifoldOne

90 Listeners

Your Undivided Attention by Tristan Harris and Aza Raskin, The Center for Humane Technology

Your Undivided Attention

1,498 Listeners

All-In with Chamath, Jason, Sacks & Friedberg by All-In Podcast, LLC

All-In with Chamath, Jason, Sacks & Friedberg

9,256 Listeners

Machine Learning Street Talk (MLST) by Machine Learning Street Talk (MLST)

Machine Learning Street Talk (MLST)

91 Listeners

Dwarkesh Podcast by Dwarkesh Patel

Dwarkesh Podcast

426 Listeners

Hard Fork by The New York Times

Hard Fork

5,455 Listeners

The Ezra Klein Show by New York Times Opinion

The Ezra Klein Show

15,433 Listeners

Moonshots with Peter Diamandis by PHD Ventures

Moonshots with Peter Diamandis

507 Listeners

No Priors: Artificial Intelligence | Technology | Startups by Conviction

No Priors: Artificial Intelligence | Technology | Startups

125 Listeners

Latent Space: The AI Engineer Podcast by swyx + Alessio

Latent Space: The AI Engineer Podcast

72 Listeners

BG2Pod with Brad Gerstner and Bill Gurley by BG2Pod

BG2Pod with Brad Gerstner and Bill Gurley

467 Listeners