[This is an interim report and a continuation of work from the research sprint done during MATS Winter 7 (Neel Nanda's Training Phase)]
Try out binary masking for a few residual SAEs in this Colab notebook: [GitHub Notebook] [Colab Notebook]
TL;DR:
We propose a novel approach to:
- Scaling SAE Circuits to Large Models: By placing sparse autoencoders only in the residual stream at intervals, we find circuits in models as large as Gemma 9B without requiring SAEs to be trained for every transformer layer.
- Finding Circuits: We develop a better circuit-finding algorithm. Our method optimizes a binary mask over SAE latents (see the illustrative sketch after the TL;DR), which proves significantly more effective than existing thresholding-based methods such as Attribution Patching or Integrated Gradients.
Our discovered circuits paint a clear picture of how Gemma performs a given task; one circuit achieves 95% faithfulness with fewer than 20 total latents. This minimality lets us quickly understand the [...]
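To make the masking idea concrete, here is a minimal sketch of what optimizing a continuous relaxation of a binary mask over SAE latents could look like. This is illustrative only, not the exact implementation used in this work: the `LatentMask` module, the mean-ablation choice for masked-out latents, and the `run_model_with_masked_saes` helper are assumptions made for the example.

```python
import torch

class LatentMask(torch.nn.Module):
    """Continuous relaxation of a binary mask over SAE latents.

    During optimization the mask is sigmoid(logits) in (0, 1); at evaluation it
    is thresholded to {0, 1}. Masked-out latents are replaced by their mean
    activation over a reference distribution (mean ablation) rather than
    zeroed -- one common choice, assumed here for illustration.
    """

    def __init__(self, n_latents: int, mean_acts: torch.Tensor):
        super().__init__()
        self.logits = torch.nn.Parameter(torch.zeros(n_latents))
        self.register_buffer("mean_acts", mean_acts)  # shape: (n_latents,)

    def forward(self, latents: torch.Tensor, hard: bool = False) -> torch.Tensor:
        # latents: SAE encoding of the residual stream, shape (batch, seq, n_latents)
        mask = torch.sigmoid(self.logits)
        if hard:
            mask = (mask > 0.5).float()
        # Keep masked-in latents, mean-ablate the rest.
        return mask * latents + (1 - mask) * self.mean_acts


def train_mask(mask_module, batches, run_model_with_masked_saes,
               sparsity_coeff=1e-2, steps=1000):
    """Optimize the mask: recover task performance while keeping few latents.

    `run_model_with_masked_saes` is a hypothetical helper that splices the
    masked SAE reconstructions back into the residual stream and returns the
    task loss on a batch of prompts.
    """
    opt = torch.optim.Adam(mask_module.parameters(), lr=1e-2)
    for step in range(steps):
        batch = batches[step % len(batches)]
        task_loss = run_model_with_masked_saes(batch, mask_module)
        # L1 penalty on the soft mask pushes the circuit toward few latents.
        sparsity = torch.sigmoid(mask_module.logits).sum()
        loss = task_loss + sparsity_coeff * sparsity
        opt.zero_grad()
        loss.backward()
        opt.step()
```

After training, thresholding the soft mask (`hard=True`) yields a discrete set of latents, which is the candidate circuit evaluated for faithfulness.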
---
Outline:
TL;DR
1 Introduction
2 Background
2.1 SAEs
2.2 Circuits
2.3 Problems with Current Sparse Feature Interpretability Approaches
2.3.1 Scalability
2.3.2 Independent Scoring of Nodes
2.3.3 Error Nodes
3 Our Approach
3.1 Solving Scalability: Circuits with Few Residual SAEs
3.2 Solving Independent Scoring: Masking
3.3 Error Nodes
4 Results
4.1 Setup
4.2 Performance Recovery
4.2.1 Code Output Prediction
4.2.2 Subject Verb Agreement (SVA)
4.2.3 IOI
4.3 Completeness
4.4 Mask Stability
5 Case Study: Code Output Prediction
6 Conclusions
7 Future Research and Ideas
---