


This is a crosspost from our report website for Why LLMs Aren't Scientists Yet: Lessons from Four Autonomous Research Attempts. This report details the work behind our LLM-written paper "The Consistency Confound: Why Stronger Alignment Can Break Black-Box Jailbreak Detection" accepted at Agents4Science 2025, the first scientific conference requiring AI as primary author, where it passed both AI and human review.
TL;DR
Problem Definition and System Overview
We [...]
---
Outline:
(00:37) TL;DR
(01:44) Problem Definition and System Overview
(03:55) Our Agents4Science 2025 Submission
(06:25) Observed Failure Modes and Mitigation
(06:40) 1. Bias on Training Data
(07:25) 2. Implementation Drift
(08:04) 3. Memory and Context Issues
(08:55) 4. Overexcitement and Eureka Instinct
(09:50) 5. & 6. Lack of Domain Intelligence and Scientific Taste
(10:44) Design Takeaways for AI Scientist Systems
(10:57) 1. Start Abstract, Ground Later
(11:15) 2. Verify Everything
(11:35) 3. Plan for Failure and Recovery
(11:57) 4. Log Everything
(12:14) Limitations and Discussion
---
Narrated by TYPE III AUDIO.
By LessWrong
