LessWrong (30+ Karma)

By LessWrong

Audio narrations of LessWrong posts.... more

Download on the App Store

Download on the App Store

Get it on Google Play

FAQs about LessWrong (30+ Karma):

How many episodes does LessWrong (30+ Karma) have?

The podcast currently has 3,089 episodes available.

LessWrong (30+ Karma) episodes:

February 24, 2024 “How well do truth probes generalise?” by mishajw
Representation engineering (RepEng) has emerged as a promising research avenue for model interpretability and control. Recent papers have proposed methods for discovering truth in models with unlabeled data, guiding generation by modifying representations, and building LLM lie detectors. RepEng asks the question: If we treat representations as the central unit, how much power do we have over a model's behaviour?
Most techniques use linear probes to monitor and control representations. An important question is whether the probes generalise. If we train a probe on the truths and lies about the locations of cities, will it generalise to truths and lies about Amazon review sentiment? This report focuses on truth due to its relevance to safety, and to help narrow the work.
Generalisation is important. Humans typically have one generalised notion of “truth”, and it would be enormously convenient if language models also had just one[1]. This would result in [...]

---
Outline:
(01:44) Methods
(02:02) What makes a probe?
(03:44) Probe algorithms
(04:51) Datasets
(05:51) Measuring generalisation
(06:17) Recovered accuracy
(07:25) Finding the best generalising probe
(08:06) Results
(09:24) Examining the best probe
(10:22) Examining algorithm performance
(11:03) Examining dataset performance
(13:27) How do we know we’re detecting truth, and not just likely statements?
(14:48) Conclusion and future work
(16:00) Appendix
(16:03) Validating implementations
(16:47) Validating LDA implementation
(17:25) Thresholding
The original text contained 3 footnotes which were omitted from this narration.
---

First published:
February 24th, 2024

Source:
https://www.lesswrong.com/posts/cmicXAAEuPGqcs9jw/how-well-do-truth-probes-generalise
---

Narrated by TYPE III AUDIO.
...more
19min
February 24, 2024 “The Sense Of Physical Necessity: A Naturalism Demo (Introduction)” by LoganStrohl
Note on genre: This sequence is a demonstration of a complete naturalist study, as described in Intro to Naturalism and The Nuts and Bolts Of Naturalism. I think of naturalism demos as reference material. I’ve tried to make it readable, but like a dictionary or a user manual, I only expect it to be of interest to people who already have a reason to consult it.
Epistemic status: The explicit concepts I’m building around what I’ve learned are still under construction. I think the framing emphasized in this demo is askew, or incomplete, or some-other-how flawed. Perhaps I will come back in a future year to describe how my concepts have evolved. However, I stand pretty firmly behind the broad strokes of the process-level stuff.

Goals
“Hug the Query” is an essay by Eliezer Yudkowsky advocating a certain discipline of rationality that he calls closeness to the issue [...]

---
Outline:
(00:51) Goals
(01:42) Motivation
(07:06) Structure
(08:52) Is this for you?
---

First published:
February 24th, 2024

Source:
https://www.lesswrong.com/posts/8iab2addq4MzYHKij/the-sense-of-physical-necessity-a-naturalism-demo
---

Narrated by TYPE III AUDIO.
...more
11min
February 24, 2024 “A starting point for making sense of task structure (in machine learning)” by Kaarel, RP, jake_mendel
ML models can perform a range of tasks and subtasks, some of which are more closely related to one another than are others. In this post, we set out two very initial starting points. First, we motivate reverse engineering models’ task decompositions. We think this can be helpful for interpretability and for understanding generalization. Second, we provide a (potentially non-exhaustive, initial) list of techniques that could be used to quantify the ‘distance’ between two tasks or inputs. We hope these distances might help us identify the task decomposition of a particular model. We close by briefly considering analogues in humans and by suggesting a toy model.
Epistemic status: We didn’t spend much time writing this post. Please let us know in the comments if you have other ideas for measuring task distance or if we are replicating work.
Introduction.

It might be useful to think about [...]

---
Outline:
(02:03) Why understanding task structure could be useful
(02:08) Interpretability
(03:05) Learning the abstractions
(03:47) Unlearning capabilities
(04:46) Quantifying generalization
(05:58) Learning how the world works
(06:22) Some Subtleties
(06:26) What is a task?
(07:59) Task decomposition in the dataset vs a particular system's task decomposition
(08:47) Absolute vs relative metrics vs clusterings
(09:48) Methods for gauging task structure in ML
(09:57) Inspecting activations
(13:17) Inspecting learning
(15:56) Inspecting weights
(17:52) Analogues in humans
(18:57) A toy model for testing task decomposition techniques
(21:50) Acknowledgements
The original text contained 17 footnotes which were omitted from this narration.
---

First published:
February 24th, 2024

Source:
https://www.lesswrong.com/posts/exp4JGPJu46g6sdRp/a-starting-point-for-making-sense-of-task-structure-in
---

Narrated by TYPE III AUDIO.
...more
23min
February 23, 2024 “The Shutdown Problem: Incomplete Preferences as a Solution” by EJT
This article contains more than 100 uses of logical or mathematical notation, so an audio narration would be too hard to follow. You'll find a link to the original text in the episode description.

---

First published:
February 23rd, 2024

Source:
https://www.lesswrong.com/posts/YbEbwYWkf8mv9jnmi/the-shutdown-problem-incomplete-preferences-as-a-solution
---

Narrated by TYPE III AUDIO.
...more
1min
February 23, 2024 “Deep and obvious points in the gap between your thoughts and your pictures of thought” by KatjaGrace
Some ideas feel either deep or extremely obvious. You’ve heard some trite truism your whole life, then one day an epiphany lands and you try to save it with words, and you realize the description is that truism. And then you go out and try to tell others what you saw, and you can’t reach past their bored nodding. Or even you yourself, looking back, wonder why you wrote such tired drivel with such excitement.
When this happens, I wonder if it's because the thing is true in your model of how to think, but not in how you actually think.
For instance, “when you think about the future, the thing you are dealing with is your own imaginary image of the future, not the future itself”.
On the one hand: of course. You think I’m five and don’t know broadly how thinking works? You think [...]

---

First published:
February 23rd, 2024

Source:
https://www.lesswrong.com/posts/HzLfrE57JqkjDW33c/deep-and-obvious-points-in-the-gap-between-your-thoughts-and-1
---

Narrated by TYPE III AUDIO.
...more
2min
February 23, 2024“Complexity of value but not disvalue implies more focus on s-risk. Moral uncertainty and preference utilitarianism also do.” by Chi Nguyen
As per title. I often talk to people that have views that I think should straightforwardly imply a larger focus on s-risk than they think. In particular, people often seem to endorse something like a rough symmetry between the goodness of good stuff and the badness of bad stuff, sometimes referring to this short post that offers some arguments in that direction. I'm confused by this and wanted to quickly jot down my thoughts - I won't try to make them rigorous and make various guesses for what additional assumptions people usually make. I might be wrong about those.

Views that IMO imply putting more weight on s-risk reduction:

Complexity of values: Some people think that the most valuable things possible are probably fairly complex (e.g. a mix of meaning, friendship, happiness, love, child-rearing, beauty etc.) instead of really simple (e.g. rats on heroin, what people usually [...]

The original text contained 1 footnote which was omitted from this narration.
---
First published:
February 23rd, 2024

Source:
https://www.lesswrong.com/posts/zPsJECBYms5omXxfe/complexity-of-value-but-not-disvalue-implies-more-focus-on-s
---

Narrated by TYPE III AUDIO.
...more
5min
February 22, 2024 [Linkpost] “Contra Ngo et al. ‘Every ‘Every Bay Area House Party’ Bay Area House Party’” by Ricki Heicklen
This is a linkpost for https://bayesshammai.substack.com/p/contra-ngo-et-al-every-every-bay
With thanks to Scott Alexander for the inspiration, Jeffrey Ladish, Philip Parker, Avital Morris, and Drake Thomas for masterful cohosting, and Richard Ngo for his investigative journalism.
Last summer, I threw an Every Bay Area House Party themed party. I don’t live in the Bay, but I was there for a construction-work-slash-webforum-moderation-and-UI-design-slash-grantmaking gig, so I took the opportunity to impose myself on the ever generous Jeffrey Ladish and host a party in his home. Fortunately, the inside of his house is already optimized to look like a parody of a Bay Area house party house, so not much extra decorating was needed, but when has that ever stopped me?
Attendees could look through the window for an outside view
Richard Ngo recently covered the event, with only very minor embellishments. I’ve heard rumors that some people are doubting whether the party described truly happened, so [...]

---

First published:
February 22nd, 2024

Source:
https://www.lesswrong.com/posts/mmYFF4dyi8Kg6pWGC/contra-ngo-et-al-every-every-bay-area-house-party-bay-area

Linkpost URL:
https://bayesshammai.substack.com/p/contra-ngo-et-al-every-every-bay
---

Narrated by TYPE III AUDIO.
...more
8min
February 22, 2024 “AI #52: Oops” by Zvi
We were treated to technical marvels this week.
At Google, they announced Gemini Pro 1.5, with a million token context window within which it has excellent recall, using mixture of experts to get Gemini Advanced level performance (e.g. GPT-4 level) out of Gemini Pro levels of compute. This is a big deal, and I think people are sleeping on it. Also they released new small open weights models that look to be state of the art.
At OpenAI, they announced Sora, a new text-to-video model that is a large leap from the previous state of the art. I continue to be a skeptic on the mundane utility of video models relative to other AI use cases, and think they still have a long way to go, but this was both technically impressive and super cool.
Also, in both places, mistakes were made.
At OpenAI, ChatGPT [...]

---
Outline:
(01:48) Language Models Offer Mundane Utility
(05:08) Language Models Don’t Offer Mundane Utility
(10:17) Call Me Gemma Now
(11:35) Google Offerings Keep Coming and Changing Names
(13:08) GPT-4 Goes Crazy
(20:00) GPT-4 Real This Time
(22:11) Fun with Image Generation
(22:27) Deepfaketown and Botpocalypse Soon
(28:44) Selling Your Chatbot Data
(30:11) Selling Your Training Data
(32:18) They Took Our Jobs
(32:41) Get Involved
(32:55) Introducing
(35:13) In Other AI News
(36:16) Quiet Speculations
(40:27) The Quest for Sane Regulations
(43:43) The Week in Audio
(43:54) The Original Butlerian Jihad
(45:22) Rhetorical Innovation
(46:05) Public Service Announcement
(49:16) People Are Worried About AI Killing Everyone
(50:14) Other People Are Not As Worried About AI Killing Everyone
(52:57) The Lighter Side
---

First published:
February 22nd, 2024

Source:
https://www.lesswrong.com/posts/WmxS7dbHuxzxFei64/ai-52-oops
---

Narrated by TYPE III AUDIO.
...more
54min
February 22, 2024 “Gemini Has a Problem” by Zvi
Google's Gemini 1.5 is impressive and I am excited by its huge context window. I continue to default to Gemini Advanced as my default AI for everyday use when the large context window is not relevant.
However, while it does not much interfere with what I want to use Gemini for, there is a big problem with Gemini Advanced that has come to everyone's attention.
Gemini comes with an image generator. Until today it would, upon request, create pictures of humans.
On Tuesday evening, some people noticed, or decided to more loudly mention, that the humans it created might not be rather different than humans you requested…
Joscha Bach: 17th Century was wild.
[prompt was] ‘please draw a portrait of a famous physicist of the 17th century.’
Kirby: i got similar results. when I went further and had it tell me who the [...]

---
Outline:
(03:06) The Internet Reacts
(07:10) How Did This Happen?
(13:52) Google's Response
(17:39) Five Good Reasons This Matters
(18:00) Reason 1: Prohibition Doesn’t Work and Enables Bad Actors
(19:06) Reason 2: A Frontier Model Was Released While Obviously Misaligned
(21:53) Reason 3: Potentially Inevitable Conflation of Different Risks From AI
(23:55) Reason 4: Bias and False Refusals Are Not Limited to Image Generation
(26:54) Reason 5: This is Effectively Kind of a Deceptive Sleeper Agent
---

First published:
February 22nd, 2024

Source:
https://www.lesswrong.com/posts/kLTyeG7R8eYpFwe3H/gemini-has-a-problem
---

Narrated by TYPE III AUDIO.
...more
33min
February 22, 2024 [Linkpost] “Research Post: Tasks That Language Models Don’t Learn” by Bruce W. Lee
This is a linkpost for https://arxiv.org/abs/2402.11349Abstract.

We argue that there are certain properties of language that our current large language models (LLMs) don't learn. We present an empirical investigation of visual-auditory properties of language through a series of tasks, termed H-Test. This benchmark highlights a fundamental gap between human linguistic comprehension, which naturally integrates sensory experiences, and the sensory-deprived processing capabilities of LLMs. In support of our hypothesis, 1. deliberate reasoning (Chain-of-Thought), 2. few-shot examples, or 3. stronger LLM from the same model family (LLaMA 2 13B -> LLaMA 2 70B) do not trivially bring improvements in H-Test performance.
Therefore, we make a particular connection to the philosophical case of Mary, who learns about the world in a sensory-deprived environment. Our experiments show that some of the strongest proprietary LLMs stay near random chance baseline accuracy of 50%, highlighting the limitations of knowledge acquired in the absence [...]

---
Outline:
(01:16) Key Findings on H-Test
(03:14) Acknowledgments and Links
---

First published:
February 22nd, 2024

Source:
https://www.lesswrong.com/posts/ia4HszGTidh74Nyxk/research-post-tasks-that-language-models-don-t-learn

Linkpost URL:
https://arxiv.org/abs/2402.11349
---

Narrated by TYPE III AUDIO.
...more
4min

FAQs about LessWrong (30+ Karma):

How many episodes does LessWrong (30+ Karma) have?

The podcast currently has 3,089 episodes available.

More shows like LessWrong (30+ Karma)

The Daily by The New York Times

The Daily

113,164 Listeners

Astral Codex Ten Podcast by Jeremiah

Astral Codex Ten Podcast

130 Listeners

Interesting Times with Ross Douthat by New York Times Opinion

Interesting Times with Ross Douthat

7,255 Listeners

Dwarkesh Podcast by Dwarkesh Patel

Dwarkesh Podcast

535 Listeners

The Ezra Klein Show by New York Times Opinion

The Ezra Klein Show

16,266 Listeners

AI Article Readings by Readings of great articles in AI voices

AI Article Readings

4 Listeners

Doom Debates by Liron Shapira

Doom Debates

14 Listeners

LessWrong posts by zvi by zvi

LessWrong posts by zvi

2 Listeners