---
client: lesswrong
project_id: curated
feed_id: ai, ai_safety, ai_safety__technical
narrator: pw
qa: km
narrator_time: 3h30m
qa_time: 0h50m
---
In this article, I will present a mechanistic explanation of the Waluigi Effect and other bizarre "semiotic" phenomena which arise within large language models such as GPT-3/3.5/4 and their variants (ChatGPT, Sydney, etc). This article will be folklorish to some readers, and profoundly novel to others.