AXRP - the AI X-risk Research Podcast

6 - Debate and Imitative Generalization with Beth Barnes



One proposal to train AIs that can be useful is to have ML models debate each other about the answer to a human-provided question, where the human judges which side has won. In this episode, I talk with Beth Barnes about her thoughts on the pros and cons of this strategy, what she learned from seeing how humans behaved in debate protocols, and how a technique called imitative generalization can augment debate. Those who are already quite familiar with the basic proposal might want to skip past the explanation of debate to 13:00, "what problems does it solve and does it not solve".

Link to Beth's posts on the Alignment Forum: alignmentforum.org/users/beth-barnes

Link to the transcript: axrp.net/episode/2021/04/08/episode-6-debate-beth-barnes.html


By Daniel Filan

Rating: 4.4 (9 ratings)

