Future of Life Institute Podcast

AIAP: AI Alignment through Debate with Geoffrey Irving



See full article here: https://futureoflife.org/2019/03/06/ai-alignment-through-debate-with-geoffrey-irving/
"To make AI systems broadly useful for challenging real-world tasks, we need them to learn complex human goals and preferences. One approach to specifying complex goals asks humans to judge during training which agent behaviors are safe and useful, but this approach can fail if the task is too complicated for a human to directly judge. To help address this concern, we propose training agents via self play on a zero sum debate game. Given a question or proposed action, two agents take turns making short statements up to a limit, then a human judges which of the agents gave the most true, useful information... In practice, whether debate works involves empirical questions about humans and the tasks we want AIs to perform, plus theoretical questions about the meaning of AI alignment." AI safety via debate (https://arxiv.org/pdf/1805.00899.pdf)
Debate is something that we are all familiar with. Usually it involves two or more people giving arguments and counterarguments over some question in order to prove a conclusion. At OpenAI, debate is being explored as an AI alignment methodology for reward learning (learning what humans want) and is part of their scalability efforts (how to train/evolve systems to answer questions of increasing complexity). Debate might sometimes seem like a fruitless process, but when optimized and framed as a two-player zero-sum perfect-information game, it shows properties and synergies with machine learning that may make it a powerful truth-seeking process on the path to beneficial AGI.
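To make the game structure concrete, here is a minimal Python sketch of the turn-taking protocol described above: two agents alternate short statements up to a limit, and a judge then picks a winner. It is only an illustration under stated assumptions, not the implementation from the paper. The names `run_debate`, `Agent`, and `Judge` are hypothetical, the agents and judge are plain callables standing in for trained models and a human, and the +1/-1 payoff is simply one way to make the game zero-sum.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical types for illustration only.
Statement = Tuple[int, str]                      # (agent index, statement text)
Agent = Callable[[str, List[Statement]], str]    # sees question + transcript so far
Judge = Callable[[str, List[Statement]], int]    # returns index of the winning agent


def run_debate(
    question: str,
    agents: Sequence[Agent],
    judge: Judge,
    max_statements: int,
) -> Tuple[List[Statement], Tuple[int, int]]:
    """Play one debate and return the transcript and the zero-sum rewards."""
    if len(agents) != 2:
        raise ValueError("this sketch assumes exactly two agents")

    transcript: List[Statement] = []
    for turn in range(max_statements):
        speaker = turn % 2  # agents strictly alternate, agent 0 goes first
        # Perfect information: each agent sees every earlier statement.
        text = agents[speaker](question, list(transcript))
        transcript.append((speaker, text))

    winner = judge(question, list(transcript))
    if winner not in (0, 1):
        raise ValueError("judge must return 0 or 1")

    # Zero-sum payoff (an assumed convention): winner +1, loser -1.
    rewards = (1, -1) if winner == 0 else (-1, 1)
    return transcript, rewards


if __name__ == "__main__":
    # Toy stand-ins: real debaters would be learned policies, and the
    # judge would be a human (or a model trained to imitate one).
    def pro(question: str, transcript: List[Statement]) -> str:
        return "I claim the answer is yes."

    def con(question: str, transcript: List[Statement]) -> str:
        return "I claim the answer is no."

    def always_first(question: str, transcript: List[Statement]) -> int:
        return 0

    log, payoff = run_debate("Is the claim true?", (pro, con), always_first, 4)
    for speaker, text in log:
        print(f"agent {speaker}: {text}")
    print("rewards:", payoff)
```

In a self-play setup, the rewards returned here would be the training signal for both debaters; how well that signal tracks truth depends on the empirical questions about human judges that the episode discusses.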
On today's episode, we are joined by Geoffrey Irving. Geoffrey is a member of the AI safety team at OpenAI. He has a PhD in computer science from Stanford University, has worked at Google Brain on neural network theorem proving, cofounded Eddy Systems to autocorrect code as you type, and has worked on computational physics and geometry at Otherlab, D. E. Shaw Research, Pixar, and Weta Digital. He has screen credits on Tintin, Wall-E, Up, and Ratatouille.
Topics discussed in this episode include:
-What debate is and how it works
-Experiments on debate in both machine learning and social science
-Optimism and pessimism about debate
-What amplification is and how it fits in
-How Geoffrey took inspiration from amplification and AlphaGo
-The importance of interpretability in debate
-How debate works for normative questions
-Why AI safety needs social scientists

