AXRP - the AI X-risk Research Podcast

7 - Side Effects with Victoria Krakovna



One way of thinking about how AI might pose an existential threat is that it could take drastic actions to maximize the achievement of some objective function, such as taking control of the power supply or the world's computers. This suggests a mitigation strategy: minimize the degree to which AI systems have large effects on the world that are not strictly necessary for achieving their objective. In this episode, Victoria Krakovna talks about her research on quantifying and minimizing side effects. Topics discussed include how one goes about defining side effects and the difficulties in doing so, her work using relative reachability and the ability to achieve future tasks as side-effect measures, and what she thinks the open problems and difficulties are.
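The episode itself contains no code, but as a rough sketch of the relative reachability idea mentioned above: penalize the agent whenever its current state makes other states harder to reach than they would be under a baseline (for example, the state the agent would be in had it done nothing). The toy environment, function names, and discount value below are illustrative assumptions, not taken from the episode or the papers.

```python
# Toy sketch of a relative reachability penalty, in the spirit of the
# "stepwise relative reachability" paper linked below. The environment,
# function names, and discount value are illustrative assumptions.
from collections import deque

GAMMA = 0.95  # discount used for reachability (assumed value)

def bfs_distances(start, neighbors):
    """Shortest number of steps from `start` to each reachable state."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        s = queue.popleft()
        for n in neighbors(s):
            if n not in dist:
                dist[n] = dist[s] + 1
                queue.append(n)
    return dist

def reachability(start, neighbors, all_states):
    """R(start, s) = GAMMA ** (steps to reach s), or 0 if s is unreachable."""
    dist = bfs_distances(start, neighbors)
    return {s: GAMMA ** dist[s] if s in dist else 0.0 for s in all_states}

def relative_reachability_penalty(current, baseline, neighbors, all_states):
    """Average *reduction* in reachability relative to the baseline state.
    Only decreases count, so reversible deviations from the baseline are free."""
    r_cur = reachability(current, neighbors, all_states)
    r_base = reachability(baseline, neighbors, all_states)
    return sum(max(0.0, r_base[s] - r_cur[s]) for s in all_states) / len(all_states)

# Tiny illustration: a 3-cell corridor with a breakable vase on cell 1.
# Breaking the vase is irreversible, so it cuts off every "vase intact" state.
states = [(pos, vase) for pos in range(3) for vase in (True, False)]

def neighbors(state):
    pos, vase = state
    succ = [((pos - 1) % 3, vase), ((pos + 1) % 3, vase)]
    if vase and pos == 1:
        succ.append((pos, False))  # the agent can smash the vase here
    return succ

# Penalty for having broken the vase, vs. a "do nothing" baseline at the same cell:
print(relative_reachability_penalty((1, False), (1, True), neighbors, states))
# ~0.48: the intact-vase states are no longer reachable, so the agent is penalized.
```

In the paper the baseline is a stepwise inaction rollout and reachability is computed with a value function rather than breadth-first search, but the overall shape of the penalty (average loss of reachability relative to a baseline) is the same.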


Link to the transcript: axrp.net/episode/2021/05/14/episode-7-side-effects-victoria-krakovna.html


Link to the paper "Penalizing Side Effects Using Stepwise Relative Reachability": arxiv.org/abs/1806.01186

Link to the paper "Avoiding Side Effects by Considering Future Tasks": arxiv.org/abs/2010.07877


Victoria Krakovna's website: vkrakovna.wordpress.com

Victoria Krakovna's Alignment Forum profile: alignmentforum.org/users/vika


Work mentioned in the episode:

 - Rohin Shah on the difficulty of finding a value-agnostic impact measure: lesswrong.com/posts/kCY9dYGLoThC3aG7w/best-reasons-for-pessimism-about-impact-of-impact-measures#qAy66Wza8csAqWxiB

 - Stuart Armstrong's bucket of water example: lesswrong.com/posts/zrunBA8B5bmm2XZ59/reversible-changes-consider-a-bucket-of-water

 - Attainable Utility Preservation: arxiv.org/abs/1902.09725

 - Low Impact Artificial Intelligences: arxiv.org/abs/1705.10720

 - AI Safety Gridworlds: arxiv.org/abs/1711.09883

 - Test Cases for Impact Regularisation Methods: lesswrong.com/posts/wzPzPmAsG3BwrBrwy/test-cases-for-impact-regularisation-methods

 - SafeLife: partnershiponai.org/safelife

 - Avoiding Side Effects in Complex Environments: arxiv.org/abs/2006.06547


AXRP - the AI X-risk Research Podcast, by Daniel Filan
