
We[1] have a new paper testing the Incomplete Preferences Proposal (IPP). The abstract and main text are below; appendices are in the linked PDF.
Abstract
---
Outline:
(00:21) Abstract
(01:26) 1. Introduction
(01:30) 1.1. The shutdown problem
(03:16) 1.2. A proposed solution
(03:42) Preferences Only Between Same-Length Trajectories (POST)
(07:24) 1.3. The training regimen
(09:25) 1.4. Our contribution
(11:11) 2. Related work
(11:15) 2.1. The shutdown problem
(13:05) 2.2. Proposed solutions
(14:27) 2.3. Experimental work
(15:18) 3. Gridworlds
(17:35) 4. Evaluation metrics
(17:45) Preferences Only Between Same-Length Trajectories (POST)
(20:02) 5. Reward functions and agents
(20:07) 5.1. DREST reward function
(22:29) 5.2. Proof sketch
(23:44) 5.3. Algorithm and hyperparameters
(25:38) 5.4. Default agents
(26:31) 6. Results
(26:35) 6.1. Main results
(28:50) 6.2. Lopsided rewards
(31:27) 7. Discussion
(31:31) 7.1. Only DREST agents are NEUTRAL
(33:32) 7.2. The ‘shutdownability tax’ is small
(34:59) 7.3. DREST agents are still NEUTRAL when rewards are lopsided
(37:10) 8. Limitations and future work
(37:38) 8.1. Neural networks
(38:16) 8.2. Neutrality
(38:57) 8.3. Usefulness
(39:45) 8.4. Misalignment
(41:19) 9. Conclusion
(42:32) 10. References
The original text contained 5 footnotes, which were omitted from this narration.
---
Narrated by TYPE III AUDIO.