
I've been accepted as a mentor for the next AI Safety Camp. You can apply to work with me on the tiling problem. The goal will be to develop reflectively consistent UDT-inspired decision theories, and try to prove tiling theorems for them.
The deadline for applicants is November 17.
The program will run from January 11 to April 27. It asks for a commitment of 10 hours per week.
My project description follows:
Summary
The Tiling Agents problem (aka reflective consistency) consists of analysing when one agent (the "predecessor") will choose to deliberately modify another agent (the "successor"). Usually, the predecessor and successor are imagined as the same agent across time, so we are studying self-modification. A set of properties "tiles" if those properties, when present in both predecessor and successor, guarantee that any self-modifications will avoid changing those properties.
You can think of this as the question of when agents will [...]
---
Outline:
(00:33) Summary
(02:10) The non-summary
(02:14) Motivation
(03:20) Tiling Overview
(05:13) Reflective Oracles
(06:41) Logical Uncertainty
(08:17) Value Uncertainty
(10:14) Value Plurality
(11:38) Ontology Plurality
(12:22) Cooperation and Coordination
---
First published:
Source:
Narrated by TYPE III AUDIO.