
Sign up to save your podcasts
Or
I’ve spent a lot of the last few years working on issues related to acausal cooperation. With LLMs being clearly dominant over recent years, I’ve now led a team to make a benchmark to figure out how good LLMs are at decision theory and whether and when they lean more CDT or EDT. We hope to expand this dataset in the future, including by incorporating questions that try to measure the updatelessness dimension. Hopefully, this dataset will be useful for future interventions aimed at improving acausal interactions.
Abstract:
We introduce a dataset of natural-language questions in the decision theory of so-called Newcomb-like problems. Newcomb-like problems include, for instance, decision problems in which an agent interacts with a similar other agent, and thus has to reason about the fact that the other agent will likely reason in similar ways. Evaluating LLM reasoning about Newcomb-like problems is [...]
The original text contained 8 images which were described by AI.
---
First published:
Source:
Narrated by TYPE III AUDIO.
---
Images from the article:
Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.
I’ve spent a lot of the last few years working on issues related to acausal cooperation. With LLMs being clearly dominant over recent years, I’ve now led a team to make a benchmark to figure out how good LLMs are at decision theory and whether and when they lean more CDT or EDT. We hope to expand this dataset in the future, including by incorporating questions that try to measure the updatelessness dimension. Hopefully, this dataset will be useful for future interventions aimed at improving acausal interactions.
Abstract:
We introduce a dataset of natural-language questions in the decision theory of so-called Newcomb-like problems. Newcomb-like problems include, for instance, decision problems in which an agent interacts with a similar other agent, and thus has to reason about the fact that the other agent will likely reason in similar ways. Evaluating LLM reasoning about Newcomb-like problems is [...]
The original text contained 8 images which were described by AI.
---
First published:
Source:
Narrated by TYPE III AUDIO.
---
Images from the article:
Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.
26,342 Listeners
2,393 Listeners
7,949 Listeners
4,130 Listeners
87 Listeners
1,446 Listeners
8,910 Listeners
88 Listeners
372 Listeners
5,421 Listeners
15,334 Listeners
468 Listeners
122 Listeners
76 Listeners
449 Listeners