Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: How disagreements about Evidential Correlations could be settled, published by Martín Soto on March 11, 2024 on The AI Alignment Forum.
Since beliefs about Evidential Correlations don't track any direct ground truth, it's not obvious how to resolve disagreements about them, which is very relevant to acausal trade.
Here I present what seems like the only natural method (Third solution below).
Ideas partly generated with Johannes Treutlein.
Say two agents (algorithms A and B), who follow EDT, form a coalition. They are jointly deciding whether to pursue action a. Also, they would like an algorithm C to take action c. As part of their assessment of a, they're trying to estimate how much evidence (their coalition taking) a would provide for C taking c. If it gave a lot of evidence, they'd have more reason to take a. But they disagree: A thinks the correlation is very strong, and B thinks it's very weak.
This is exactly the situation in which researchers in acausal trade have many times found themselves: they are considering whether to take a slightly undesirable action a (spending a few resources on paperclips), which could provide evidence for another agent C (a paperclip-maximizing AI in another lightcone) taking an action c (the AI spending a few resources on human happiness) that we'd like to happen.
But different researchers A and B (within the coalition of "humans trying to maximize human happiness") have different intuitions about how strong this evidential correlation is.
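To make the disagreement concrete, here is a minimal sketch of the EDT calculation involved, with made-up utilities and credences (none of these numbers come from the post): the only quantity A and B dispute is the conditional credence P(C plays c | coalition plays a).

# Minimal sketch of the EDT evaluation of action a (all numbers illustrative).

def evidential_value_of_a(p_c_given_a, p_c_given_not_a,
                          cost_of_a=1.0, value_of_c=10.0):
    # EDT: condition on your own action and compare expected utilities.
    eu_take_a = -cost_of_a + p_c_given_a * value_of_c
    eu_skip_a = 0.0 + p_c_given_not_a * value_of_c
    return eu_take_a - eu_skip_a

# A thinks the correlation is strong, B thinks it is weak:
print(evidential_value_of_a(0.9, 0.3))    # A's credences: taking a looks clearly worth it (+5.0)
print(evidential_value_of_a(0.35, 0.3))   # B's credences: a isn't worth the cost (-0.5)

The sign of that difference is exactly what the coalition disagrees about, which is why the disagreement matters for the decision.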
A priori, there is a danger that, by thinking more, they would unexpectedly learn the actual output of C. This would make the trade no longer possible: once C's output is known, P(c | a) = P(c | not-a), so taking a would give them no additional evidence about whether c happens. But, for simplicity, assume that C is so much more complex and chaotic than anything A and B can compute that they are very confident this won't happen.
First solution: They could dodge the question by just looking for other actions to take, ones they don't disagree about. But that's boring.
Second solution: They could aggregate their numeric credences somehow. They could get fancy about how to do this. They could even go into more detail, and aggregate the parts of their deliberation that are more informative than a mere number (and that are upstream of this probability), like the different heuristics or reference-class estimates they've used to come up with it (see the sketch below).
They might face some credit assignment problems (which of my heuristics were most important in setting this probability?). This is not boring, but it's not yet what I want to discuss.
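For concreteness, here is one shape such an aggregation could take. This is only a sketch: the post doesn't commit to any pooling rule, and the linear pool and trust weights below are my own illustrative choices.

# One possible aggregation rule (illustrative only): a weighted linear pool.

def linear_pool(estimates, weights=None):
    # Weighted average of probability estimates.
    if weights is None:
        weights = [1.0] * len(estimates)
    return sum(w * p for w, p in zip(weights, estimates)) / sum(weights)

# Pooling the headline credences for P(c | a):
pooled = linear_pool([0.9, 0.35])                  # A's and B's numbers -> 0.625

# Or going upstream: pooling the heuristic-level estimates behind them,
# each tagged with how much the coalition trusts that heuristic.
heuristic_estimates = [0.8, 0.95, 0.4]             # e.g. reference classes, analogies
trust_weights       = [2.0, 1.0, 1.5]              # this is where credit assignment bites
pooled_detailed = linear_pool(heuristic_estimates, trust_weights)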
Let's think about what these correlations actually are and where they come from. These are actually probabilistic beliefs about logical worlds. For example, A might think that in the world where they play a (that is, conditioning A's distribution on this fact), the likelihood of C playing c is 0.9, while if they don't, it's 0.3. Unfortunately, only one of the two logical worlds will be actual. And so, one of these two beliefs will never be checked against any ground truth.
If they end up taking a, there won't be any mathematical fact of the matter as to what would have happened if they had not.
But nonetheless, it's not as if "real math always gets feedback, and counterfactuals never do": after all, the still-uncertain agent doesn't know which counterfactual will be real, and so they use the same general heuristics to think about all of them. When reality hits back on the single counterfactual that becomes actual, it is these heuristics that will be chiseled.
I think that's the correct picture of bounded logical learning: a pile of heuristics learning through time. This is what Logical Inductors formalize.[1]
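To cash out that picture in the simplest possible toy: the sketch below is a plain multiplicative-weights update over a fixed pile of heuristics, my own crude stand-in for the kind of learner Logical Inductors formalize, not the actual construction.

# Toy "pile of heuristics" learning through time (a standard multiplicative-weights
# update; not the real Logical Inductor machinery).

def pooled_prediction(weights, predictions):
    return sum(w * p for w, p in zip(weights, predictions))

def update_heuristics(weights, predictions, outcome, learning_rate=0.5):
    # Reward each heuristic by the probability it assigned to the one
    # counterfactual that became actual (outcome is 0 or 1), then renormalize.
    scores = [p if outcome == 1 else 1 - p for p in predictions]
    new_weights = [w * (1 - learning_rate * (1 - s)) for w, s in zip(weights, scores)]
    total = sum(new_weights)
    return [w / total for w in new_weights]

# Three heuristics guess P(c | a); reality only ever grades the branch taken.
weights = [1/3, 1/3, 1/3]
predictions = [0.9, 0.5, 0.2]
print(pooled_prediction(weights, predictions))              # ~0.53 before feedback
weights = update_heuristics(weights, predictions, outcome=1)
print(pooled_prediction(weights, predictions))              # ~0.59, shifted toward the better heuristic

The same reweighted pile then gets reused on the next logical question, which is the sense in which feedback on the one actual counterfactual still chisels the machinery used on all of them.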
It thus becomes clear that correlations are the "running-time by-product" of using these heuristics to approximate real math. Who cares only one of the cou...