Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Evidential Correlations are Subjective, and it might be a problem, published by Martín Soto on March 7, 2024 on The AI Alignment Forum.
I explain (in layman's terms) a realization that might make acausal trade hard or impossible in practice.
Summary: We know that if players hold different beliefs about their Evidential Correlations, they might miscoordinate. But surely they will eventually learn the correct Evidential Correlations, right? Not necessarily, because there is no objective notion of "correct" here (in the way that there is for math or physics). Thus, selection pressures might be much weaker, and different agents might systematically converge on different ways of assigning Evidential Correlations.
Epistemic status: Confident that this realization is true, but the quantitative question of exactly how weak the selection pressures are remains open.
What are Evidential Correlations, really?
Skippable if you know the answer to the question.
Alice and Bob are playing a Prisoner's Dilemma, and they know each other's algorithms: Alice.source and Bob.source.[1] Since their algorithms are of roughly equal complexity, neither of them can easily assess what the other will output. Alice might notice something like "hmm, Bob.source seems to default to Defection when it throws an exception, so this should update me slightly in the direction of Bob Defecting".
But she doesn't know exactly how often Bob.source throws an exception, or what it does when it doesn't.
Imagine, though, that Alice notices Alice.source and Bob.source are pretty similar in some relevant ways (maybe the overall logical structure seems very close, or the depth of the for loops is the same, or she learns that the training algorithm that shaped them is the same one). She's still uncertain about what either of these two algorithms outputs[2], but this updates her in the direction of "both algorithms outputting the same action".
If Alice implements/endorses Evidential Decision Theory, she will reason as follows:
Conditional on Alice.source outputting Defect, it seems very likely Bob.source also outputs Defect, thus my payoff will be low.
But conditional on Alice.source outputting Cooperate, it seems very likely Bob.source also outputs Cooperate, thus my payoff will be high.
So I (Alice) should output Cooperate, and thus (very probably) obtain a high payoff.
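To make this concrete, here is a minimal sketch (not from the original post) of Alice's EDT calculation. The Prisoner's Dilemma payoffs and the 0.9 credence that Bob's output matches hers are illustrative assumptions, chosen only to show how the conditional expected payoffs come out:

```python
# Illustrative sketch of EDT reasoning in a Prisoner's Dilemma.
# The payoff numbers and the correlation strength are assumptions for exposition.

# Alice's payoffs, indexed by (Alice's action, Bob's action).
payoffs = {
    ("C", "C"): 3,  # mutual cooperation
    ("C", "D"): 0,  # Alice cooperates, Bob defects (Alice exploited)
    ("D", "C"): 5,  # Alice defects, Bob cooperates (Alice exploits)
    ("D", "D"): 1,  # mutual defection
}

# Alice's credence that Bob.source outputs the same action as Alice.source,
# conditional on her own output. This encodes the believed evidential correlation.
p_match = 0.9

def edt_value(my_action: str) -> float:
    """Alice's expected payoff conditional on Alice.source outputting my_action."""
    other = "D" if my_action == "C" else "C"
    return (p_match * payoffs[(my_action, my_action)]
            + (1 - p_match) * payoffs[(my_action, other)])

print(edt_value("C"))  # 0.9*3 + 0.1*0 = 2.7
print(edt_value("D"))  # 0.9*1 + 0.1*5 = 1.4
# Conditional on a strong believed correlation, Cooperate looks better, so
# EDT-Alice cooperates; with a weak correlation (p_match near 0.5) Defect wins.
```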
To the extent Alice's belief about similarity was justified, it seems like she will perform pretty well in these situations (obtaining high payoffs). When you take this reasoning to the extreme, maybe Alice and Bob are both aware that each of them knows this kind of cooperation bootstrapping is possible (if they both believe they are similar enough), and thus (even if they are causally disconnected, and just simulating each other's code) they can coordinate on some pretty complex trades.
This is Evidential Cooperation in Large Worlds.
But wait a second: How could this happen, without them being causally connected? What was this mysterious similarity, this spooky correlation at a distance, that allowed them to create cooperation from thin air?
Well, in the words of Daniel Kokotajlo: it's just your credences, bro!
The bit required for this to work is that they believe that "it is very likely we both output the same thing". Said another way, they place high probability on the possible worlds "Alice.source = C, Bob.source = C" and "Alice.source = D, Bob.source = D", but low probability on the possible worlds "Alice.source = D, Bob.source = C" and "Alice.source = C, Bob.source = D".
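As a worked illustration (the numbers are mine, not the post's), here is how such a joint credence over the four possible worlds yields the conditional beliefs Alice needs. The 0.45/0.05 split is an assumption chosen to encode a strong believed correlation:

```python
# Illustrative joint credence over the four "possible worlds".
# The 0.45/0.05 split is an assumption, not a claim from the post.
joint = {
    ("C", "C"): 0.45,  # both cooperate
    ("D", "D"): 0.45,  # both defect
    ("C", "D"): 0.05,  # Alice cooperates, Bob defects
    ("D", "C"): 0.05,  # Alice defects, Bob cooperates
}

def p_bob_given_alice(bob_action: str, alice_action: str) -> float:
    """P(Bob.source = bob_action | Alice.source = alice_action) under `joint`."""
    p_alice = sum(p for (a, _), p in joint.items() if a == alice_action)
    return joint[(alice_action, bob_action)] / p_alice

print(p_bob_given_alice("C", "C"))  # 0.9: conditioning on Alice cooperating
print(p_bob_given_alice("C", "D"))  # 0.1: conditioning on Alice defecting
```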
This can also be phrased in terms of logical counterfactuals: if Alice.source = C, then it is very likely that Bob.source = C.[3] This is a logical counterfactual: there is, ultimately, a logical fact of the matter about what Alice.source outputs, but since she doesn't know it yet, she entertains what s...