Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Bayesians Commit the Gambler's Fallacy, published by Kevin Dorst on January 7, 2024 on LessWrong.
TLDR: Rational people who start out uncertain about an (in fact independent) causal process and then learn from unbiased data will rule out "streaky" hypotheses more quickly than "switchy" hypotheses. As a result, they'll commit the gambler's fallacy: expecting the process to switch more than it will. In fact, they'll do so in ways that match a variety of empirical findings about how real people commit the gambler's fallacy. Maybe it's not a fallacy, after all.
(This post is based on a full paper.)
Baylee is bored. The fluorescent lights hum. The spreadsheets blur. She needs air.
As she steps outside, she sees the Prius nestled happily in the front spot. Three days in a row now - the Prius is on a streak. The Jeep will probably get it tomorrow, she thinks.
This parking battle - between a Prius and a Jeep - has been going on for months. Unbeknownst to Baylee, the outcomes are statistically independent: each day, the Prius and the Jeep have a 50% chance to get the front spot, regardless of how the previous days have gone. But Baylee thinks and acts otherwise: after the Prius has won the spot a few days in a row, she tends to think the Jeep will win next. (And vice versa.)
So Baylee is committing the gambler's fallacy: the tendency to think that streaks of (in fact independent) outcomes are likely to switch. Maybe you conclude from this - as many psychologists have - that Baylee is bad at statistical reasoning.
You're wrong.
Baylee is a rational Bayesian. As I'll show: when either data or memory is limited, Bayesians who begin with causal uncertainty about an (in fact independent) process - and then learn from unbiased data - will, on average, commit the gambler's fallacy.
Why? Although they'll get evidence that the process is neither "switchy" nor "streaky", they'll get more evidence against the latter. Thus they converge asymmetrically to the truth (of independence), leading them to (on average) commit the gambler's fallacy along the way.
More is true. Bayesians don't just commit the gambler's fallacy - they do so in a way that qualitatively matches a wide variety of trends found in the empirical literature on the gambler's fallacy. This provides evidence for:
Causal-Uncertainty Hypothesis: The gambler's fallacy is due to causal uncertainty combined with rational responses to limited data and memory.
This hypothesis stacks up favorably against extant theories of the gambler's fallacy in terms of both explanatory power and empirical coverage. See the paper for the full argument - here I'll just sketch the idea.
Asymmetric Convergence
Consider any process that can have one of two repeatable outcomes - Prius vs. Jeep; heads vs. tails; hit vs. miss; 1 vs. 0; etc.
Baylee knows that the process (say, the parking battle) is "random" in the sense that (i) it's hard to predict, and (ii) in the long run, the Prius wins 50% of the time. But that leaves open three classes of hypotheses:
Steady: The outcomes are independent, so each day there's a 50% chance the Prius wins the spot. (Compare: a fair coin toss.)
Switchy: The outcomes tend to switch: after the Prius wins a few in a row, the Jeep becomes more likely to win; after the Jeep wins a few, vice versa. (Compare: drawing from a deck of cards without replacement - after a few red cards, a black card becomes more likely.)
Sticky: The outcomes tend to form streaks: after the Prius wins a few, it becomes more likely to win again; likewise for the Jeep. (Compare: basketball shots - after a player makes a few, they become "hot" and so are more likely to make the next one. No, the "hot hand" is not a myth.[1])
So long as each of these hypotheses is symmetric around 50%, they will all lead to (i) the process being hard to predict, and (ii) the Prius winning 50% of the time in the long run - so all three fit what Baylee knows about the process.
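To make the asymmetry concrete, here's a minimal sketch in Python. The parameter values are illustrative stand-ins of my own choosing, not the paper's exact models: each hypothesis specifies the probability that tomorrow's winner switches, as a function of the current streak length (Switchy's switch probability rises with the streak, Sticky's falls, Steady's stays at 50%). Enumerating every 10-day sequence, weighting each by its probability under the true Steady process, and averaging the Bayesian posteriors from a uniform prior shows Sticky losing credence faster than Switchy:

```python
from itertools import product

# Illustrative streak-dependent models (my parameter choices, not the
# paper's): each hypothesis gives the probability that tomorrow's winner
# SWITCHES, given the length of the current streak.
def p_switch(hypothesis, streak):
    delta = 0.1 * min(streak, 2)   # mild effect after 1 win, stronger after 2+
    if hypothesis == "switchy":
        return 0.5 + delta
    if hypothesis == "sticky":
        return 0.5 - delta
    return 0.5                     # steady: independent 50/50

def likelihood(hypothesis, seq):
    """P(sequence | hypothesis); the first day is 50/50 under all three."""
    prob, streak = 0.5, 1
    for prev, cur in zip(seq, seq[1:]):
        ps = p_switch(hypothesis, streak)
        prob *= ps if cur != prev else 1 - ps
        streak = 1 if cur != prev else streak + 1
    return prob

HYPS = ("steady", "switchy", "sticky")
N = 10  # days observed; small enough to enumerate all 2**N sequences exactly

avg_posterior = dict.fromkeys(HYPS, 0.0)
avg_pred_switch = 0.0  # posterior-predictive P(switch) after a 3-day streak
for seq in product((0, 1), repeat=N):
    weight = 0.5 ** N  # all sequences equally likely under the TRUE steady process
    liks = {h: likelihood(h, seq) for h in HYPS}
    total = sum(liks.values())
    post = {h: liks[h] / total for h in HYPS}  # posterior from a uniform prior
    for h in HYPS:
        avg_posterior[h] += weight * post[h]
    avg_pred_switch += weight * sum(post[h] * p_switch(h, 3) for h in HYPS)

for h in HYPS:
    print(f"{h:8s} average posterior = {avg_posterior[h]:.4f}")
print(f"predicted P(switch) after a 3-day streak = {avg_pred_switch:.4f}")
```

With these made-up parameters the gap is small, but it is systematically in one direction: the data-averaged posterior on Switchy exceeds that on Sticky, so the posterior-predictive probability of a switch after a streak sits above 50% - a gambler's-fallacy expectation from an unbiased Bayesian.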