Existential Risk Persuasion Tournament
by PeterMcCluskey, July 17, 2023, LessWrong
I participated last summer in Tetlock's Existential Risk Persuasion Tournament (755(!)-page paper here).
Superforecasters and "subject matter experts" engaged in a hybrid of a prediction market and structured debates, forecasting catastrophic and existential risks this century.
I signed up as a superforecaster. My impression was that I knew as much about AI risk as any of the subject matter experts with whom I interacted (the tournament was divided up so that I was only aware of a small fraction of the 169 participants).
I didn't notice anyone with substantial expertise in machine learning. Experts were apparently chosen for having some sort of respectable publication related to AI, nuclear, climate, or biological catastrophic risks. Those experts were more competent in one of those fields than news media pundits or politicians; that is, they're likely to be more accurate than random guesses, but maybe not by a large margin.
That expertise leaves much to be desired. I'm unsure whether there was a realistic way for the sponsors to attract better experts. There doesn't seem to be enough money or prestige available to attract the very best.
Incentives
The success of the superforecasting approach depends heavily on forecasters having decent incentives.
It's tricky to give people incentives to forecast events that will be evaluated in 2100, or evaluated after humans go extinct.
The tournament provided a fairly standard scoring rule for questions that resolve by 2030. That's a fairly safe way to get parts of the tournament to work well.
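For readers unfamiliar with such scoring rules, a Brier-style rule is the standard example of a proper scoring rule: it rewards reporting your honest probability. Here is a minimal sketch in Python; the function name and the binary-outcome framing are my illustration, since the post doesn't spell out the tournament's exact rule:

```python
def brier_score(forecast: float, outcome: int) -> float:
    """Squared error between a probability forecast and a 0/1 outcome.

    Lower is better. Because the expected score is minimized by
    reporting your true probability, the rule is "proper": there's
    no incentive to exaggerate or hedge.
    """
    return (forecast - outcome) ** 2

# Example: a 0.7 forecast on a question that resolves True.
print(brier_score(0.7, 1))  # 0.09
# The same forecast on a question that resolves False scores worse.
print(brier_score(0.7, 0))  # 0.49
```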
The other questions were scored by how well the forecast matched the median forecast of other participants (excluding participants with whom the forecaster had interacted). It's hard to tell whether that incentive helped or hurt the accuracy of the forecasts. It's easy to imagine that it discouraged forecasters from relying on evidence that is hard to articulate or hard to verify, and it provided an incentive for groupthink. But the overall incentives were weak enough that altruistic pursuit of accuracy might have prevailed. Or ideological dogmatism might have prevailed. It will take time before we have even weak evidence as to which was the case.
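To make that incentive concrete, here is a hypothetical sketch of median-matching scoring. The squared-distance formula and the names are my assumptions for illustration, not the tournament's published rule:

```python
import statistics

def peer_median_score(my_forecast: float, peer_forecasts: list[float]) -> float:
    """Score a forecast by its distance from the median of other
    participants' forecasts; a smaller distance scores better.

    Illustrative only -- the tournament's exact formula isn't given here.
    """
    target = statistics.median(peer_forecasts)
    return (my_forecast - target) ** 2

# A forecaster who expects the peer median to be 0.2 does best by
# reporting 0.2, even if their private estimate is higher -- which is
# exactly the groupthink incentive described above.
print(peer_median_score(0.35, [0.1, 0.2, 0.25, 0.2, 0.15]))  # 0.0225
```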
One incentive that occurred to me toward the end of the tournament was the possibility of acquiring a verified long-term forecasting track record. Suppose that in 2050 they redo the scores based on evidence available then, and I score in the top 10% of tournament participants. That would likely mean I'm one of maybe a dozen people in the world with a good track record for forecasting 28 years into the future. I can imagine that being valuable enough for someone to revive me from cryonic suspension when I'd otherwise be forgotten.
There were rewards of some sort for writing comments that influenced other participants. I didn't pay much attention to those.
Quality of the Questions
There were many questions loosely related to AGI timelines, none of them quite satisfying my desire for something closely related to extinction risk that could be scored before it's too late to avoid the risk.
One question was based on a Metaculus forecast for an advanced AI. It seems to represent clear progress toward the kind of AGI that could cause dramatic changes. But I expect important disagreements over how much progress it represents: what scale should we use to decide how close such an AI is to a dangerous AI? Does the Turing test use judges with expertise in finding the AI's weaknesses?
Another question was about when Nick Bostrom will decide that an AGI exists. Or if he doesn't say anything clear, then a panel of experts will guess what Bostrom would say. That's pretty close to a good question to forecast. Can we assume tha...