There’s a new ad on the sidebar for Metaculus, a crowd-sourced prediction engine that tries to get well-calibrated forecasts on mostly scientific topics. If you’re wondering how likely it is that a genetically-engineered baby will be born in the next few years, that SpaceX will reach Mars by 2030, or that Sci-Hub will get shut down, take a look.
(There are also the usual questions about politics, plus some deliberately wacky ones, like whether Elon Musk will build more kilometers of tunnel than Trump does border wall.)
They’re doing good work – the government limits online for-cash prediction markets to a handful of bets, and those markets usually focus on big-ticket items like who’s going to win elections. Metaculus is run by a team affiliated with the Foundational Questions Institute, and as their name implies they’re really interested in using the power of prediction aggregators to address some of the important questions for the future of humanity – like AI, genetic engineering, and pandemics.
Which makes me wonder: what’s everyone else’s excuse?
Back when it looked like prediction markets were going to change everything (was it really as recent as two months ago?), various explanations were floated for why they hadn’t caught on yet. The government was regulating betting too much. The public was creeped out by the idea of profiting off of terrorist attacks. Or maybe people were just too dumb to understand the arguments in favor.
Now there are these crowd-sourced aggregator things. They’re not regulated by the government. Nobody’s profiting off of anything. And you don’t have to have faith that they’ll work – Philip Tetlock’s superforecasting experiments proved it, and Metaculus tracks its accuracy over time. I know the intelligence services are working with the Good Judgment Project, but I’m still surprised it hasn’t gone further.
Robin Hanson is the acknowledged expert on this, thanks to his long career of trying to get institutions to adopt prediction markets and generally failing. He attributes the whole problem to signaling and prestige, which I guess makes as much sense as anything. Tyler Cowen says something similar here. But I’m still surprised there aren’t consultant superforecaster think tanks hanging up their shingles. Forget prestige – aren’t there investors who would pay for their wisdom? And why can’t I hire Philip Tetlock to tell me whether my relationship is going to last?
I asked Prof. Aguirre of Metaculus, and he said (slightly edited for flow):
I don’t think Tetlock has any “secret sauce”, though I think he did a good job. The Metaculus track record is pretty good and will continue to improve. There’s definitely real predictive power. Our main challenge is that all the personnel involved are very time-limited and we’re also operating on a shoestring, probably 1/50th of what Tetlock spent of IARPA’s money.
If you are an individual company or investor, you don’t really get that much from “crowdsourcing” because you don’t really have a crowd (unless you’re a big business and can force your employees to make the predictions); so I’d guess most companies probably just fall back on asking some group of people to get together and make projections etc. My personal view is that the power really comes when you get a *lot* of people, questions, and data; you can then start to leverage that to improve the calibration (by recalibrating based on past accuracy), identify the really good predictors, and build up a large enough corpus of results that the predicted probabilities become grounded in an actual ensemble — the ensemble of questions on the platform.
Relatedly, the ability to make use of probabilistic predictions is, sadly, confined to a rather small fraction of people. I think typical decision-makers want to know what *will* happen, and 60-40 or 75-25 or even 80-20 is not the certainty they want. In the face of this uncertainty, I think people mentally substitute a different question to which they feel that they have some sort of good instinct, or privileged understanding. I think there’s also an element where they just don’t really *believe* the numbers because they are sometimes “wrong.” This sometimes frustrates me, but is not wholly irrational, I would say, because in the absence of a real grounding of the probabilities, if you have some analyst come and tell you that there’s a 70-30 chance, what exactly do you do with that? How would you possibly trust that those are the right numbers if the question is something “squishy” without a solid mathematical model?
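To make the "grounding" point concrete, here is a minimal sketch of how a platform could check whether its 70% forecasts actually come true about 70% of the time. This is an illustration only, not Metaculus's actual method: the function names `brier_score` and `calibration_table`, the ten-bin layout, and the sample data are all assumptions.

```python
from collections import defaultdict


def brier_score(forecasts, outcomes):
    """Mean squared error between forecast probabilities and 0/1 outcomes.

    Lower is better; always answering 0.5 scores 0.25.
    """
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)


def calibration_table(forecasts, outcomes, n_bins=10):
    """Group forecasts into probability bins and compare them with reality.

    Returns a list of (mean forecast, observed frequency, count) tuples,
    one per non-empty bin, ordered from low to high probability.
    """
    bins = defaultdict(list)
    for p, o in zip(forecasts, outcomes):
        # Clamp so that a forecast of exactly 1.0 lands in the top bin.
        idx = min(int(p * n_bins), n_bins - 1)
        bins[idx].append((p, o))

    table = []
    for idx in sorted(bins):
        ps = [p for p, _ in bins[idx]]
        hits = [o for _, o in bins[idx]]
        table.append((sum(ps) / len(ps), sum(hits) / len(hits), len(ps)))
    return table


if __name__ == "__main__":
    # Hypothetical track record: ten questions forecast at 70%, seven resolved yes.
    forecasts = [0.7] * 10
    outcomes = [1] * 7 + [0] * 3
    print(brier_score(forecasts, outcomes))
    print(calibration_table(forecasts, outcomes))
```

If the observed frequency in a bin drifts away from the mean forecast (say, 70% forecasts resolving yes only 55% of the time), that gap is what a recalibration step would try to correct. A large ensemble of resolved questions is what makes the comparison meaningful in the first place.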
I wonder if there’s data about how accuracy changes with number of predictors and predictor quality. There are so many smart people around who are interested in probability and willing to answer a few questions that this seems like a really stupid bottleneck. I’m happy I can do my part pushing Metaculus, but someone seriously needs to find a way to make this scale.
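In the absence of real data, a toy simulation at least shows what the question looks like. The sketch below is not a model of Metaculus or of real forecasters, and everything in it is an assumption: each simulated predictor sees the true probability plus independent Gaussian noise in log-odds, the crowd forecast is the average of those log-odds, and `noise_sd` stands in for predictor quality.

```python
import math
import random


def logit(p):
    return math.log(p / (1 - p))


def sigmoid(x):
    return 1 / (1 + math.exp(-x))


def simulate(crowd_sizes, n_questions=2000, noise_sd=1.5, seed=0):
    """Return {crowd size: mean Brier score} for a toy crowd of noisy predictors."""
    rng = random.Random(seed)

    # Draw the questions once so every crowd size is scored on the same set.
    questions = []
    for _ in range(n_questions):
        p = rng.uniform(0.05, 0.95)             # true probability of "yes"
        outcome = 1 if rng.random() < p else 0  # how the question resolves
        questions.append((p, outcome))

    results = {}
    for n in crowd_sizes:
        total = 0.0
        for p, outcome in questions:
            # Each predictor reports the truth plus independent log-odds noise;
            # the crowd forecast is the mean of those reports.
            mean_logit = sum(logit(p) + rng.gauss(0, noise_sd) for _ in range(n)) / n
            total += (sigmoid(mean_logit) - outcome) ** 2
        results[n] = total / n_questions
    return results


if __name__ == "__main__":
    for n, score in simulate([1, 5, 25, 100]).items():
        print(n, round(score, 3))
```

Under these assumptions the Brier score falls as the crowd grows and then flattens out, since averaging can only remove the independent part of the error. Real predictors share biases and information, so the actual curve could look quite different, which is exactly why real data would be worth having.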