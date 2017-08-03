There’s a new ad on the sidebar for Metaculus, a crowd-sourced prediction engine that tries to get well-calibrated forecasts on mostly scientific topics. If you’re wondering how likely it is that a genetically-engineered baby will be born in the next few years, that SpaceX will reach Mars by 2030, or that Sci-Hub will get shut down, take a look.
(There are also some on the lighter side, like whether Elon Musk will build more kilometers of tunnel than Trump builds of border wall.)
They’re doing good work – the government limits online for-cash prediction markets to a couple of bets, and those markets usually focus on the big-ticket items like who’s going to win elections. Metaculus is run by a team affiliated with the Foundational Questions Institute, and as their name implies they’re really interested in using the power of prediction aggregators to address some of the important questions for the future of humanity – like AI, genetic engineering, and pandemics.
Which makes me wonder: what’s everyone else’s excuse?
Back when it looked like prediction markets were going to change everything (was it really as recent as two months ago?), various explanations were floated for why they hadn’t caught on yet. The government was regulating betting too much. The public was creeped out by the idea of profiting off of terrorist attacks. Or maybe people were just too dumb to understand the arguments in favor.
Now there are these crowd-sourced aggregator things. They’re not regulated by the government. Nobody’s profiting off of anything. And you don’t have to have faith that they’ll work – Philip Tetlock’s superforecasting experiments proved it, and Metaculus tracks its accuracy over time. I know the intelligence services are working with the Good Judgment Project, but I’m still surprised it hasn’t gone further.
Robin Hanson is the acknowledged expert on this, thanks to his long career of trying to get institutions to adopt prediction markets and generally failing. He attributes the whole problem to signaling and prestige, which I guess makes as much sense as anything. Tyler Cowen says something similar here. But I’m still surprised there aren’t consultant superforecaster think tanks hanging up their shingles. Forget prestige – aren’t there investors who would pay for their wisdom? And why can’t I hire Philip Tetlock to tell me whether my relationship is going to last?
I asked Prof. Aguirre of Metaculus, and he said (slightly edited for flow):
I don’t think Tetlock has any “secret sauce”, though I think he did a good job. The Metaculus track record is pretty good and will continue to improve. There’s definitely real predictive power. Our main challenge is that all the personnel involved are very time-limited and we’re also operating on a shoestring, probably 1/50th of what Tetlock spent of IARPA’s money.
If you are an individual company or investor, you don’t really get that much from “crowdsourcing” because you don’t really have a crowd (unless you’re a big business and can force your employees to make the predictions); so I’d guess most companies probably just fall back on asking some group of people to get together and make projections etc. My personal view is that the power really comes when you get a *lot* of people, questions, and data; you can then start to leverage that to improve the calibration (by recalibrating based on past accuracy), identify the really good predictors, and build up a large enough corpus of results that the predicted probabilities become grounded in an actual ensemble — the ensemble of questions on the platform.
Relatedly, the ability to make use of probabilistic predictions is, sadly, confined to a rather small fraction of people. I think typical decision-makers want to know what *will* happen, and 60-40 or 75-25 or even 80-20 is not the certainty they want. In the face of this uncertainty, I think people mentally substitute a different question to which they feel that they have some sort of good instinct, or privileged understanding. I think there’s also an element where they just don’t really *believe* the numbers because they are sometimes “wrong.” This sometimes frustrates me, but is not wholly irrational, I would say, because in the absence of a real grounding of the probabilities, if you have some analyst come and tell you that there’s a 70-30 chance, what exactly do you do with that? How would you possibly trust that those are the right numbers if the question is something “squishy” without a solid mathematical model?
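To make the "recalibrating based on past accuracy" idea a bit more concrete, here is a minimal Python sketch. It is not how Metaculus or the Good Judgment Project actually do it; the function names, the five equal-width bins, and the simple histogram-style correction are all assumptions chosen for illustration. It scores past forecasts with a Brier score, checks how often things forecast at a given probability really happened, and nudges a new forecast toward that observed frequency.

```python
from collections import defaultdict


def brier_score(forecasts, outcomes):
    """Mean squared error between forecast probabilities and 0/1 outcomes.
    Lower is better; always answering 0.5 scores 0.25."""
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)


def calibration_table(forecasts, outcomes, n_bins=5):
    """Group past forecasts into equal-width probability bins.
    Returns {bin_index: (mean stated probability, observed frequency, count)}."""
    bins = defaultdict(list)
    for p, o in zip(forecasts, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the top bin
        bins[idx].append((p, o))
    table = {}
    for idx, pairs in bins.items():
        mean_p = sum(p for p, _ in pairs) / len(pairs)
        freq = sum(o for _, o in pairs) / len(pairs)
        table[idx] = (mean_p, freq, len(pairs))
    return table


def recalibrate(p, table, n_bins=5):
    """Replace a new forecast with the observed frequency of its bin
    (hypothetical, deliberately crude correction). If there is no history
    for that bin, return the forecast unchanged."""
    idx = min(int(p * n_bins), n_bins - 1)
    if idx not in table:
        return p
    return table[idx][1]


if __name__ == "__main__":
    # Made-up history: ten questions forecast at 90%, of which seven resolved yes.
    past_forecasts = [0.9] * 10
    past_outcomes = [1] * 7 + [0] * 3
    table = calibration_table(past_forecasts, past_outcomes)
    print(brier_score(past_forecasts, past_outcomes))
    print(recalibrate(0.95, table))
```

With only ten made-up questions the correction is meaningless, which is rather the point of the quote: the observed frequency in each bin only becomes trustworthy once the platform has resolved a large number of questions.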
I wonder if there’s data about how accuracy changes with number of predictors and predictor quality. There are so many smart people around who are interested in probability and willing to answer a few questions that this seems like a really stupid bottleneck. I’m happy I can do my part pushing Metaculus, but someone seriously needs to find a way to make this scale.
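In lieu of real data, a toy simulation can at least show what shape the answer might take under very generous assumptions. Everything here is hypothetical: each question has a "true" probability, each predictor sees that probability plus independent Gaussian noise (`noise_sd` standing in for predictor quality), and the crowd forecast is the plain average. Real forecasters share biases and anchor on each other, so this probably overstates the benefit of adding people.

```python
import random


def simulate_crowd_brier(crowd_sizes, n_questions=2000, noise_sd=0.2, seed=0):
    """Return {crowd size: Brier score of the averaged forecast}.
    Toy model only: independent, unbiased, equally noisy predictors."""
    rng = random.Random(seed)
    # Each hypothetical question has a true probability and a realized 0/1 outcome.
    truths = [rng.random() for _ in range(n_questions)]
    outcomes = [1 if rng.random() < q else 0 for q in truths]
    results = {}
    for n in crowd_sizes:
        total = 0.0
        for q, o in zip(truths, outcomes):
            # Each predictor reports the truth plus noise, clipped to [0.01, 0.99].
            guesses = [min(0.99, max(0.01, rng.gauss(q, noise_sd))) for _ in range(n)]
            crowd = sum(guesses) / n
            total += (crowd - o) ** 2
        results[n] = total / n_questions
    return results


if __name__ == "__main__":
    for size, score in simulate_crowd_brier([1, 5, 25, 100]).items():
        print(size, round(score, 4))
```

In this model the score improves quickly for the first handful of predictors and then flattens out, because averaging removes the independent noise but cannot beat the irreducible uncertainty of the questions themselves. Lowering `noise_sd` is the stand-in for better predictors. Whether real crowds behave like this is exactly the empirical question.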
I wonder if allowing comments on the site will affect its accuracy due to a recency bias invoked by the top comments.
+1 Meta points to you, sir!
Also, to answer your question: most likely yes. While debating the subject and adding a variety of views may in fact increase overall accuracy, the recency bias effect caused by the limitations of discussion boards is very likely to produce a larger magnitude effect in the opposite direction. The benefits of discussion are much diminished if people do not take in the whole of the discussion.
I use the site frequently. Based on my experience, this is a valid concern, though I’m not sure that the recency bias is as much of an issue as the lack of comments.
Also, there’s going to be an even more significant bias from whoever is the first to submit a prediction.
Meanwhile, back in 1977 someone is wondering why more businesses are not adopting personal microcomputers given their obvious productivity benefits. Technology takes time to disseminate, so the question you need to be asking is not why this specific technology is taking time to disseminate, but why it takes time in general. Likely it’s some complicated mix of signalling, prestige, institutional inertia, and a general feeling that these methods are still unproven. The last one is very important because nobody wants to be the fool who advocated for Hot New Thing over Proven Old Method only to have it blow up in their face.
Also, few people have even heard of Philip Tetlock’s superforecasting experiments, and fewer still are going to be convinced by them. Knowledge also takes time to disseminate, and people are notoriously hard to convince. Superforecasting was published in 2015, for crying out loud; this is like being shocked that businessmen were still using dumb phones in 2009.
Here’s my excuse. I eagerly hopped on the bandwagon of forecasting and prediction markets, and tried talking to lots of rationalists and effective altruists about it over the last couple of years. It’s one of those ideas that few in the rationality community besides Robin Hanson actually took up and did something about. You, Scott, have blogged about it too. Frankly, I expect it wasn’t being said by the right people in the community in the right way to be taken seriously. The rationalist diaspora is hard to read these days, and to the extent the rationality community runs on a heuristic of concluding the next big idea must be what everyone else in the community is currently talking about, it’s hard to tell why some memes get taken more seriously than others.
There was a time last year when the rationality community’s interest in Kegan levels spiked for a while. I wish I knew how to get people excited about prediction markets and forecasting the way they get excited about Kegan levels. Anyway, I gave up on trying to get people to care about forecasting and prediction markets, as I wasn’t well-placed to do something all by myself, and I was hoping Metaculus would take off. Maybe that’s happening now. I’ll believe people will stop making excuses for not doing anything about forecasting when I see it, though.