A Failure, But Not Of Prediction

I.

Vox asks What Went Wrong With The Media’s Coronavirus Coverage? They conclude that the media needs to be better at “not just saying what we do know, but what we don’t know”. This raises some important questions. Like: how much ink and paper is there in the world? Are we sure it’s enough? But also: how do you become better at saying what you don’t know?

In case you’ve been hiding under a rock recently (honestly, valid) the media not only failed to adequately warn its readers about the epidemic, but actively mocked and condescended to anyone who did sound a warning. Real Clear Politics has a list of highlights. The Vox tweet saying “Is this going to be a deadly pandemic? No.” The Washington Post telling us in February “Why we should be wary of an aggressive government response to coronavirus” (it might “scapegoat marginalized populations”). The Daily Beast complaining that “coronavirus, with zero American fatalities, is dominating headlines, while the flu is the real threat”. The New York Times, weighing in with articles like “The pandemic panic” and “Who says it’s not safe to travel to China”. The constant attempts to attribute “alarmism” over the virus to anti-Chinese racism. Etc, etc, etc.

One way people have summed this up is that the media (and the experts they relied on) did a terrible job predicting what would happen. I think this lets them off too easy.

Prediction is very hard. Nate Silver is maybe the best political predictor alive, and he estimated a 29% chance of Trump winning just before Trump won. UPenn professor Philip Tetlock has spent decades identifying “superforecasters” and coming up with complicated algorithms for aggregating their predictions, developing a prediction infrastructure that beats top CIA analysts, but they estimated a 23% chance Britain would choose Brexit just before it happened. This isn’t intended to criticize Silver or Tetlock. I believe they’re operating at close to optimum – the best anyone could possibly do with the information that they had. But the world is full of noise, and tiny chance events can have outsized effects, and there are only so many polls you can scrutinize, and even geniuses can only do so well.

Predicting the coronavirus was equally hard, and the best institutions we had missed it. On February 20th, Tetlock’s superforecasters predicted only a 3% chance that there would be 200,000+ coronavirus cases a month later (there were). The stock market is a giant coordinated attempt to predict the economy, and it reached an all-time high on February 12, suggesting that analysts expected the economy to do great over the following few months. On February 20th it fell in a way that suggested a mild inconvenience to the economy, but it didn’t really start plummeting until mid-March – the same time the media finally got a clue. These aren’t empty suits on cable TV with no skin in the game. These are the best predictive institutions we have, and they got it wrong. I conclude that predicting the scale of coronavirus in mid-February – the time when we could have done something about it – was really hard.

I don’t like this conclusion. But I have to ask myself – if it was so easy, why didn’t I do it? It’s easy to look back and say “yeah, I always secretly knew it would be pretty bad”. I did a few things right – I started prepping half-heartedly in mid-February, I recommended my readers prep in early March, I never criticized others for being alarmist. Overall I give myself a solid B-. But if it was so easy, why didn’t I post “Hey everyone, I officially predict the coronavirus will be a nightmarish worldwide pandemic” two months ago? It wouldn’t have helped anything, but I would have had bragging rights forever. For that matter, why didn’t you post this – on Facebook, on Twitter, on the comments here? You could have gone down in legend, alongside Travis W. Fisher, for making a single tweet. Since you didn’t do that (aside from the handful of you who did – we love you, Balaji) I conclude that predicting it was hard, even for smart and well-intentioned people like yourselves.

Does that mean we can’t put everyone’s heads on spikes outside the Capitol Building as a warning for future generations? I would be very disappointed if it meant that. I think we can still put heads on spikes. We just have to do it for more subtle, better-thought-out reasons.

II.

I used to run user surveys for a forum on probabilistic reasoning.

(I promise this will become relevant soon)

A surprising number of these people had signed up for cryonics – the thing where they freeze your brain after you die, in case the future invents a way to resurrect frozen brains. Lots of people mocked us for this – “if you’re so good at probabilistic reasoning, how can you believe something so implausible?” I was curious about this myself, so I put some questions on one of the surveys.

The results were pretty strange. Frequent users of the forum (many of whom had pre-paid for brain freezing) said they estimated there was a 12% chance the process would work and they’d get resurrected. A control group with no interest in cryonics estimated a 15% chance. The people who were doing it were no more optimistic than the people who weren’t. What gives?

I think they were actually good at probabilistic reasoning. The control group said “15%? That’s less than 50%, which means cryonics probably won’t work, which means I shouldn’t sign up for it.” The frequent user group said “A 12% chance of eternal life for the cost of a freezer? Sounds like a good deal!”
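The difference between the two groups can be sketched as a toy expected-value calculation. A minimal sketch, assuming illustrative numbers: the probabilities are the survey’s, but the dollar figures are invented purely to show the structure of the reasoning.

```python
# Toy expected-value sketch of the two groups' reasoning.
# The 12% and 15% figures are from the survey; the dollar amounts
# are illustrative assumptions -- only the structure matters.

def expected_value(p_percent, value_if_works, cost):
    """Probability-weighted payoff of signing up, minus the cost."""
    return p_percent * value_if_works / 100 - cost

# Suppose (purely for illustration) resurrection is worth $10M to you
# and cryonics costs $100k up front.
VALUE = 10_000_000
COST = 100_000

control = expected_value(15, VALUE, COST)   # "15% < 50%, so it won't work"
frequent = expected_value(12, VALUE, COST)  # "12% of eternal life? Good deal!"

# Under these assumptions BOTH estimates imply a hugely positive
# expected value -- the decision turns on the payoff structure,
# not on whether the probability clears 50%.
print(control, frequent)  # 1400000.0 1100000.0
```

The point of the sketch is that the two groups barely disagreed on the probability; they disagreed on what a probability is *for*.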

There are a lot of potential objections and complications – for one thing, maybe both those numbers are much too high. You can read more here and here. But overall I learned something really important from this.

Making decisions is about more than just having certain beliefs. It’s also about how you act on them.

III.

A few weeks ago, I wrote a blog post on face masks. It reviewed the evidence and found that they probably helped prevent the spread of disease. Then it asked: how did the WHO, CDC, etc get this so wrong?

I went into it thinking they’d lied to us, hoping to prevent hoarders from buying up so many masks that there weren’t enough for health workers. Turns out that’s not true. The CDC has been singing the same tune for the past ten years. Swine flu, don’t wear masks. SARS, don’t wear masks. They’ve been really consistent on this point. But why?

If you really want to understand what happened, don’t read any studies about face masks or pandemics. Read Smith & Pell (2003), Parachute Use To Prevent Death And Major Trauma Related To Gravitational Challenge: Systematic Review Of Randomized Controlled Trials. It’s an article in the British Medical Journal pointing out that there have never been any good studies proving that parachutes are helpful when jumping out of a plane, so they fail to meet the normal standards of evidence-based medicine. From the Discussion section:

It is a truth universally acknowledged that a medical intervention justified by observational data must be in want of verification through a randomised controlled trial. Observational studies have been tainted by accusations of data dredging, confounding, and bias. For example, observational studies showed lower rates of ischaemic heart disease among women using hormone replacement therapy, and these data were interpreted as advocating hormone replacement for healthy women, women with established ischaemic heart disease, and women with risk factors for ischaemic heart disease. However, randomised controlled trials showed that hormone replacement therapy actually increased the risk of ischaemic heart disease, indicating that the apparent protective effects seen in observational studies were due to bias. Cases such as this one show that medical interventions based solely on observational data should be carefully scrutinised, and the parachute is no exception.

Of course this is a joke. It’s in the all-joke holiday edition of BMJ, and everyone involved knew exactly what they were doing. But the joke is funny because it points at something true. It’s biting social commentary. Doctors will not admit any treatment could possibly be good until it has a lot of randomized controlled trials behind it, common sense be damned. This didn’t come out of nowhere. They’ve been burned lots of times before by thinking they were applying common sense and getting things really wrong. And after your mistakes kill a few thousand people you start getting really paranoid and careful. And there are so many quacks who can spout off some “common sense” explanation for why their vitamin-infused bleach or colloidal silver should work that doctors have just become immune to that kind of bullshit. Multiple good RCTs or it didn’t happen. Given the history I think this is a defensible choice, and if you are tempted to condemn it you may find this story about bone marrow transplants enlightening.

But you can take this too far. After highlighting the lack of parachute RCTs, the paper continues:

Only two options exist. The first is that we accept that, under exceptional circumstances, common sense might be applied when considering the potential risks and benefits of interventions. The second is that we continue our quest for the holy grail of exclusively evidence based interventions and preclude parachute use outside the context of a properly conducted trial. The dependency we have created in our population may make recruitment of the unenlightened masses to such a trial difficult. If so, we feel assured that those who advocate evidence based medicine and criticise use of interventions that lack an evidence base will not hesitate to demonstrate their commitment by volunteering for a double blind, randomised, placebo controlled, crossover trial.

Did you follow that? For a good parachute RCT, half the subjects would have to jump out of a plane wearing a placebo parachute. The authors suggest maybe we enlist doctors who insist too stringently on RCTs over common sense for this dubious honor.

(good news, though, a parachute RCT did eventually get done)

Sometimes good humor is a little too on the nose, like those Onion articles that come true a few years later. The real medical consensus on face masks came from pretty much the same process as the fake medical consensus on parachutes. Common sense said that they worked. But there weren’t many good RCTs. We couldn’t do more, because it would have been unethical to deliberately expose face-mask-less people to disease. In the end, all we had were some mediocre trials of slightly different things that we had to extrapolate out of range.

Just like the legal term for “not proven guilty beyond a reasonable doubt” is “not guilty”, the medical term for “not proven to work in several gold-standard randomized controlled trials” is “it doesn’t work” (and don’t get me started on “no evidence”). So the CDC said masks didn’t work.

Going back to our Goofus and Gallant diagram (Goofus, as always, doing it wrong; Gallant doing it right):

Goofus started with the position that masks, being a new idea, needed incontrovertible proof. When the few studies that appeared weren’t incontrovertible enough, he concluded that people shouldn’t wear masks.

Gallant would have recognized the uncertainty – based on the studies we can’t be 100% sure masks definitely work for this particular condition – and done a cost-benefit analysis. Common-sensically, it seems like masks probably should work. The existing evidence for masks is highly suggestive, even if it’s not utter proof. Maybe 80% chance they work, something like that? If you can buy an 80% chance of stopping a deadly pandemic for the cost of having to wear some silly cloth over your face, probably that’s a good deal. Even though regular medicine has good reasons for being as conservative as it is, during a crisis you have to be able to think on your feet.
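Gallant’s reasoning, and the rule Goofus is implicitly following, can be sketched in a few lines. The 80% figure is from the paragraph above; the cost and benefit magnitudes (and Goofus’s proof threshold) are invented assumptions, there only to show how the two decision rules diverge on the same probability.

```python
# Sketch of the two decision rules applied to the same estimate.
# p_effective = 0.80 is from the text; the other numbers are
# illustrative assumptions.

def gallant_rule(p_effective, benefit_if_effective, cost):
    """Act whenever the probability-weighted benefit exceeds the cost."""
    return p_effective * benefit_if_effective > cost

def goofus_rule(p_effective, proof_threshold=0.95):
    """Act only once the evidence is 'incontrovertible'."""
    return p_effective > proof_threshold

# Wearing a cloth mask: tiny cost, large benefit if masks work.
print(gallant_rule(0.80, benefit_if_effective=1000, cost=1))  # True
print(goofus_rule(0.80))                                      # False
```

Same 80%, opposite decisions: the disagreement is in the decision rule, not the evidence.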

IV.

But getting back to the media:

Their main excuse is that they were just relaying expert opinion – the sort of things the WHO and CDC and top epidemiologists were saying. I believe them. People on Twitter howl and gnash their teeth at this, asking why the press didn’t fact-check or challenge those experts. But I’m not sure I want to institute a custom of journalists challenging experts. Journalist Johann Hari decided to take it upon himself to challenge psychiatric experts, and wrote a series of terrible articles and a terrible book saying they were wrong about everything. I am a psychiatrist and I can tell you he is so wrong that it is physically painful to read his stuff (though of course I would say that…). Most journalists stick to assuming the experts know more about their subject of expertise than they do, and I think this is wise. The role of science journalists is primarily to relay, explain, and give context to the opinions of experts, not to try to out-medicine the doctors. So I think this is a good excuse.

But I would ask this of any journalist who pleads that they were just relaying and providing context for expert opinions: what was the experts’ percent confidence in their position?

I am so serious about this. What fact could possibly be more relevant? What context could it possibly be more important to give? I’m not saying you need to have put a number in your articles, maybe your readers don’t go for that. But were you working off of one? Did this question even occur to you?

Nate Silver said there was a 29% chance Trump would win. Most people interpreted that as “Trump probably won’t win” and got shocked when he did. What was the percent attached to your “coronavirus probably won’t be a disaster” prediction? Was it also 29%? 20%? 10%? Are you sure you want to go lower than 10%? Wuhan was already under total lockdown, they didn’t even have space to bury all the bodies, and you’re saying that there was less than 10% odds that it would be a problem anywhere else? I hear people say there’s a 12 – 15% chance that future civilizations will resurrect your frozen brain, surely the risk of coronavirus was higher than that?

And if the risk was 10%, shouldn’t that have been the headline? “TEN PERCENT CHANCE THAT THERE IS ABOUT TO BE A PANDEMIC THAT DEVASTATES THE GLOBAL ECONOMY, KILLS HUNDREDS OF THOUSANDS OF PEOPLE, AND PREVENTS YOU FROM LEAVING YOUR HOUSE FOR MONTHS”? Isn’t that a better headline than Coronavirus panic sells as alarmist information spreads on social media? But that’s the headline you could have written if your odds were ten percent!

So:

I think people acted like Goofus again.

People were presented with a new idea: a global pandemic might arise and change everything. They waited for proof. The proof didn’t arise, at least at first. I remember hearing people say things like “there’s no reason for panic, there are currently only ten cases in the US”. This should sound like “there’s no reason to panic, the asteroid heading for Earth is still several weeks away”. The only way I can make sense of it is through a mindset where you are not allowed to entertain an idea until you have proof of it. Nobody had incontrovertible evidence that coronavirus was going to be a disaster, so until someone did, they defaulted to the null hypothesis that it wouldn’t be.

Gallant wouldn’t have waited for proof. He would have checked prediction markets and asked top experts for probabilistic judgments. If he heard numbers like 10 or 20 percent, he would have done a cost-benefit analysis and found that putting some tough measures into place, like quarantine and social distancing, would be worthwhile if they had a 10 or 20 percent chance of averting catastrophe.

V.

This is at risk of getting too depressing, so I want to focus on some people who deserve recognition for especially good responses.

First, a bunch of generic smart people on Twitter who got things exactly right – there are too many of these people to name, but Scott Aaronson highlights “Bill Gates, Balaji Srinivasan, Paul Graham, Greg Cochran, Robin Hanson, Sarah Constantin, Eliezer Yudkowsky, and Nicholas Christakis.” None of these people (except Greg Cochran) are domain experts, and none of them (except Greg Cochran) have creepy oracular powers. So how could they have beaten the experts? Haven’t we been told a million times that generic intelligence is no match for deep domain knowledge?

I think the answer is: they didn’t beat the experts in epidemiology. Whatever probability of pandemic the experts and prediction markets gave for coronavirus getting really bad, these people didn’t necessarily give a higher probability. They were just better at probabilistic reasoning, so they had different reactions to the same number. There’s no reason why generic smart people shouldn’t be better at probabilistic reasoning than epidemiologists. In fact, this seems exactly like the sort of thing generic smart people might be better at.

Zeynep Tufekci is an even clearer example. She’s a sociologist and journalist who was writing about how it was “our civic duty” to prepare for coronavirus as early as February. She was also the first mainstream media figure to spread the word that masks were probably helpful.

Totally at random today, reading a blog post on the Mongol Empire like all normal people do during a crisis, I stumbled across a different reference to Zeynep. In a 2014 article, she was sounding a warning about the Ebola epidemic that was going on at the time. She was saying the exact same things everyone is saying now – global institutions are failing, nobody understands exponential growth, travel restrictions could work early but won’t be enough if it breaks out. She quoted a CDC prediction that there could be a million cases by the end of 2014. “Let that sink in,” she wrote. “A million Ebola victims in just a few months.”

In fact, this didn’t happen. There were only about 30,000 cases. The virus never really made it out of Liberia, Sierra Leone, and Guinea.

I don’t count this as a failed prediction on Zeynep’s part. First of all, because it could have been precisely because of people like her sounding the alarm that the epidemic was successfully contained. But more important, it wasn’t really a prediction at all. Her point wasn’t that she definitely knew this Ebola pandemic was the one that would be really bad. Her point was that it might be, so we needed to prepare. She said the same thing when the coronavirus was just starting. If this were a game, her batting average would be 50%, but that’s the wrong framework.

Zeynep Tufekci is admirable. But her admirable skill isn’t looking at various epidemics and successfully predicting which ones will be bad and which ones will fizzle out. She can’t do that any better than anyone else. Her superpower is her ability to treat something as important even before she has incontrovertible evidence that it has to be.

And finally, Kelsey Piper. She wrote a February 6th article saying:

The coronavirus killed fewer people than the flu did in January. But it might kill more in February — and unlike the flu, its scope and effects are poorly understood and hard to guess at. The Chinese National Health Commission reports 24,324 cases, including 3,887 new ones today. There are some indications that these numbers understate the situation, as overwhelmed hospitals in Wuhan only have the resources to test the most severe cases. As of Tuesday, 171,329 people are under medical observation because they’ve had close contact with a confirmed case.

It is unclear whether China will be able to get the outbreak under control or whether it will cause a series of epidemics throughout the country. It’s also unclear whether other countries — especially those with weak health systems — will be able to quickly identify any cases in their country and avoid Wuhan-scale outbreaks.

The point is, it’s simply too soon to assert we’ll do well on both those fronts — and if we fail, then the coronavirus death toll could well climb up into the tens of thousands. It also remains to be seen if vaccines or effective antiviral treatments will be developed. That’s just far too much uncertainty to assure people that they have nothing to worry about. And misleadingly assuring people that there’s nothing to worry about can end up doing harm.

“Instead of deriding people’s fears about the Wuhan coronavirus,” Sandman, the communications expert, writes, “I would advise officials and reporters to focus more on the high likelihood that things will get worse and the not-so-small possibility that they will get much worse.”

She concluded that “the Wuhan coronavirus likely won’t be a nightmare pandemic, but that scenario is still in play”, and followed it up with an article urging people to prepare by buying essential food and supplies.

If we interpret her “likely won’t be a nightmare pandemic” sentence as a prediction, she got the prediction wrong. Like Zeynep, she has no special ability to predict whether any given disease will end in global disaster. But that didn’t matter! She gave exactly the correct advice to institutions (prepare for a worst-case scenario, stop telling people not to panic) and exactly the correct advice to individuals (start prepping). When you’re good enough at handling uncertainty, getting your predictions exactly right becomes almost superfluous.

The Vox article says the media needs to “say what it doesn’t know”. I agree with this up to a point. But they can’t let this turn into a muddled message of “oh, who knows anything, whatever”. Uncertainty about the world doesn’t imply uncertainty about the best course of action! Within the range of uncertainty that we had about the coronavirus this February, an article that acknowledged that uncertainty wouldn’t have looked like “We’re not sure how this will develop, so we don’t know whether you should stop large gatherings or not”. It would have looked like “We’re not sure how this will develop, so you should definitely stop large gatherings.”

I worry that the people who refused to worry about coronavirus until too late thought they were “being careful” and “avoiding overconfidence”. And I worry the lesson they’ll take away from this is to be more careful, and avoid overconfidence even more strongly.

Experts should think along these lines when making their recommendations, but if they don’t, the press should think along them as part of its work of putting expert recommendations in context. I think Kelsey’s article provides a shining example of what this should look like.

Maybe other people got this right too. I’m singling out Kelsey because of a personal connection – I met her through the same probabilistic reasoning forum where I did my cryonics survey years ago. I don’t think this is a coincidence.

[Related: Book Review: The Precipice; Two Kinds Of Caution]