Dylan Matthews writes a critique of effective altruism. There is much to challenge in it, and some has already been challenged by people like Ryan Carey. Perhaps I will go into it at more length later. But for now I want to discuss a specific argument of Matthews’. He writes – and I am editing liberally to keep it short, so be sure to read the whole thing:
Nick Bostrom — the Oxford philosopher who popularized the concept of existential risk — estimates that about 10^54 human life-years (or 10^52 lives of 100 years each) could be in our future if we both master travel between solar systems and figure out how to emulate human brains in computers.
Even if we give this 10^54 estimate “a mere 1% chance of being correct,” Bostrom writes, “we find that the expected value of reducing existential risk by a mere one billionth of one billionth of one percentage point is worth a hundred billion times as much as a billion human lives.”
Put another way: The number of future humans who will never exist if humans go extinct is so great that reducing the risk of extinction by 0.00000000000000001 percent can be expected to save 100 billion more lives than, say, preventing the genocide of 1 billion people. That argues, in the judgment of Bostrom and others, for prioritizing efforts to prevent human extinction above other endeavors. This is what X-risk obsessives mean when they claim ending world poverty would be a “rounding error.”
These arguments give a false sense of statistical precision by slapping probability values on beliefs. But those probability values are literally just made up. Maybe giving $1,000 to the Machine Intelligence Research Institute will reduce the probability of AI killing us all by 0.00000000000000001. Or maybe it’ll make it only cut the odds by 0.00000000000000000000000000000000000000000000000000000000000000001. If the latter’s true, it’s not a smart donation; if you multiply the odds by 10^52, you’ve saved an expected 0.0000000000001 lives, which is pretty miserable. But if the former’s true, it’s a brilliant donation, and you’ve saved an expected 100,000,000,000,000,000,000,000,000,000,000,000 lives.
I don’t have any faith that we understand these risks with enough precision to tell if an AI risk charity can cut our odds of doom by 0.00000000000000001 or by only 0.00000000000000000000000000000000000000000000000000000000000000001. And yet for the argument to work, you need to be able to make those kinds of distinctions.
Matthews correctly notes that this argument – often called “Pascal’s Wager” or “Pascal’s Mugging” – is on very shaky philosophical ground. The AI risk movement generally agrees, and neither depends on it nor uses it very often. Nevertheless, this is what Matthews wants to discuss. So let’s discuss it.
His argument is that sure, it looks like fighting existential risk and saving 10^52 people is important. But that depends on exactly how small the chance of your anti-x-risk plan working is. He gives two different possibilities which, if you count the zeroes, turn out to be 10^-17 and 10^-66. Then he asks: which one is it, 10^-17 or 10^-66? We just don’t know.
Well, actually, we do know. It’s probably not the 10^-66 one, because nothing is ever 10^-66 and you should never use that number.
Let me try to justify this.
Consider which of the following seems intuitively more likely:
First, that a well-meaning person donates $1000 to MIRI or FLI or FHI, this aids their research and lobbying efforts, and as a result they are successfully able to avert an unfriendly superintelligence.
Or second, that despite our best efforts, a research institute completes an unfriendly superintelligence. They are seconds away from running the program for the first time when, just as the lead researcher’s finger hovers over the ENTER key, a tornado roars into the laboratory. The researcher is sucked high into the air. There he is struck by a meteorite hurtling through the upper atmosphere, which knocks him onto the rooftop of a nearby building. He survives the landing, but unfortunately at precisely that moment the building is blown up by Al Qaeda. His charred corpse is flung into the street nearby. As the rubble settles, his face is covered by a stray sheet of newspaper; the headline reads 2016 PRESIDENTIAL ELECTION ENDS WITH TRUMP AND SANDERS IN PERFECT TIE. In small print near the bottom it also lists the winning Powerball numbers, which perfectly match those on a lottery ticket in the researcher’s pocket. Which is actually kind of funny, because he just won the same lottery last week.
Well, the per-second probability of getting sucked into the air by a tornado is 10^-12; that of being struck by a meteorite 10^-16; that of being blown up by a terrorist 10^-15. The chance of the next election being Sanders vs. Trump is 10^-4, and the chance of an election ending in an electoral tie about 10^-2. The chance of winning the Powerball is 10^-8 so winning it twice in a row is 10^-16. Chain all of those together, and you get 10^-65. On the other hand, Matthews thinks it’s perfectly reasonable to throw out numbers like 10^-66 when talking about the effect of x-risk donations. To take that number seriously is to assert that the second scenario is ten times more likely than the first!
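For anyone who wants to check the chaining, here is a minimal sketch of the arithmetic. It assumes the events are independent, so that probabilities multiply and base-10 exponents simply add; the dictionary labels and variable names are my own illustrative choices, and the figures are the rough order-of-magnitude guesses quoted above, not measured rates.

```python
# Order-of-magnitude exponents (base 10) for each event, as quoted in the text.
# The labels are illustrative dictionary keys, nothing more.
log10_probs = {
    "tornado (per second)": -12,
    "meteorite strike": -16,
    "terrorist bombing": -15,
    "Sanders vs. Trump election": -4,
    "electoral tie": -2,
    "Powerball twice in a row": -16,  # 10^-8 for one win, squared
}

# Assumption: the events are independent, so probabilities multiply
# and their exponents add.
chained_exponent = sum(log10_probs.values())

# The smaller of the two figures Matthews floats.
matthews_exponent = -66

# How many times more likely the chained scenario is than a 10^-66 event.
ratio = 10 ** (chained_exponent - matthews_exponent)

print(f"chained scenario: 10^{chained_exponent}")
print(f"chained scenario is {ratio}x as likely as a 10^{matthews_exponent} event")
```

Working in exponents rather than raw floats keeps the sum exact and makes it obvious that each extra zero costs a full order of magnitude.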
In Made Up Statistics, I discuss how sometimes our system one intuitive reasoning and system two mathematical reasoning can act as useful checks on each other. A commenter described this as “sometimes it’s better to pull numbers out of your ass and use them to get an answer, than to pull an answer out of your ass.”
A good example of this is 80,000 Hours’ page on why people shouldn’t get too excited about medicine as an altruistic career (oops). They argue that the good a doctor does by treating illnesses is minimal compared to the good she can do by earning to give. Their reasoning goes like this: the average doctor saves 4 QALYs a year through medical interventions. The average doctor’s salary is $150,000 or so; if she donates 10% to charity, that’s $15,000. As per GiveWell, that kind of money could save 300 QALYs per year. The value of the earning to give is so much higher than the value of the actual doctoring that you might as well skip the doctoring entirely and go into whatever earns you the most money.
Intuitively, people’s system 1s think “Doctor? That’s something where you’re saving lots of lives, so it must be a good altruistic career choice.” But then when you pull numbers out of your ass, it turns out not to be. Crucially, exactly which numbers you pull out of your ass doesn’t matter much as long as they’re remotely believable. 80,000 Hours tried their best to figure out how many QALYs doctors save per year, but this is obviously a really difficult question and for all we know they could be off by an order of magnitude. The point is, it doesn’t matter. They could be off by a figure of ten times, twenty times, even fifty times and it wouldn’t affect their argument. I’ve gone over their numbers with them and it’s really, really, really hard to remotely believably make the “number of QALYs saved per doctor” figure come out high enough to challenge the earning-to-give route. Sure, you’re pulling numbers out of your ass, but even your ass has some standards.
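To see how much slack the doctor argument has, here is a rough sensitivity check. It only uses the figures quoted above (4 QALYs a year from doctoring, a $150,000 salary, a 10% donation, 300 QALYs a year from that donation). The error factors and variable names are hypothetical choices of mine, and the sketch assumes the only mistake is an underestimate of the doctoring figure.

```python
# Figures quoted in the text, attributed there to 80,000 Hours and GiveWell.
QALYS_FROM_DOCTORING = 4     # per year, through medical interventions
SALARY = 150_000             # dollars per year, "or so"
DONATION_RATE = 10           # percent of salary given to charity
QALYS_FROM_DONATING = 300    # per year, for a donation of that size

# Integer arithmetic keeps the dollar figure exact.
donation = SALARY * DONATION_RATE // 100

# Hypothetical factors by which the doctoring estimate might be too low.
error_factors = [1, 10, 20, 50]

# Does earning to give still come out ahead after each correction?
earning_to_give_wins = {
    factor: QALYS_FROM_DONATING > QALYS_FROM_DOCTORING * factor
    for factor in error_factors
}

# The doctoring figure would need to be this many times too low
# before the two routes break even.
break_even_factor = QALYS_FROM_DONATING / QALYS_FROM_DOCTORING

print(f"donation: ${donation:,}")
for factor, wins in earning_to_give_wins.items():
    adjusted = QALYS_FROM_DOCTORING * factor
    print(f"off by {factor}x -> {adjusted} QALYs from doctoring; donating still wins: {wins}")
print(f"break-even error factor: {break_even_factor}")
```

On these numbers the estimate has to be wrong by well over an order of magnitude before the conclusion flips, which is the sense in which the exact made-up figure doesn't matter much.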
It’s the same with Matthews’ estimates about x-risk. He intuitively thinks that x-risk charities can’t be that great compared to fighting global poverty or whatever other good cause. He (very virtuously) decides to double-check that assumption with numbers, even if he has to make up the numbers himself. The problem is, he doesn’t have a very good feel for numbers of that size, so he thinks he can literally make up whatever numbers he wants, instead of doing something that we jokingly call “making up whatever number you want” but which in fact involves some sanity checks to make sure they’re remotely believable proxies for our intuitions. He thinks “I don’t expect x-risk charities to work very well, so what the heck, I might as well call that 10^-66”, whereas he should be thinking something like “10^-66 means about ten times less likely than my chance of getting tornado-meteor-terrorist-double-lottery-Trumped in any particular second, is that a remotely believable approximation to how unlikely I think existential risk is?”
Just as it is very hard to come up with a remotely believable number that spoils 80,000 Hours’ anti-doctor argument, so you have to really really stretch your own credulity to come up with numbers where Bostrom’s x-risk argument doesn’t work.
(Some people argue that LW-style rationality is a bad idea, because you can’t really think with probabilities. I would argue that even if that’s true, there is at least a small role for rationality in avoiding being bamboozled by other people trying to think with probabilities and doing it wrong. This is a modest claim, but no more modest than Wittgenstein’s view of philosophy, which was that it was a useful thing to know in order to protect yourself from taking philosophers too seriously.)
But one more point. Suppose Matthews’ intuition is indeed that the chance of AI risk charities working out is precisely ten times less than his per-second chance of getting tornado-meteor-terrorist-double-lottery-Trumped. In that case, I offer him the following whatever-the-opposite-of-a-gift is: we can predict pretty precisely the yearly chance of a giant asteroid hitting our planet, it’s way more than 10^-66, and the whole x-risk argument applies to it just as well as to AI or anything else. What now?
Because this isn’t just about defending the particular proposition of AI. It’s a more general principle of staring into the darkness. If you try to be good, if you don’t let yourself fiddle with your statistical intuitions until they give you the results you want, sometimes you end up with weird or scary results.
Like that a person who wants to cure as much disease as possible would be better off becoming a hedge fund manager than a doctor.
Or that your charity dollar would be better sent off to sub-Saharan Africa to purchase something called “praziquantel” than given to the sad-looking man with the cardboard sign you see on the way to work.
Or that a person who wants to reduce suffering in the world should focus almost obsessively on chickens.
One of the founding beliefs of effective altruism is that when math tells you something weird, you at least consider trusting the math. If you’re allowed to just add on as many zeroes as it takes to justify your original intuition, you miss out on the entire movement.
Everyone has their own idea of what trusting the math entails and how far they want to go with it. Some people go further than I do. Other people go less far. But anybody who makes a good-faith effort to trust it even a little is, in my opinion, an acceptable ally worth including in the effective altruist tent. They have abandoned a nice safe chance to donate to the local symphony and feel good about themselves, in favor of a life of feeling constantly uncomfortable with their decisions, looking extremely silly to normal people, and having Dylan Matthews write articles in Vox calling them “white male autistic nerds”.
Matthews is firmly underneath the effective-altruist tent. He writes that he’s worried that a focus on existential risk will detract from the causes he really cares about, like animal rights. He gets very, very excited about animal rights, and at Vox he’s done some incredible work promoting them. Good! I also donate to animal rights charities and I think we need far more people who do that.
And yet, the same arguments he deploys against existential risk could be leveled against him also – “how can you worry about chickens when there are millions of families trying to get by on minimum wage? Effective altruists need to stop talking about animals if they ever want to attract anybody besides white males into the movement.” What then?
Malcolm Muggeridge describes a vision he once had, of everyone in the world riding together on a giant train toward realms unknown. Each person wants to get off at their own stop, but when the train comes to their station, the engineer speeds right by. All the other passengers laugh and hoot and sing the praises of the engineer, because this means the train will get to their own stations faster. But of course each one finds that when the train comes to their station, why, it speeds past that one too, and they are left to rage impotently at the unfairness.
And I worry that Matthews is urging us to shoot past the “existential risk” station in order to get to the “animal rights” station a little faster, without reflecting on the likely consequences.
This certainly isn’t to say we all need to get off at the first station. I myself am very interested in existential risk, but I give less than a third of my donations to x-risk related charities (no, I can’t justify this, it’s a sanity-preserving exception). I respect those who give more. I also respect those who give less. Existential risk isn’t the most useful public face for effective altruism – everyone including Eliezer Yudkowsky agrees about that. But at least allowing people interested in x-risk into the tent and treating them respectfully seems like an inescapable consequence of the focus on reason and calculation that started effective altruism in the first place.