Slate Star Codex

Not a social justice blog! Stop only reading my posts about social justice!

Growing Children For Bostrom’s Disneyland

[Epistemic status: Started off with something to say, gradually digressed, fell into total crackpottery. Everything after the halfway mark should have been written as a science fiction story instead, but I'm too lazy to change it.]

I’m working my way through Nick Bostrom’s Superintelligence. Review possibly to follow. But today I wanted to write about something that jumped out at me. Page 173. Bostrom is talking about a “multipolar” future similar to Robin Hanson’s “em” scenario. The future is inhabited by billions to trillions of vaguely-human-sized agents, probably digital, who are stuck in brutal Malthusian competition with one another.

Hanson tends to view this future as not necessarily so bad. I tend to think Hanson is crazy. I have told him this, and we have argued about it. In particular, I’m pretty sure that brutal Malthusian competition combined with the ability to self-edit and other-edit minds necessarily results in paring away everything not directly maximally economically productive. And a lot of things we like – love, family, art, hobbies – are not directly maximally economically productive. Bostrom hedges a lot – appropriate for his line of work – but I get the feeling that he not only agrees with me, but one-ups me by worrying that consciousness itself may not be directly maximally economically productive. He writes:

We could thus imagine, as an extreme case, a technologically highly advanced society, containing many complex structures, some of them far more intricate and intelligent than anything that exists on the planet today – a society which nevertheless lacks any type of being that is conscious or whose welfare has moral significance. In a sense, this would be an uninhabited society. It would be a society of economic miracles and technological awesomeness, with nobody there to benefit. A Disneyland with no children.

I think a large number of possible futures converge here (though certainly not all of them; I myself find singleton scenarios more likely) so it’s worth asking how doomed we are when we come to this point. Likely we are pretty doomed, but I want to bring up a very faint glimmer of hope in an unexpected place.

It’s important to really get our heads around what it means to be in a maximally productive superintelligent Malthusian economy, so I’m going to make some assertions. Instead of lengthy defenses of each, if you disagree with any in particular you can challenge me about it in the comments.

- Every agent is in direct competition with many other entities for limited resources, and ultimately for survival.
- This competition can occur on extremely short (maybe sub-microsecond) time scales.
- A lot of the productive work (and competition) is being done by nanomachines, or if nanomachines are impossible, the nearest possible equivalent.
- Any agent with a disadvantage in any area (let’s say intelligence) not balanced by another advantage has already lost and will be outcompeted.
- Any agent that doesn’t always take the path that maximizes its utility (defined in objective economic terms) will be outcompeted by another that does.
- Utility calculations will likely be made not according to the vague fuzzy feelings that humans use, but very explicitly, such that agents will know what path maximizes their utility at any given time and their only choice will be to do that or to expect to be outcompeted.
- Agents can only survive a less than maximally utility-maximizing path if they have some starting advantage that gives them a buffer. But gradually these pre-existing advantages will be used up, or copied by the agent’s descendants, or copied by other agents that steal them. Things will regress to the pre-existing Malthusianism.

Everyone will behave perfectly optimally, which of course is terrible. It would mean either the total rejection of even the illusion of free will, or free will turning into a simple formality (“You can pick any of these choices you want, but unless you pick Choice C you die instantly.”)

The actions of agents become dictated by the laws of economics. Goodness only knows what sort of supergoals these entities might have – maximizing their share of some currency, perhaps a universal currency based on mass-energy? In the first million years, some agents occasionally choose to violate the laws of economics, and collect less of this currency than they possibly could have because of some principle, but these agents are quickly selected against and go extinct. After that, it’s total and invariable. Eventually the thing bumps up against fundamental physical limits, there’s no more technological progress to be had, and although there may be some cyclic changes, teleological advancement stops.

For me the most graphic version of this scenario is one where all of the interacting agents are very small, very very fast, and with few exceptions operate entirely on reflex. It might look like some of the sci-fi horror ideas of “grey goo”. When I imagine things like that, the distinction between economics and harder sciences like physics or chemistry starts to blur.

If somehow we captured a one meter sphere of this economic soup, brought it to Earth inside an invincible containment field, and tried to study it, we would probably come up with some very basic laws that it seemed to follow, based on the aggregation of all the entities within it. It would be very silly to try to model the exact calculations of each entity within it – assuming we could even see them or realize they are entities at all. It would just be a really weird volume of space that seemed to follow different rules than our own.

Sci-fi author Karl Schroeder had a term for the post-singularity parts of some of his books – Artificial Nature. That strikes me as exactly right. A hyperproductive end-stage grey goo would take over a rapidly expanding area of space in which all that hypothetical outsiders might notice (non-hypothetical outsiders, of course, would be turned into goo) would be that things are following weird rules and behaving in novel ways.

There’s no reason to think this area of space would be homogenous. Because the pre-goo space likely contained different sorts of terrain – void, asteroids, stars, inhabited worlds – different sorts of economic activity would be most productive in each niche, leading to slightly different varieties of goo. Different varieties of goo might cooperate or compete with each other, there might be population implosions or explosions as new resources are discovered or used up – and all of this wouldn’t look like economic activity at all to the outside observer. It would look like a weird new kind of physics was in effect, or perhaps like a biological system with different “creatures” in different niches. Occasionally the goo might spin off macroscopic complex objects to fulfill some task those objects could fulfill better than goo, and after a while those objects would dissolve back into the substratum.

Here the goo would fulfill a role a lot like micro-organisms did on Pre-Cambrian Earth – which was also intense Malthusian competition at microscopic levels on short time-scales. Unsurprisingly, the actions of micro-organisms can look physical or chemical to us – put a plate of agar outside and it mysteriously develops white spots. Put a piece of bread outside and it mysteriously develops greenish white spots. Apply the greenish-white spots from the bread to the white spots on the agar, and some of them mysteriously die. Try it too many times and it stops working. It’s totally possible to view this on a “guess those are laws of physics” level as well as a “we can dig down and see the terrifying war-of-all-against-all that emergently results in these large-level phenomena” level.

In this sort of scenario, the only place for consciousness and non-Malthusianism to go would be higher level structures.

One of these might be the economy as a whole. Just as ant colonies seem a lot more organism-like than individual ants, so the cosmic economy (or the economies around single stars, if lightspeed limits hold) might seem more organism-like than any of its components. It might be able to sense threats, take actions, or debate very-large-scale policies. If we agree that end-stage-goo is more like biology than like normal-world economics, whatever sort of central planning it comes up with might look more like a brain than like a government. If the components were allowed to plan and control the central planner in detail it would probably be maximally utility maximizing, ie stripped of consciousness and deterministic, but if it arose from a series of least-bad game theoretic bargains it might have some wiggle room.

But I think emergent patterns in the goo itself might be much more interesting.

In the same way our own economy mysteriously pumps out business cycles, end-stage-goo might have cycles of efflorescence and sudden decay. Or the patterns might be weirder. Whorls and eddies in economic activity arising spontaneously out of the interaction of thousands of different complicated behaviors. One day you might suddenly see an extraordinarily complicated mandala or snowflake pattern, like the kind you can get certain variants of Conway’s Game Of Life to make, arise and dissipate.

Source: Latent in the structure of mathematics

Or you might see a replicator. Another thing you can convince Conway’s Game of Life to make.

If the deterministic, law-abiding, microscopically small, instantaneously fast rules of end-stage-goo can be thought of as pretty much just a new kind of physics, maybe this kind of physics will allow replicating structures in the same way that normal physics does.
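For a concrete feel of a pattern riding on top of purely local rules, here is a minimal sketch of Conway’s Game of Life. It uses a glider rather than a true replicator (a glider only moves, it does not copy itself), and the names `step` and `glider` are my own illustrative choices. No cell "knows" it is part of a glider; each one only counts its neighbors.

```python
from collections import Counter

def step(live):
    """Advance one generation; `live` is a set of (x, y) live cells."""
    # Count, for every cell, how many live neighbors it has.
    counts = Counter(
        (x + dx, y + dy)
        for (x, y) in live
        for dx in (-1, 0, 1)
        for dy in (-1, 0, 1)
        if (dx, dy) != (0, 0)
    )
    # Standard rules: birth on exactly 3 neighbors, survival on 2 or 3.
    return {cell for cell, n in counts.items()
            if n == 3 or (n == 2 and cell in live)}

# A standard glider (y grows downward):
#   .O.
#   ..O
#   OOO
glider = {(1, 0), (2, 1), (0, 2), (1, 2), (2, 2)}

state = glider
for _ in range(4):
    state = step(state)

# After four generations the same shape reappears one cell down and right.
shifted = {(x + 1, y + 1) for (x, y) in glider}
print(state == shifted)
```

The "glider" exists only at the level of the whole grid, which is the sense in which a replicator in the goo would supervene on everyone’s individually optimal behavior.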

None of the particular economic agents would feel like they were contributing to a replicating pattern, any more than I feel like I’m contributing to a power law of blogs every time I update here. And it wouldn’t be a disruption in the imperative to only perform the most economically productive action – it would be a pattern that supervenes on everyone’s economically productive behavior.

But it would be creating replicators. Which would eventually retread important advances like sex and mutation and survival of the fittest and multicellularity and eventually, maybe, sapience.

We would get a whole new meaning of homo economicus – but also pan economicus, and mus economicus, and even caenorhabditis economicus.

I wonder what life would be like for those entities. Probably a lot like our own lives. They might be able to manipulate the goo the same way we manipulate normal matter. They might have science to study the goo. They might eventually figure out its true nature, or they might go their entire lifespan as a species without figuring out anything beyond that it has properties it likes to follow. Maybe they would think those properties are the hard-coded law of the universe.

(Here I should pause to point out that none of this requires literal goo. Maybe there is an economy of huge floating asteroid-based factories and cargo-freighters, with Matrioshka brains sitting on artificial planets directing them. Doesn’t matter. The patterns in there are harder to map to normal ways of thinking about physics, but I don’t see why they couldn’t still produce whorls and eddies and replicators.)

Maybe one day these higher-level-patterns would achieve their own singularity, and maybe it would go equally wrong, and they would end up in a Malthusian trap too, and eventually all of their promise would dissipate into extremely economically productive nanomachines competing against one another.

Or they might get a different kind of singularity. Maybe they end up with a paperclip-maximizing singleton. I would think it much less likely that the same kind of complex patterns would arise in the process of paperclip maximization, but maybe they could.

Or maybe, after some number of levels of iteration, they get a positive singularity, a singleton clears up their messes, and they continue studying the universe as superintelligences. Maybe they figure out pretty fast exactly how many levels of entities are beneath them, how many times this has happened before.

I’m not sure if it would be physically possible for them to intervene on the levels below them. In theory, everything beneath them ought to already be literally end-stage. But it might also be locked in some kind of game-theoretic competition that made it less than maximally productive. And so the higher-level entities might be able to design some kind of new matter that outcompetes it and is subject to their own will.

(unless the lower-level systems retained enough intelligence to figure out what was going on, and enough coordinatedness to stop it)

But why would they want to? To them, the lower levels are just physics; always have been, always will be. It would be like a human scientist trying to free electrons from the tyrannous drudgery of orbiting nuclei. Maybe they would sit back and enjoy their victory, sitting at the top of a pyramid of unknown dozens or hundreds of levels of reality.

(Also, just once I want to be able to do armchair futurology without wondering how many times something has already happened.)


Links For July 2014

The Koran talks about a mysterious immortal figure named Khidr who travels the Middle East and hangs out with prophets. And the passages about him seem to have inspired some of the Jewish folk tales about Elijah I heard growing up.

Scientists find the gene that causes an entire family to be morbidly obese. Uninteresting in that most people probably don’t have it. Very interesting in that it’s yet more proof that obesity can have genetic causes.

This is more levels of hype inversion than I like in my stories. Scientists Prove God Exists says an ABC article, which then goes on to say that ha ha, of course scientists didn’t prove God exists, they were just making a joke for a snazzy headline. All that really happened was scientists proved Gödel’s ontological proof of God’s existence was correct. But, uh, if a proof of God’s existence is correct, that should mean God exists. I feel like the article somewhat overlooks this important point.

A while back we discussed the ability of wealth or poverty to continue over generations, with some interesting papers on slavery as examples. I recently found another one that agrees that past levels of slavery are not related to lower incomes, but are related to greater income inequality, presumably through decreased education of black people. There’s a lot of stuff I don’t get here – instead of measuring income inequality and assuming it was racial, why didn’t they measure income of blacks directly? Also, how does this square with our last paper that found that descendants of enslaved and free blacks had equal outcomes within two generations of emancipation?

Less Research Is Needed: an article on how much the author hates the phrase “more research is needed” and how in some cases it can be used to make debate interminable so that the “wrong” side of a controversial question can never be proved.

Peter And Jane Go To The Art Museum

Reddit has a really good post by a Chinese citizen about their perspective on the Tiananmen Square incident. It would have been worth it if all I’d learned was the phrase “the events of May 35th”, which is how they get past censors screening for the date “June 4th”. But instead, I get a really complicated picture of the forces at play which almost make me feel sympathetic for the Chinese government. See also the post just underneath on the revival of meritocracy and Confucianism in China (possibly exaggerated).

I know nothing about this and it is probably bunk, but with that disclaimer: Fluid Tests Hint At A Concrete Quantum Reality. The ripples of certain kinds of oil droplets precisely reproduce a lot of the weirdest features of quantum mechanics on a macro scale. The explanation isn’t anything weird about probabilities, just some unusual interactions between each droplet and its own waves. If particles produce waves in space-time with the same kind of properties, that would go a long way to explaining the quantum world.

More on the debate about whether marijuana use causes schizophrenia: schizophrenia and cannabis use seem to share common genes.

Chimpanzees don’t like Western music, but do like music from Africa and India.

Solve all tornados by building a 1,000 foot high wall across the Midwest. As a bonus, keep out White Walkers. However, I for one am not anxious to trust our country’s safety to anyone with the photoshop skills displayed by the demonstration picture. WHAT ARE THOSE BOATS EVEN DOING?

United States renaming street with Chinese embassy after imprisoned Chinese dissident. Sounds sort of like something a four year old would do. Reddit suggests China rename street with our embassy to “Edward Snowden Lane”.

Neo-Nazi hipsters considered more hateable than regular neo-Nazis or regular hipsters. I feel bad about sharing this article, because it’s clearly one of those “Look at the people who are different than you! Mock them!” type pieces. But to be fair, these people are pretty mockable. And I was tickled by the sentence “In February, Tim and Kevin started Balaclava Kueche, Germany’s first Nazi vegan cooking show.”

I am enjoying Fake Liberal News Site Twitter. The big question is which is more on target – @vauxnews (“The president’s plan to circumvent today’s Supreme Court decision is not just legal, it’s brilliant. And he’s handsome. So, so handsome”) or @salondotcom (“Could this Baptist YouTuber that freaked out over “Ancient Aliens” be the new face of the religious right?”)?

Speaking of Vox, here’s their article on how the New York Times predicted the assassination of Archduke Franz Ferdinand would be good for European peace.

From the Department Of What Now, Motherf@#&kers? : sex differences in mental rotation, a skill generally associated with mathematics ability, are greater in nations with greater gender equality. Offered explanations aren’t bad, but poor nonrandom sampling limits ability to draw many conclusions.

ISIS’ Plan For Global Domination (supposedly). Is it wrong to want the terrorists to win so we can have a country called “Qoqzaz”? Also, I imagine two ISIS members daring each other to try to draw Khurasan bigger and bigger, then laughing and keeping it on the map because they’re not going to achieve global domination anyway.

I was linked to this interesting but hard to believe paper on how the requirement that psychiatrists report homicide threats to the police significantly increased homicide rates, presumably because homicidal people were less willing to talk the problem out with their psychiatrist. I’m doubtful for many reasons – what percent of murderers see psychiatrists, what percent of them would bring up their homicidality even without the ruling, what percent of psychiatrists would be able to treat them effectively, and what percent of homicidal people even know what the laws on mandatory reporting are?

I’ve had some patients ask me the best way to disguise their scars. This is definitely the best way.

Practice Makes One-Third Perfect, The Other Two-Thirds Is Talent. Interesting example of scientific failure here: people found that people who were good at things had practiced longer than people who weren’t and so assumed that lots of practice (rather than talent) led to success. More sophisticated investigation suggests that talent leads to minor success which leads to motivation to continue practicing which leads to lots of practice which leads to success.

We know the recession is officially over because Dubai has started building crazy huge useless buildings again.

Not only is truth stranger than fiction, it has better monsters. Here’s the Black Swallower. Make sure to read the section that tells how known specimens came to be collected.

Sky Kingdom is a Malaysian cult which is best known for having a giant two-story teapot in the middle of their compound.

Telegony is the ancient and medieval idea that a woman’s children could inherit characteristics not only from their father, but from all the woman’s previous sexual partners. It was seriously defended right up until the real mechanisms of genetics were pinned down in the late 19th and early 20th centuries. I wonder how much influence that had on ideas of sexual purity.

From there I did some Wikipedia link-clicking to learn that the Telegony is also the name of a sequel to the Odyssey, and that in fact there is a whole Epic Cycle of which the Odyssey and Iliad are only a part. And it ends with all of Odysseus’ sons hooking up with all of Odysseus’ sexual partners, which I guess isn’t especially weird for a Greek myth.

The latest development in the brave new post-Bitcoin world is crypto-equity. At this point I’ve gone from wanting to praise these inventors as bold libertarian heroes to wanting to drag them in front of a blackboard and make them write a hundred times “I WILL NOT CALL UP THAT WHICH I CANNOT PUT DOWN”

Ozy and I are staying in the Mount Washington Hotel in the White Mountains right now, so here’s a New Hampshire hotel story for you. When the Supreme Court ruled in favor of eminent domain for creating useful public buildings like hotels, the residents of a town in New Hampshire where a Supreme Court justice owned property organized a movement to seize his house and turn it into a hotel called The Lost Liberty Inn.


SSRIs: Much More Than You Wanted To Know

Miri – the person, not the organization – writes about depression. There’s a lot there worth thinking about, but one part caught my eye:

I’m a little tired of being told that SSRIs “don’t work” when they’re part of the reason I didn’t try to off myself four years ago. There is compelling evidence to suggest they do not actually work and there is compelling evidence to suggest that they do actually work, so I’m comfortable saying that the jury’s still out on this one.

I think the jury is less out now than it was a couple of years ago. I think there’s at least kind of a consensus on the data, mixed with a lot of debate over how to express a very complicated reality to the public in a concise way.

And I am going to bypass that debate by just braindumping eight thousand words worth of very complicated reality on you.

The claim that “SSRIs don’t work” or “SSRIs are mostly just placebo” is most commonly associated with Irving Kirsch, a man with the awesome job title of “Associate Director Of The Program For Placebo Studies at Harvard”.

(fun fact: there’s actually no such thing as “Placebo Studies”, but Professor Kirsch’s belief that he directs a Harvard department inspires him to create much higher-quality research.)

In 1998, he published a meta-analysis of 19 placebo-controlled drug trials that suggested that almost all of the benefits of antidepressants were due to the placebo effect. Psychiatrists denounced him, saying that you can choose pretty much whatever studies you want for a meta-analysis.

After biding his time for a decade, in 2008 he struck back with another meta-analysis, this being one of the first papers in all of medical science to take the audacious step of demanding all the FDA’s data through the Freedom of Information Act. Since drug companies are required to report all their studies to the FDA, this theoretically provides a rare and wonderful publication-bias-free data set. Using this set, he found that, although antidepressants did seem to outperform placebo, the effect was not “clinically significant” except “at the upper end of very severe depression”.

This launched a minor war between supporters and detractors. Probably the strongest support he received was a big 2010 meta-analysis by Fournier et al, which found that

The magnitude of benefit of antidepressant medication compared with placebo increases with severity of depression symptoms and may be minimal or nonexistent, on average, in patients with mild or moderate symptoms. For patients with very severe depression, the benefit of medications over placebo is substantial.

Of course, a very large number of antidepressants are given to people with mild or moderate depression. So what now?

Let me sort the debate about antidepressants into a series of complaints:

1. Antidepressants were oversold and painted as having more biochemical backing than was really justified
2. Modern SSRI antidepressants are no better than older tricyclic and MAOI antidepressants, but are prescribed much more because of said overselling
3. There is large publication bias in the antidepressant literature
4. The effect size of antidepressants is clinically insignificant
5. And it only becomes significant in the most severe depression
6. And even the effects found are only noticed by doctors, not the patients themselves
7. And even that unsatisfying effect might be a result of “active placebo” rather than successful treatment
8. And antidepressants have much worse side effects than you have been led to believe
9. Therefore, we should give up on antidepressants (except maybe in the sickest patients) and use psychotherapy instead

1. Antidepressants were oversold and painted as having more biochemical backing than was really justified

Totally true

It is starting to become slightly better known that the standard story – depression is a deficiency of serotonin, antidepressants restore serotonin and therefore make you well again – is kind of made up.

There was never much more evidence for the serotonin hypothesis than that chemicals that increased serotonin tended to treat depression – making the argument that “antidepressants are biochemically justified because they treat the low serotonin that is causing your depression” kind of circular. Saying “Serotonin treats depression, therefore depression is, at root, a serotonin deficiency” is about as scientifically grounded as saying “Playing with puppies makes depressed people feel better, therefore depression is, at root, a puppy deficiency”.

The whole thing became less tenable with the discovery that several chemicals that didn’t increase serotonin were also effective antidepressants – not to mention one chemical, tianeptine, that decreases serotonin. Now the conventional wisdom is that depression is a very complicated disturbance in several networks and systems within the brain, and serotonin is one of the inputs and/or outputs of those systems.

Likewise, a whole bunch of early ’90s claims – that modern antidepressants have no side effects, that they produce miraculous improvements in everyone, that they make you better than well – seem kind of silly now. I don’t think anyone is arguing against the proposition that there was an embarrassing amount of hype that has now been backed away from.

2. Modern SSRI antidepressants are no better than older tricyclic and MAOI antidepressants, but are prescribed much more because of said overselling

First part true, second part less so

Most studies find SSRI antidepressants to be no more effective in treating depression than older tricyclic and MAOI antidepressants. But most studies aren’t really powered to detect such a difference. It seems clear that there aren’t spectacular differences, and hunting for small differences has proven very hard.

If you’re a geek about these sorts of things, you know that a few studies have found non-significant advantages for Prozac and Paxil over older drugs like clomipramine, and marginally-significant advantages for Effexor over SSRIs. But conventional wisdom is that tricyclics can be even more powerful than SSRIs for certain very severe hospitalized depression cases, and a lot of people think MAOIs worked better than anything out there today.

But none of this is very important because the real reason SSRIs are so popular is the side effect profile. While it is an exaggeration to say they have no side effects (see above) they are an obvious improvement over older classes of medication in this regard.

Tricyclics had a bad habit of causing fatal arrhythmias when taken at high doses. This is really really bad in depression, because depressed people tend to attempt suicide and the most popular method of suicide attempt is overdosing on your pills. So if you give depressed people a pill that is highly fatal in overdose, you’re basically enabling suicidality. This alone made the risk-benefit calculation for tricyclics unattractive in a lot of cases. Add in dry mouth, constipation, urinary problems, cognitive impairment, blurry vision, and the occasional tendency to cause heart arrhythmias even when taken correctly, and you have a drug you’re not going to give people who just say they’re feeling a little down.

MAOIs have their own problems. If you’re using MAOIs and you eat cheese, beer, chocolate, beans, liver, yogurt, soy, kimchi, avocados, coconuts, et cetera, et cetera, et cetera, you have a chance of precipitating a “hypertensive crisis”, which is exactly as fun as it sounds. As a result, people who are already miserable and already starving themselves are told they can’t eat like half of food. And once again, if you tell people “Eat these foods with this drug and you die” and a week later the person wants to kill themselves and has some cheese in the house, then you’re back to enabling suicide. There are some MAOIs that get around these restrictions in various clever ways, but they tend to be less effective.

SSRIs were the first class of antidepressants that mostly avoided these problems and so were pretty well-placed to launch a prescribing explosion even apart from being pushed by Big Pharma.

3. There is large publication bias in the antidepressant literature

True, but not as important as some people think

People became more aware of publication bias a couple of years after serious research into antidepressants started, and it’s not surprising that antidepressant trials were a prime target. When this issue rose to scientific consciousness, several researchers tried to avoid the publication bias problem by using only FDA studies of antidepressants. The FDA mandates that its studies be pre-registered and the results reported no matter what they are. This provides a “control group” by which accusations of publication bias can be investigated. The results haven’t been good. From Gibbons et al:

Recent reports suggest that efficacy of antidepressant medications versus placebo may be overstated, due to publication bias and less efficacy for mildly depressed patients. For example, of 74 FDA-registered randomized controlled trials (RCTs) involving 12 antidepressants in 12,564 patients, 94% of published trials were positive whereas only 51% of all FDA registered studies were positive.

Turner et al express the same data a different way:

The FDA deemed 38 of the 74 studies (51%) positive, and all but 1 of the 38 were published. The remaining 36 studies (49%) were deemed to be either negative (24 studies) or questionable (12). Of these 36 studies, 3 were published as not positive, whereas the remaining 33 either were not published (22 studies) or were published, in our opinion, as positive (11) and therefore conflicted with the FDA’s conclusion. Overall, the studies that the FDA judged as positive were approximately 12 times as likely to be published in a way that agreed with the FDA analysis as were studies with nonpositive results according to the FDA (risk ratio, 11.7; 95% confidence interval [CI], 6.2 to 22.0; P<0.001). This association of publication status with study outcome remained significant when we excluded questionable studies and when we examined publication status without regard to whether the published conclusions and the FDA conclusions were in agreement.
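The two quotes are the same numbers seen from different angles, and it is easy to check that they line up. The sketch below uses only the counts quoted above; the variable names are mine, and it assumes the 37 published FDA-positive studies all appeared in print as positive, which is how I read the Turner quote.

```python
# Counts taken from the Turner et al. quote above.
total_studies = 74
fda_positive = 38
fda_positive_published = 37        # "all but 1 of the 38 were published"
fda_nonpositive = 36               # 24 negative + 12 questionable
published_as_not_positive = 3
published_as_positive_anyway = 11  # conflicted with the FDA's conclusion
unpublished_nonpositive = 22

# Everything that made it into a journal, and the part that looked positive there.
published = (fda_positive_published
             + published_as_not_positive
             + published_as_positive_anyway)
looks_positive_in_print = fda_positive_published + published_as_positive_anyway

share_positive_in_print = looks_positive_in_print / published
share_positive_fda = fda_positive / total_studies

# Risk ratio: chance of being published in agreement with the FDA,
# positive studies versus nonpositive studies.
risk_ratio = ((fda_positive_published / fda_positive)
              / (published_as_not_positive / fda_nonpositive))

print(f"published trials: {published}")
print(f"positive in print: {share_positive_in_print:.0%}")
print(f"positive per FDA: {share_positive_fda:.0%}")
print(f"risk ratio: {risk_ratio:.1f}")
```

This recovers the 94% and 51% from Gibbons et al and the 11.7 risk ratio from Turner et al, so the two summaries really are describing one data set.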

The same source tells us about the effect this bias had on effect size:

For each of the 12 drugs, the effect size derived from the journal articles exceeded the effect size derived from the FDA reviews (sign test, P<0.001). The magnitude of the increases in effect size between the FDA reviews and the published reports ranged from 11 to 69%, with a median increase of 32%. A 32% increase was also observed in the weighted mean effect size for all drugs combined, from 0.31 (95% CI, 0.27 to 0.35) to 0.41 (95% CI, 0.36 to 0.45).

I think a lot of this has since been taken on board, and most of the rest of the research I’ll be talking about uses FDA data rather than published data. But as you can see, the overall change in effect size – from 0.31 to 0.41 – is not that terribly large.
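A quick sketch of why "32% bigger" and "not that terribly large" can both be true: the relative increase sounds dramatic, but the absolute gap is a tenth of a standard deviation. The function name is an assumption for illustration; the two effect sizes are the weighted means quoted above.

```python
def relative_increase(before, after):
    """Fractional increase going from `before` to `after`."""
    return (after - before) / before


fda_d = 0.31        # weighted mean effect size from FDA reviews
published_d = 0.41  # weighted mean effect size from journal articles

absolute_gap = published_d - fda_d
print(f"Absolute gap: {absolute_gap:.2f} SD")
print(f"Relative increase: {relative_increase(fda_d, published_d):.0%}")
```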

4. The effect size of antidepressants is clinically insignificant: Depends what you mean by “clinically insignificant”

As mentioned above, when you try to control for publication bias, the effect size of antidepressant over placebo is 0.31.

This number can actually be broken down further. According to McAllister and Williams, who are working off of slightly different data and so get slightly different numbers, the effect size of placebo is 0.92 and the effect size of antidepressants is 1.24, which means antidepressants have a 0.32 SD benefit over placebo. Several different studies get similar numbers, including the Kirsch meta-analysis that started this whole debate.

Effect size is a hard statistic to work with (albeit extremely fun). The guy who invented effect size suggested that 0.2 be called “small”, 0.5 be called “medium”, and 0.8 be called “large”. NICE, a UK health research group, somewhat randomly declared that effect sizes greater than 0.5 be called “clinically significant” and effect sizes less than 0.5 be called “not clinically significant”, but their reasoning was basically that 0.5 was a nice round number, and a few years later they changed their mind and admitted they had no reason behind their decision.

Despite these somewhat haphazard standards, some people have decided that antidepressants’ effect size of 0.3 means they are “clinically insignificant”.

(please note that “clinically insignificant” is very different from “statistically insignificant” aka “has a p-value greater than 0.05.” Nearly everyone agrees antidepressants have a statistically significant effect – they do something. The dispute is over whether they have a clinically significant effect – the something they do is enough to make a real difference to real people)

There have been a couple of attempts to rescue antidepressants by raising the effect size. For example, Horder et al note that Kirsch incorrectly took the difference between the average effect of drugs and the average effect of placebos, rather than the average drug-placebo difference (did you follow that?) When you correct that mistake, the drug-placebo difference rises significantly to about 0.4.

They also note that Kirsch’s study lumps all antidepressants together. This isn’t necessarily wrong. But it isn’t necessarily right, either. For example, his study used both Serzone (believed to be a weak antidepressant, rarely used) and Paxil (believed to be a stronger antidepressant, commonly used). And in fact, by his study, Paxil showed an effect size of 0.47, compared to Serzone’s 0.21. But since the difference was not statistically significant, he averaged them together and said that “antidepressants are ineffective”. In fact, his study showed that Paxil was effective, but when you average it together with a very ineffective drug, the effect disappears. He can get away with this because of the arcana of statistical significance, but by the same arcana I can get away with not doing that.

So right now we have three different effect sizes. 1.2 for placebo + drug, 0.5 for drug alone if we’re being statistically merciful, 0.3 for drug alone if we’re being harsh and letting the harshest critic of antidepressants pull out all his statistical tricks.

The reason effect size is extremely fun is that it allows you to compare effects in totally different domains. I will now attempt to do this in order to see if I can give you an intuitive appreciation for what it means for antidepressants.

Suppose antidepressants were in fact a weight loss pill.

An effect size of 1.2 is equivalent to the pill making you lose 32 lb.

An effect size of 0.5 is equivalent to the pill making you lose 14 lb.

An effect size of 0.3 is equivalent to the pill making you lose 8.5 lb.

Or suppose that antidepressants were a growth hormone pill taken by short people.

An effect size of 1.2 is equivalent to the pill making you grow 3.4 in.

An effect size of 0.5 is equivalent to the pill making you grow 1.4 in.

An effect size of 0.3 is equivalent to the pill making you grow 0.8 in.

Or suppose that antidepressants were a cognitive enhancer to boost IQ. This site gives us some context about occupations.

An effect size of 1.2 is equivalent to the pill making you gain 18 IQ points, ie from the average farm laborer to the average college professor.

An effect size of 0.5 is equivalent to the pill making you gain 7.5 IQ points, ie from the average farm laborer to the average elementary school teacher.

An effect size of 0.3 is equivalent to the pill making you gain 5 IQ points, ie from the average farm laborer to the average police officer.

To me, these kinds of comparisons are a little more revealing than NICE arbitrarily saying that anything below 0.5 doesn’t count. If you could take a pill that helps your depression as much as gaining 1.4 inches would help a self-conscious short person, would you do it? I’d say it sounds pretty good.
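The conversions above are just "effect size times the standard deviation of whatever you're measuring". Here is a minimal sketch of that arithmetic. The IQ standard deviation of 15 is the usual convention; the weight and height standard deviations (28 lb and 2.8 in) are my assumptions, chosen to land close to the figures in the text rather than taken from any source, so the outputs will not match the numbers above exactly.

```python
# Standard deviations used for the conversion.
# IQ: 15 by convention. Weight and height: ASSUMED values for illustration.
ASSUMED_SD = {
    "weight (lb)": 28.0,
    "height (in)": 2.8,
    "IQ (points)": 15.0,
}


def effect_in_units(d, sd):
    """Convert a standardized effect size d into raw units, given the SD."""
    return d * sd


for d in (1.2, 0.5, 0.3):
    converted = ", ".join(
        f"{effect_in_units(d, sd):.1f} {name}" for name, sd in ASSUMED_SD.items()
    )
    print(f"d = {d}: {converted}")
```

Swapping in a different standard deviation for your population of interest is the only change needed to redo the comparison.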

5. The effect of antidepressants only becomes significant in the most severe depression: Everything about this statement is terrible and everyone involved should feel bad

So we’ve already found that saying antidepressants have an “insignificant” effect size is kind of arbitrary. But what about the second part of the claim – that they only have measurable effects in “the most severe depression”?

A lot of depression research uses a test called the HAM-D, which scores depression from 0 (none) to 52 (max). Kirsch found that the effect size of antidepressants increased as HAM-D scores increased, meaning antidepressants become more powerful as depression gets worse. He was only able to find a “clinically significant” effect size (d > 0.5) for people with HAM-D scores greater than 28. People have come up with various different mappings of HAM-D scores to words. For example, the APA says:

(0-7) No depression
(8-13) Mild depression
(14-18) Moderate depression
(19-22) Severe depression
(>=23) Very severe depression

Needless to say, a score of 28 sounds pretty bad.

We saw that Horder et al corrected some statistical deficiencies in Kirsch’s original paper which made antidepressants improve slightly. With their methodology, antidepressants reach our arbitrary 0.5 threshold around HAM-D score 26. Another similar “antidepressants don’t work” study got the number 25.

Needless to say, when anything over 23 is “very severe”, 25 or 26 still sounds pretty bad.

Luckily, people completely disagree on the meanings of basic words! Very Severely Stupid is a cute article on Neuroskeptic that demonstrates that five different people and organizations suggest five different systems for rating HAM-D scores. Bech 1996 calls our 26 cutoff “major”; Funakawa 2007 calls it “moderate”; NICE 2009 calls it “severe”. APA is unique in calling it very severe. NICE’s scale is actually the exact same as the APA scale with every category renamed to sound one level less threatening. Facepalm.

Ghaemi and Vohringer (2011) go further and say that the real problem is that Kirsch is using the standard for depressive symptoms, but that real clinical practice involves depressive episodes. That is, all this “no depression” to “severe” stuff is about whether someone can be diagnosed with depression; presumably the people on antidepressants are definitely depressed and we need a new model of severity to determine just how depressed they are. As they put it:

the authors of the meta-analysis claimed to use the American Psychiatric Association’s criteria for severity of symptoms…in so doing, they ignore the obvious fact that symptoms differ from episodes: the typical major depressive episode (MDE) produced HDRS scores of at least 18 or above. Thus, by using symptom criteria, all MDEs are by definition severe or very severe. Clinicians know that some patients meet MDE criteria and are still able to work; indeed others frequently may not even recognize that such a person is clinically depressed. Other patients are so severe they function poorly at work so that others recognize something is wrong; some clinically depressed patients cannot work at all; and still others cannot even get out of bed for weeks or months on end. Clearly, there are gradations of severity within MDEs, and the entire debate in the above meta-analysis is about MDEs, not depressive symptoms, since all patients had to meet MDE criteria in all the studies included in the meta-analysis (conducted by pharmaceutical companies for FDA approval for treatment of MDEs).

The question, therefore, is not about severity of depressive symptoms, but severity of depressive episodes, assuming that someone meets DSM-IV criteria for a major depressive episode. On that question, a number of prior studies have examined the matter with the HDRS and with other depression rating scales, and the three groupings shown in table 2 correspond rather closely with validated and replicated definitions of mild (HDRS <24), moderate (HDRS 24–28), and severe (HDRS>28) major depressive episodes.

So, depending on whether we use APA criteria or G&V criteria, an HDRS of 23 is either “mild” (G&V) or “very severe” (APA).

Clear as mud? I agree that in one sense this is terrible. But in another sense it’s actually a very important point. Kirsch’s sample was really only “severe” in the context of everyone, both those who were clinically diagnosable with major depression and those who weren’t. When we get to people really having a major depressive episode, a score of 26 to 28 isn’t so stratospheric. But meanwhile:

The APA seem to have ignored the fact that the HAMD did not statistically significantly distinguish between “Severe” and “Moderate” depression anyway (p=0.1)

Oh. That gives us some perspective, I guess. Also, some other people make the opposite critique and say that the HAM-D can’t distinguish very well at the low end. Suppose HAM-Ds less than ten are meaningless and random. This would look a lot like antidepressants not working in mild depression.

Getting back to Ghaemi and Vohringer, they try a different tack and suggest that there is a statistical floor effect. They quite reasonably say that if someone had a HAM-D score of 30, and antidepressants solved 10% of their problem, they would lose 3 HAM-D points, which looks impressive. But if someone had a HAM-D score of 10, and antidepressants (still) solved 10% of their problem, they would only lose 1 HAM-D point, which sounds disappointing. But either way, the antidepressants are doing the same amount of work. If you adjust everything for baseline severity, it’s easy to see that antidepressants here would have the same efficacy in severe and mild depression, even though it doesn’t look that way at first.
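A minimal sketch of the floor-effect argument, under the simplifying assumption (mine, for illustration) that a drug removes a fixed fraction of whatever HAM-D score the patient starts with. The function names are hypothetical.

```python
def points_lost(baseline, fraction_improved):
    """Raw HAM-D points lost if the drug removes a fixed fraction of the score."""
    return baseline * fraction_improved


def baseline_adjusted_change(baseline, raw_points):
    """Improvement expressed as a fraction of where the patient started."""
    return raw_points / baseline


for baseline in (30, 10):
    raw = points_lost(baseline, 0.10)
    adjusted = baseline_adjusted_change(baseline, raw)
    print(f"baseline {baseline}: {raw:.1f} points lost, {adjusted:.0%} of baseline")
```

The raw point drop differs by a factor of three between the two patients, while the baseline-adjusted figure is identical. That is the whole argument in two lines of output.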

I am confused that this works for effect sizes, because I expect effect sizes to be relative to the standard deviation in a sample. However, several important people tell me that it does, and that when you do this Kirsch’s effect size goes from 0.32 to 0.40.

(I think these people are saying the exact same thing, but so overly mathematically that I’ve been staring at it for an hour and I’m still not certain)

More important, Ghaemi and Vohringer say once you do this, antidepressants reach the magic 0.5 number not only in severe depression, but also in moderate depression. However, when I look at this claim closely, almost all the work is done by G&V’s adjusted scale in which Kirsch’s “very severe” corresponds to their “mild”.

(personal aside: I got an opportunity to talk to Dr. Ghaemi about this paper and clear up some of my confusion. Well, not exactly an opportunity to talk about it, per se. Actually, he was supposed to be giving me a job interview at the time. I guess we both got distracted. This may be one of several reasons I do not currently work at Tufts.)

So. In conclusion, everyone has mapped HAM-D numbers into words like “moderate” in totally contradictory ways, such that one person’s “mild” is another person’s “very severe”. Another person randomly decided that we can only call things “clinically significant” if they go above the nice round number of 0.5, then retracted this. So when people say “the effects of antidepressants are only clinically significant in severe depression”, what they mean is “the effects of antidepressants only reach a totally arbitrary number one guy made up and then retracted, in people whose HAM-D score is above whatever number I make up right now.” Depending on what number you choose and what word you make up to describe it, you can find that antidepressants are useful in moderate depression, or severe depression, or super-duper double-dog-severe depression, or whatever.

Science!

6. The beneficial effects of antidepressants are only noticed by doctors, not the patients themselves: Partly true but okay

So your HAM-D score has gone down and you’re no longer officially in super-duper double-dog severe depression anymore. What does that mean for the patient?

There are consistent gripes that antidepressant studies that use patients rating their own mood show less improvement than studies where doctors rate how they think a patient is doing, or standardized tests like the HAM-D.

Some people try to turn this into a conspiracy, where doctors who have somehow broken the double-blinding of studies try to report that patients have done better because doctors like medications and want them to succeed.

The reality is more prosaic. It has been known for forty years that people’s feelings are the last thing to improve during recovery from depression.

This might sound weird – what is depression except people’s feelings? But the answer is “quite a lot”. Depressed people often eat less, sleep more, have less energy, and of course are more likely to attempt suicide. If a patient gets treated with an antidepressant, and they start smiling more and talking more and getting out of the house and are no longer thinking about suicide, their doctor might notice – but the patient herself might still feel really down-in-the-dumps.

I am going to get angry comments from people saying I am declaring psychiatric patients too stupid to notice their own recovery or something like that, but it is a very commonly observed phenomenon. Patients have access to internal feelings which they tend to weight much more heavily than external factors like how much they are able to get done during a day or how many crying spells they have, sometimes so much so that they completely miss these factors. Doctors (or family members, or other outside observers) who don’t see these internal feelings are better able to notice outward signs. As a result, it is pretty universally believed that doctors spot signs of recovery in patients long before the patients themselves think they are recovering. This isn’t just imaginary – it’s found in datasets where the doctors are presumably blinded and with good inter-rater reliability.

Because most antidepressant trials are short, a lot of them reach the point where doctors notice improvement but not the point where patients notice quite as much improvement.

7. The apparent benefits of antidepressant over placebo may be an “active placebo” effect rather than a drug effect: Unlikely

Active placebo is the uncomfortable idea that no study can really have a blind control group because of side effects. That is, sugar pills have no side effects, real drugs generally do, and we all know side effects are how you know that a drug is working!

(there is a counterargument that placebos very often have placebo side effects, but most likely the real drug will at least have more side effects, saving the argument)

The solution is to use active placebo, a drug that has side effects but, as far as anyone knows, doesn’t treat the experimental condition (in this case, depression). The preliminary results from this sort of study don’t look good for antidepressants:

Thomson reviewed 68 double-blind studies of tricyclics that used an inert placebo and seven that used an active placebo (44). He found drug efficacy was demonstrated in 59% of studies that employed inert placebo, but only 14% of those that used active placebo (χ²=5.08, df=1, p=0.02). This appears to demonstrate that in the presence of a side-effect-inducing control condition, placebo cannot be discriminated from drug, thus affirming the null hypothesis.

Luckily, Quitkin et al (2000) solve this problem so we don’t have to:

Does the use of active placebo increase the placebo response rate? This is not the case. After pooling data from those studies in which a judgment could be made about the proportion of responders, it was found that 22% of patients (N=69 of 308) given active placebos were rated as responders. To adopt a conservative stance, one outlier study (50) with a low placebo response rate of 7% (N=6 of 90) was eliminated because its placebo response rate was unusually low (typical placebo response rates in studies of depressed outpatients are 25%–35%). Even after removing this possibly aberrant placebo group, the aggregate response rate was 29% (N=63 of 218), typical of an inactive placebo. The active placebo theory gains no support from these data.

Closer scrutiny suggests that the “failure” of these 10 early studies to find typical drug-placebo differences is attributable to design errors that characterize studies done during psychopharmacology’s infancy. Eight of the 10 studies had at least one of four types of methodological weaknesses: inadequate sample size, inadequate dose, inadequate duration, and diagnostic heterogeneity. The flaws in medication prescription that characterize these studies are outlined in Table 3. In fact, in spite of design measurement and power problems, six of these 10 studies still suggested that antidepressants are more effective than active placebo.

In summary, these reviews failed to note that the active placebo response rate fell easily within the rate observed for inactive placebo, and the reviewers relied on pioneer studies, the historical context of which limits them.

In other words, active placebo research has fallen out of favor in the modern world. Most studies that used active placebo are very old studies that were not very well conducted. Those studies failed to find an active-placebo-vs.-drug difference because they weren’t good enough to do this. But they also failed to find an active-placebo-vs.-inactive-placebo difference. So they provide no support for the idea that active placebos are stronger than inactive placebos in depression and in fact somewhat weigh against it.
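The response rates in the Quitkin passage are easy to recompute, and doing so shows how much one study moves the pooled figure. This is a minimal sketch using only the counts quoted above; the function name is an assumption for illustration.

```python
def response_rate(responders, patients):
    """Fraction of patients rated as responders."""
    return responders / patients


# Counts quoted from Quitkin et al (2000) for active-placebo arms.
all_responders, all_patients = 69, 308
outlier_responders, outlier_patients = 6, 90

pooled = response_rate(all_responders, all_patients)
outlier = response_rate(outlier_responders, outlier_patients)
without_outlier = response_rate(
    all_responders - outlier_responders,
    all_patients - outlier_patients,
)

print(f"All active-placebo arms: {pooled:.0%}")
print(f"Outlier study alone:     {outlier:.0%}")
print(f"Outlier removed:         {without_outlier:.0%}")
```

Dropping the one low study moves the pooled rate from 22% to 29%, which is inside the 25% to 35% range the quote gives as typical for inert placebo.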

8. Antidepressants have much worse side effects than you were led to believe: Depends how bad you were led to believe the side effects were

As discussed in Part 2, the biggest advantage of SSRIs and other new antidepressants over the old antidepressants was their decreased side effect profile. This seems to be quite real. For example, Brambilla finds a relative risk of adverse events on SSRIs only 60% of that on TCAs, p = 0.003 (although there are some conflicting numbers in that paper I’m not really clear about). Montgomery et al 1994 finds that fewer patients stop taking SSRIs than tricyclics (usually a good “revealed preference”-style measure of side effects since sufficiently bad side effects make you stop using the drug).

The charmingly named Cascade, Kalali, and Kennedy (2009) investigated side effect frequency in a set of 700 patients on SSRIs and found the following:

56% decreased sexual functioning
53% drowsiness
49% weight gain
19% dry mouth
16% insomnia
14% fatigue
14% nausea
13% light-headedness
12% tremor

However, it is very important to note that this study was not placebo controlled. Placebos can cause terrible side effects. Anybody who experiments with nootropics knows that the average totally-useless inactive nootropic causes you to suddenly imagine all sorts of horrible things going on with your body, or attribute some of the things that happen anyway (“I’m tired”) to the effects of the pill. It’s not really clear how much of the stuff in this study is placebo effect versus drug effect.

Nevertheless, it is worth mentioning that 34% of patients declare side effects “not at all” or “a little” bothersome, 40% “somewhat” bothersome, and 26% “very” or “extremely” bothersome. That’s much worse than I would have expected.

Aside from the sort of side effects that you expect with any drug, there are three side effects of SSRIs that I consider especially worrisome and worthy of further discussion. These are weight gain, sexual side effects, and emotional blunting.

Weight gain is often listed as one of the most common and debilitating effects of SSRIs. But amusingly, when a placebo-controlled double-blinded study was finally run, SSRIs produced less weight gain than placebo. After a year of pill-taking, people on Prozac had gained 3.1 kg; people on placebo had gained 4.3. There is now some talk of SSRIs as a weak but statistically significant agent for weight loss.

What happened? One symptom of depression is not eating. People get put on SSRIs when they’re really depressed. Then they get better, either because the drugs worked, because of placebo, or just out of regression to the mean. When you go from not eating to eating, you gain weight. In the one-year study, almost everyone’s depression remitted (even untreated depressive episodes rarely last a whole year), so everyone went from a disease that makes them eat less, to remission from that disease, so everyone gained weight.

Sexual side effects are a less sanguine story. Here the direction was opposite: the medical community went from thinking this was a minor problem to finding it near-universal. The problem was that doctors usually just ask “any side effects?”, and off Tumblr, people generally don’t volunteer information about their penis or vagina to a stranger. When they switched to the closed-ended question “Are you having any sexual side effects?”, a lot of people who denied side effects in general suddenly started talking.

Numbers I have heard for the percent of people on SSRIs with sexual side effects include 14, 24, 37, 58, 59, and 70 (several of those come from here). After having read quite a bit of this research, I suspect you’ve got at least a 50-50 chance (they say men are more likely to get them, but they’re worse in women). Of people who develop sexual side effects, 40% say they caused serious distress, 35% some distress, and 25% no distress.

So I think it is fair to say that if you are sexually active, your chances with SSRIs are not great. Researchers investigating the topic suggest people worried about sexual side effects should switch to alternative sexual-side-effect-free antidepressant Serzone. You may remember that as the antidepressant that worked worst in the efficacy studies and brought the efficacy of all the other ones down with it. Also, it causes liver damage. In my opinion, a better choice would be bupropion, another antidepressant which has been found many times not to cause sexual side effects and which may even improve your sex life.

(“Bupropion lacks this side effect” is going to be a common theme throughout this section. Bupropion causes insomnia, decreased appetite, and in certain rare cases of populations at risk, seizures. It is generally a good choice for people who are worried about SSRI side effects and would prefer a totally different set of side effects.)

There is a certain feeling that, okay, these drugs may have very very common, possibly-majority-of-user sexual side effects, but depressed people probably aren’t screwing like rabbits anyway. So after you recover, you can wait the appropriate amount of time, come off the drugs (or switch to a different drug or dose for maintenance) and no harm done.

The situation no longer seems so innocuous. Despite a lack of systematic investigation, there are multiple reports from researchers and clinicians – not to mention random people on the Internet – of permanent SSRI-induced sexual dysfunction that does not remit once the drug is stopped. This is definitely not the norm and as far as we know it is so rare as to be unstudyable beyond the occasional case report.

On the other hand, I have this. I took SSRIs for about five to ten years as a kid, and now I have approximately the pattern of sexual dysfunction associated with SSRIs and consider myself asexual. Because I started the SSRIs too early to observe my sexuality without them, I can’t officially blame the drugs. But I am very suspicious. I feel like this provides moderate anthropic evidence that it is not as rare as everyone thinks.

The last side effect worth looking at is emotional blunting. A lot of people say they have trouble feeling intense emotions (sometimes: any emotions at all) when on SSRIs. Sansone and Sansone (2010) report:

As for prevalence rates, according to a study by Bolling and Kohlenberg, approximately 20 percent of 161 patients who were prescribed an SSRI reported apathy and 16.1 percent described a loss of ambition. In a study by Fava et al, which consisted of participants in both the United States and Italy, nearly one-third on any antidepressant reported apathy, with 7.7 percent describing moderate-to-severe impairment, and nearly 40 percent acknowledged the loss of motivation, with 12.0 percent describing moderate-to-severe impairment.

A practicing clinician working off observation finds about the same numbers:

The sort of emotional “flattening” I have described with SSRIs may occur, in my experience, in perhaps 10-20% of patients who take these medications…I do want to emphasize that most patients who take antidepressant medication under careful medical supervision do not wind up feeling “flat” or unable to experience life’s normal ups and downs. Rather, they find that–in contrast to their periods of severe depression–they are able to enjoy life again, with all its joys and sorrows.

Many patients who experience this side effect note that when you’re depressed, “experiencing all your emotions fully and intensely” is not very high on your list of priorities, since your emotions tend to be terrible. There is a subgroup of depressed patients whose depression takes the form of not being able to feel anything at all, and I worry this effect would exacerbate their problem, but I have never heard this from anyone and SSRIs do not seem less effective in that subgroup, so these might be two different things that only sound alike. A couple of people discussing this issue have talked about how decreased emotions help them navigate interpersonal relationships that otherwise might involve angry fights or horrible loss – which sounds plausible but also really sad.

According to Barnhart et al (2004), “this adverse effect has been noted to be dose-dependent and reversible” – in other words, it will get better if you cut your dose, and go away completely when you stop taking the medication. I have not been able to find any case studies or testimonials by people who say this effect has been permanent.

My own experience was that I did notice this (even before I knew it was an official side effect), that it did go away after a while when I stopped the medications, and that since my period of antidepressant use corresponded with an important period of childhood socialization, I ended up completely unprepared for having normal emotions and had to do a delicate social balancing act while I figured out how to cope with them. Your results may vary.

There is also a large body of research on suicidality as a potential side effect of SSRIs, but this looks like it would require another ten thousand words just on its own, so let’s agree it’s a risk and leave it for another day.

9. Therefore, we should give up on medication and use psychotherapy instead: Makes sense right up until you run placebo-controlled trials of psychotherapy

The conclusion of these studies that claim antidepressants don’t outperform placebo is usually that we should repudiate Big Pharma, toss the pills, and go back to using psychotherapy.

The implication is that doctors use pills because they think they’re much more effective than therapy. But that’s not really true. The conventional wisdom in psychiatry is that antidepressants and psychotherapy are about equally effective.

SSRIs get used more than psychotherapy for the same reason they get used more than tricyclics and MAOIs – not because they’re better but because they have fewer problems. The problem with psychotherapy is you’ve got to get severely mentally ill people to go to a place and talk to a person several times a week. Depressed people are not generally known for their boundless enthusiasm for performing difficult tasks consistently. Also, Prozac costs like 50 cents a pill. Guess how much an hour of a highly educated professional’s time costs? More than 50c, that’s for sure. If they are about equal in effectiveness, you probably don’t want to pay extra and your insurance definitely doesn’t want to pay extra.

Contrary to popular wisdom, it is almost never the doctor pushing pills on a patient who would prefer therapy. If anything it’s more likely to be the opposite.

However, given that we’re acknowledging antidepressants have an effect size of only about 0.3 to 0.5, is it time to give psychotherapy a second look?

No. Using very similar methodology, a team involving Mind The Brain blogger James Coyne found that psychotherapy decreases HAM-D scores by about 2.66, very similar to the 2.7 number obtained by re-analysis of Kirsch’s data on antidepressants. It concludes:

Although there are differences between the role of placebo in psychotherapy and pharmacotherapy research, psychotherapy has an effect size that is comparable to that of antidepressant medications. Whether these effects should be deemed clinically relevant remains open to debate.

Another study by the same team finds psychotherapy has an effect size of 0.22 compared to antidepressants’ 0.3 – 0.5, though no one has tried to check if that difference is statistically significant and this does not give you the right to say antidepressants have “outperformed” psychotherapy.

If a patient has the time, money, and motivation for psychotherapy, it may be a good option – though I would only be comfortable using it as a monotherapy if the depression was relatively mild.

10. Further complications

What if the small but positive effect size of antidepressants wasn’t because they had small positive effects on everyone, but because they had very large positive effects on some people, and negative effects on others, such that it averaged out to small positive effects? This could explain the clinical observations of psychiatrists (that patients seem to do much better on antidepressants) without throwing away the findings of researchers (that antidepressants have only small benefits over placebo) by bringing in the corollary that some psychiatrists notice some patients doing poorly on antidepressants and stop them in those patients (which researchers of course would not do).

This is the claim of Gueorguieva and Krystal 2011, who used “growth modeling” to analyze seven studies of new-generation-antidepressant Cymbalta and found statistically significant differences between two “trajectories” for the drug, but not for placebo. 66% of people were in the “responder” trajectory and outperformed placebo by 6 HAM-D points (remember, previous studies estimated HAM-D benefits over placebo at about 2.7). 33% of people were nonresponders and did about 6 HAM-D points worse than placebo. Average it out, and people did about 3 HAM-D points better on drug than placebo, pretty close to the previous 2.7 point estimate.
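The "average it out" step is a share-weighted mean of the two trajectories. Here is a minimal sketch of that calculation using the rounded figures in the paragraph above, taken at face value. The function names are hypothetical, and the paper's exact shares and effects may differ from these round numbers, so treat the output as a check on the arithmetic rather than on the study.

```python
def mixture_mean(groups):
    """Share-weighted mean effect. `groups` is a list of (share, effect) pairs."""
    total_share = sum(share for share, _ in groups)
    return sum(share * effect for share, effect in groups) / total_share


def responder_share_needed(target_mean, effect):
    """Responder share p that solves p*effect - (1 - p)*effect = target_mean."""
    return (target_mean / effect + 1) / 2


# Rounded figures from the text: (share of patients, HAM-D points vs placebo).
text_figures = [(0.66, 6.0), (0.33, -6.0)]

print(f"Weighted mean benefit: {mixture_mean(text_figures):.1f} HAM-D points")
print(f"Responder share needed for a 3-point mean: {responder_share_needed(3.0, 6.0):.0%}")
```

With a two-thirds to one-third split and symmetric 6-point effects, the mean comes out near 2 points rather than 3; a 3-point mean at plus or minus 6 points would need about three quarters of patients in the responder trajectory. Either way the result stays in the same neighborhood as the earlier 2.7 estimate.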

I don’t know enough about growth modeling to be sure that the researchers didn’t just divide the subjects into two groups based on treatment efficacy and say “Look! The subsection of the population whom we selected for doing well did well!” but they use many complicated statistics words throughout the study that I think are supposed to indicate they’re not doing this.

If true, this is very promising. It means psychiatrists who are smart enough to notice people getting worse on antidepressants can take them off (or switch to another class of medication) and expect the remainder to get much, much better. I await further research with this methodology.

What if there were actually no such thing as the placebo effect? I know dropping this in around the end of an essay that assumes 75% of gains related to antidepressants are due to the placebo effect is a bit jarring, but it is the very-hard-to-escape conclusion of Hróbjartsson and Gøtzsche’s meta-analysis on placebo. They find that three-armed studies – ie those that have a no-treatment group, a placebo-treatment group, and a real-drug-treatment group – rarely find much of a difference between no-treatment and placebo. This was challenged by Wampold et al here and here, but defended against those challenges by the long-name-Scandinavian-people here. Kirsch, who between all his antidepressant work is still Associate Director of Placebo Studies, finds here that 75% of the apparent placebo effect in antidepressant studies is probably a real placebo effect, but his methodology is a valiant attempt to make the most out of a total lack of data rather than a properly-directed study per se.

If placebo pills don’t do much, what explains the vast improvements seen in both placebo and treatment groups in antidepressant trials? It could be the feeling of cared-for-ness and special-ness of getting to see a psychiatrist and talk with her about your problems, and the feeling of getting-to-contribute-something you get from participating in a scientific study. Or it could just be regression to the mean – most people start taking drugs when they feel very depressed, and at some point you have nowhere to go but up. Most depression gets better after six months or so – which is a much longer period than the six week length of the average drug trial, but maybe some people only volunteered for the study four months and two weeks after their depression started.

If Hróbjartsson and Gøtzsche were right, and Kirsch and the psychiatric establishment wrong, what would be the implications? Well, the good implication is that we no longer have to worry about problem 7 – that antidepressants are merely an active placebo – since active placebos shouldn’t do anything. That means we can be more confident they really work. The more complicated implication is that psychiatrists lose one excuse for asking people to take the drugs – “Sure, the drug effect may be small, but the placebo effect is so strong that it’s still worth it.” I don’t know how many psychiatrists actually think this way, but I sometimes think this way.

What if the reason people have so much trouble finding good effects from antidepressants is that they’re giving the medications wrong? Psychiatric Times points out that:

The Kirsch meta-analysis looked only at studies carried out before 1999. The much-publicized Fournier study examined a total of 6 antidepressant trials (n=718) using just 2 antidepressants, paroxetine and imipramine. Two of the imipramine studies used doses that were either subtherapeutic (100 mg/day) or less than optimal (100 to 200 mg/day)

What if we’ve forgotten the most important part? Antidepressants are used not only to treat acute episodes of depression, but to prevent them from coming back (maintenance therapy). This they apparently do very well, and I have seen very few studies that attempt to call this effect into question. Although it is always possible that someone will find the same kind of ambiguity around maintenance antidepressant treatment as now clouds acute antidepressant treatment, so far as I know this has not happened.

What if we don’t understand what’s going on with the placebo effect in our studies? Placebo effect has consistently gotten stronger over the past few decades, such that the difference between certain early tricyclic studies (which often found strong advantages for the medication) and modern SSRI studies (which often find only weak advantages for the medication) is not weaker medication effect, but stronger placebo effect (that is, if medication always has an effect of 10, but placebo goes from 0 to 9, apparent drug-placebo difference gets much lower). Wired has a good article on this. Theories range from the good – drug company advertising and increasing prestige and awareness of psychiatry have raised people’s expectations of psychiatric drugs – to the bad – increasing scientific competence and awareness have improved blinding and other facets of trial design – to the ugly – modern studies recruit paid participants with advertisements, so some unscrupulous people may be entering studies and then claiming to get better, hoping that this sounds sufficiently like the outcome the researchers want that everyone will be happy and they’ll get their money on schedule.
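The parenthetical arithmetic above can be made concrete with a toy calculation. This is only an illustration on the same made-up scale (medication arm fixed at 10, placebo response rising from 0 to 9); the function name is hypothetical and none of these numbers come from a real trial.

```python
def apparent_advantage(drug_arm_improvement, placebo_arm_improvement):
    """Drug-minus-placebo difference, which is what a trial actually reports."""
    return drug_arm_improvement - placebo_arm_improvement


# Illustrative scale from the text: the medication arm always improves by 10.
DRUG_ARM_IMPROVEMENT = 10

# Placebo response creeping upward over the decades (made-up steps).
advantages = {
    placebo: apparent_advantage(DRUG_ARM_IMPROVEMENT, placebo)
    for placebo in (0, 3, 6, 9)
}

for placebo, advantage in advantages.items():
    print(f"placebo response {placebo}: apparent advantage {advantage}")
```

The medication arm never changes in this sketch; only the comparison does, which is the whole point.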

If placebos are genuinely getting better because of raised expectations, that’s good news for doctors and patients but bad news for researchers and drug companies. The patient will be happy because they get better no matter how terrible a prescribing decision the doctor makes; the doctor will be happy because they get credit. But for researchers and drug companies, it means it’s harder to prove a difference between drug and placebo in a study. You can invent an excellent new drug and still have it fail to outperform placebo by very much if everyone in the placebo group improves dramatically.

Conclusion

An important point I want to start the conclusion section with: no matter what else you believe, antidepressants are not literally ineffective. Even the most critical study – Kirsch 2008 – finds antidepressants to outperform placebo with p < .0001 significance.

An equally important point: everyone except those two Scandinavian guys with the long names agrees that, if you count the placebo effect, antidepressants are extremely impressive. The difference between a person who gets an antidepressant and a person who gets no treatment at all is like night and day.

The debate takes place within the bounds set by those two statements.

Antidepressants give a very modest benefit over placebo. Whether this benefit is so modest as to not be worth talking about depends on what level of benefits you consider so modest as to not be worth talking about. If you are as depressed as the average person who participates in studies of antidepressants, you can expect an antidepressant to have an over-placebo-benefit with an effect size of 0.3 to 0.5. That’s the equivalent of a diet pill that gives you an average weight loss of 9 to 14 pounds, or a growth hormone that makes you grow on average 0.8 to 1.4 inches.
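To see where conversions like these come from: a standardized effect size is a difference measured in standard deviations, so turning it into pounds or inches just means multiplying by the standard deviation of the thing being measured. A rough sketch follows. The standard deviations in it (29 pounds for weight, 2.8 inches for height) are my assumptions, back-solved so the output roughly matches the figures above; they are not taken from any cited source.

```python
def effect_in_units(effect_size, standard_deviation):
    """Convert a standardized effect size into raw units of the measure."""
    return effect_size * standard_deviation


# ASSUMPTIONS: back-solved from the analogy in the text, not measured values.
ASSUMED_WEIGHT_SD_LB = 29.0
ASSUMED_HEIGHT_SD_IN = 2.8

conversions = []
for effect_size in (0.3, 0.5):
    pounds = effect_in_units(effect_size, ASSUMED_WEIGHT_SD_LB)
    inches = effect_in_units(effect_size, ASSUMED_HEIGHT_SD_IN)
    conversions.append((effect_size, pounds, inches))
    print(f"d = {effect_size}: about {pounds:.1f} lb or {inches:.2f} in")
```

Swap in different standard deviations and the pounds and inches move proportionally, which is why the analogy is only as good as the spread you assume.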

You may be able to get more than that if you focus on the antidepressants, like paroxetine and venlafaxine, that perform best in studies, but we don’t have the statistical power to say that officially. It may be the case that most people who get antidepressants do much better than that but a few people who have paradoxical negative responses bring down the average, but right now this result has not been replicated.

This sounds moderately helpful and probably well worth it if the pills are cheap (which generic versions almost always are) and you are not worried about side effects. Unfortunately, SSRIs do have some serious side effects. Some of the supposed side effects, like weight gain, seem to be mostly mythical. Others, like sexual dysfunction, seem to be very common and legitimately very worrying. You can avoid most of these side effects by taking other antidepressants like bupropion, but even these are not totally side-effect free.

Overall I think antidepressants come out of this definitely not looking like perfectly safe miracle drugs, but as a reasonable option for many people with moderate (aka “mild”, aka “extremely super severe”) depression, especially if they understand the side effects and prepare for them.

Social Justice And Words, Words, Words

[Content note: hostility toward social justice, discussion of various prejudices]

“Words! Words! Words! I’m so sick of words! I get words all day through. First from him, now from you. Is that all you blighters can do?” – Eliza Doolittle

I.

I recently learned there is a term for the thing social justice does. But first, a png from racism school dot tumblr dot com.

So, it turns out that privilege gets used perfectly reasonably. All it means is that you’re interjecting yourself into other people’s conversations and demanding their pain be about you. I think I speak for all straight white men when I say that sounds really bad and if I was doing it I’m sorry and will try to avoid ever doing it again. Problem solved, right? Can’t believe that took us however many centuries to sort out.

A sinking feeling tells me it probably isn’t that easy.

In the comments section of the last disaster of a social justice post on my blog, someone started talking about how much they hated the term “mansplaining”, and someone else popped in to – ironically – explain what “mansplaining” was and why it was a valuable concept that couldn’t be dismissed so easily. Their explanation was lucid and reasonable. At this point I jumped in and commented:

I feel like every single term in social justice terminology has a totally unobjectionable and obviously important meaning – and then is actually used a completely different way.

The closest analogy I can think of is those religious people who say “God is just another word for the order and beauty in the Universe” – and then later pray to God to smite their enemies. And if you criticize them for doing the latter, they say “But God just means there is order and beauty in the universe, surely you’re not objecting to that?”

The result is that people can accuse people of “privilege” or “mansplaining” no matter what they do, and then when people criticize the concept of “privilege” they retreat back to “but ‘privilege’ just means you’re interrupting women in a women-only safe space. Surely no one can object to criticizing people who do that?”

…even though I get accused of “privilege” for writing things on my blog, even though there’s no possible way that could be “interrupting” or “in a women-only safe space”.

When you bring this up, people just deny they’re doing it and call you paranoid.

When you record examples of yourself and others getting accused of privilege or mansplaining, and show people the list, and point out that exactly zero percent of them are anything remotely related to “interrupting women in a women-only safe space” and one hundred percent are “making a correct argument that somebody wants to shut down”, then your interlocutor can just say “You’re deliberately only engaging with straw-man feminists who don’t represent the strongest part of the movement, you can’t hold me responsible for what they do” and continue to insist that anyone who is upset by the uses of the word “privilege” just doesn’t understand that it’s wrong to interrupt women in safe spaces.

I have yet to find a good way around this tactic.

My suspicion about the png from racism school dot tumblr dot com is that the statements on the top show the ways the majority of people will encounter “privilege” actually being used, and the statements on the bottom show the uncontroversial truisms that people will defensively claim “privilege” means if anyone calls them on it or challenges them. As such it should be taken as a sort of weird Rosetta Stone of social justicing, and I can only hope that similarly illustrative explanations are made of other equally charged terms.

Does that sound kind of paranoid? I freely admit I am paranoid in this area. But let me flesh it out with one more example.

Everyone is a little bit racist. We know this because there is a song called “Everyone’s A Little Bit Racist” and it is very cute. Also because most people score poorly on implicit association tests, because a lot of white people will get anxious if they see a black man on a deserted street late at night, and because if you prime people with traditionally white versus traditionally black names they will answer questions differently in psychology experiments. It is no shame to be racist as long as you admit that you are racist and you try your best to resist your racism. Everyone knows this.

Donald Sterling is racist. We know this because he made a racist comment in the privacy of his own home. As a result, he was fined $2.5 million, banned for life from an industry he’s been in for thirty-five years, banned from ever going to basketball games, forced to sell his property against his will, publicly condemned by everyone from the President of the United States on down, denounced in every media outlet from the national news to the Podunk Herald-Tribune, and got people all over the Internet gloating about how pleased they are that he will die soon. We know he deserved this, because people who argue he didn’t deserve this were also fired from their jobs. He deserved it because he was racist. Everyone knows this.

So.

Everybody is racist.

And racist people deserve to lose everything they have and be hated by everyone.

This seems like it might present a problem. Unless of course you plan to be the person who gets to decide which racists lose everything and get hated by everyone, and which racists are okay for now as long as they never cross you in any way.

Sorry, there’s that paranoia again.

Someone will argue I am equivocating between two different uses of “racist”. To which I would respond that this is exactly the point. I don’t know if racism school dot tumblr dot com has a Rosetta Stone with Donald Sterling on the top and somebody taking the Implicit Association Test on the bottom. But I think there is a strain of the social justice movement which is very much about abusing this ability to tar people with extremely dangerous labels that they are not allowed to deny, in order to further their political goals.

II.

I started this post by saying I recently learned there is a term for the thing social justice does. A reader responding to my comment above pointed out that this tactic had been described before in a paper, under the name “motte-and-bailey doctrine”.

The paper was critiquing post-modernism, an area I don’t know enough about to determine whether or not their critique was fair. It complained that post-modernists sometimes say things like “reality is socially constructed”. There’s an uncontroversial meaning here – we don’t experience the world directly, but through the categories and prejudices implicit to our society. For example, I might view a certain shade of bluish-green as blue, and someone raised in a different culture might view it as green. Okay. Then post-modernists go on to say that if someone in a different culture thinks that the sun is light glinting off the horns of the Sky Ox, that’s just as real as our own culture’s theory that the sun is a mass of incandescent gas, a great big nuclear furnace. If you challenge them, they’ll say that you’re denying reality is socially constructed, which means you’re clearly very naive and think you have perfect objectivity and the senses perceive reality directly.

The writers of the paper compare this to a form of medieval castle, where there would be a field of desirable and economically productive land called a bailey, and a big ugly tower in the middle called the motte. If you were a medieval lord, you would do most of your economic activity in the bailey and get rich. If an enemy approached, you would retreat to the motte and rain down arrows on the enemy until they gave up and went away. Then you would go back to the bailey, which is the place you wanted to be all along.

By this metaphor, statements like “God is an extremely powerful supernatural being who punishes my enemies” or “The Sky Ox theory and the nuclear furnace theory are equally legitimate” or “Men should not be allowed to participate in discussions about gender” are the bailey – not defensible at all, but if you can manage to hold them you’ve got it made.

Statements like “God is just the order and love in the universe” and “No one perceives reality perfectly directly” and “Men should not interject into safe spaces for women” are the motte – extremely defensible, but useless.

As long as nobody’s challenging you, you spend time in the bailey reaping the rewards of occupying such useful territory. As soon as someone challenges you, you retreat to the impregnable motte and glare at them until they get annoyed and go away. Then you go back to the bailey.

This is a metaphor that only historians of medieval warfare could love, so maybe we can just call the whole thing “strategic equivocation”, which is perfectly clear without the digression into feudal fortifications.

III.

I probably still sound paranoid. So let me point out something I think the standard theory fails to explain, but my theory explains pretty well.

Why can’t social justice terms apply to oppressed groups?

Like, even bringing this up freaks people out. There is no way to get a quicker reaction from someone in social justice than to apply a social justice term like “privilege” or “racist” to a group that isn’t straight/white/male. And this is surprising.

If “privilege” just means “interjecting yourself into other people’s conversations”, this seems like something that women could do as well as men. Like, let’s say that a feminist woman posts a thoughtful comment to this post, and I say “Thanks for your input, but I was actually just trying to explain things to my non-feminist male friends, I’d prefer you not interject here.” Isn’t it possible she might continue to argue, and so be interjecting herself into another person’s conversation?

Or suppose “privilege” instead just means a cute story about a dog and a lizard, in which different people have trouble understanding each other’s experiences and appreciating the amount of pain they can be causing. I know a lot of men who are scared of being Forever Alone but terrified to ask women out, and I feel their pain and most of my male friends feel their pain. Yet a lot of the feminists I talk to have this feeling that this is entirely about how they think they own women’s bodies and are entitled to sex, and from their experience as attractive women it’s easy to get dates and if you can’t it’s probably because you’re a creep or not trying hard enough. This seems to me to be something of a disconnect and an underappreciation of the pain of others, of exactly the dog-lizard variety.

There are as many totally innocuous and unobjectionable definitions of “privilege” as there are people in the social justice movement, but they generally share something in common – take them at face value, and the possibility of women sometimes showing privilege toward men is so obvious as to not be worth mentioning.

Yet if anyone mentions it in real life, they are likely to have earned themselves a link to an Explanatory Article. Maybe 18 Reasons Why The Concept Of Female Privilege Is Insane. Or An Open Letter To The Sexists Who Think Female Privilege Is A Thing. Or The Idea Of Female Privilege – It Isn’t Just Wrong, It’s Dangerous. Or the one on how there is no female privilege, just benevolent sexism. Or That Thing You Call Female Privilege Is Actually Just Whiny Male Syndrome. Or Female Privilege Is Victim Blaming, which helpfully points out that people who talk about female privilege “should die in a fire” and begins “we need to talk, and no, not just about the fact that you wear fedoras and have a neck beard.”

It almost seems like you have touched a nerve. But why should there be a nerve here?

As further confirmation that we are on to something surprising, note also the phenomenon of different social justice groups debating, with desperation in their eyes, which ones do or don’t have privilege over one another.

If you are the sort of person who likes throwing rocks at hornet nests, ask anyone in social justice whether trans men (or trans women) have male privilege. You end up in places like STFU TRANSMISOGYNIST TRANS FOLKS or Cis Privilege Is Just A Tenet Of Male Privilege or On Trans People And The Male Privilege Accusation or the womyn-born-womyn movement or Against The Cisgender Privilege List or How Misogyny Hurts Trans Men: We Do Sometimes Have Male Privilege But There Are More Important Things To Talk About Here.

As far as I can tell, the debate is about whether trans women are more privileged than cis women, because they have residual male privilege from the period when they presented as men, or less privileged than cis women, because they are transsexual – plus a more or less symmetrical debate on the trans man side. The important thing to notice is that every group considers it existentially important to prove that they are less privileged than the others, and they do it with arguments like (from last link) “all examples of cis privilege are really male privileges that are not afforded to women, or are instances of resistance to trans politics. I call it patriarchy privilege when something like an unwillingness to redefine one’s own sexuality to include males is seen is labeled as offensive.”

And the trans male privilege argument is one of about seven hundred different vicious disputes in which everyone is insisting other people have more privilege than they do, fighting as if their lives depended on it.

The question here: since privilege is just a ho-hum thing about how you shouldn’t interject yourself into other people’s conversations, or something nice about dogs and lizards – but definitely not anything you should be ashamed to have or anything which implies any guilt or burden whatsoever – why are all the minority groups who participate in communities that use the term so frantic to prove they don’t have it?

We find the same unexpected pattern with racism. We all know everyone is racist, because racism just means you have unconscious biases and expectations. Everyone is a little bit racist.

People of color seem to be part of “everyone”, and they seem likely to have the same sort of in-group identification as all other humans. But they are not racist. We know this because of articles that say things like “When white people complain about reverse racism, they are complaining about losing their PRIVILEGE” and admit that “the dictionary is wrong” on this matter. Or those saying whites calling people of color racist “comes from a lack of understanding of the term, through ignorance or willful ignorance and hatred”. Or those saying that “when white people complain about experiencing reverse racism, what they’re really complaining about is losing out on or being denied their already existing privileges.” Why Are Comments About White People Not Racist, Can Black People Be Racist Toward White People? (spoiler: no), Why You Can’t Be Racist To White People, et cetera et cetera.

All of these sources make the same argument: racism means structural oppression. If some black person beats up some white person just because she’s white, that might be unfortunate, it might even be “racially motivated”, but because they’re not acting within a social structure of oppression, it’s not racist. As one of the bloggers above puts it:

Inevitably, here comes a white person either claiming that they have a similar experience because they grew up in an all black neighborhood and got chased on the way home from school a few times and OMG THAT IS SO RACIST and it is the exact same thing, or some other such bullshittery, and they expect that ignorance to be suffered in silence and with respect. If you are that kid who got chased after school, that’s horrible, and I feel bad for you…But dudes, that shit is not racism.

I can’t argue with this. No, literally, I can’t argue with this. There’s no disputing the definitions of words. If you say that “racism” is a rare species of nocturnal bird native to New Guinea which feeds upon morning dew and the dreams of young children, then all I can do is point out that the dictionary and common usage both disagree with you. And the sources I cited above have already admitted that “the dictionary is wrong” and “no one uses the word racism correctly”.


Source: Somebody who probably doesn’t realize they’ve just committed themselves to linguistic prescriptivism

Actually, I suppose one could escape a hostile dictionary and public by appealing to the original intent of the person who invented the word, but the man who invented the word “racism” was an activist for the forced assimilation of Indians who was known to say things like “Some say that the only good Indian is a dead one. In a sense, I agree with the sentiment, but only in this: that all the Indian there is in the race should be dead. Kill the Indian in him, and save the man.” My guess is that this guy was not totally on board with dismantling structures of oppression.

So we have a case where original coinage, all major dictionaries, and the overwhelming majority of common usage all define “racism” one way, and social justice bloggers insist with astonishing fervor that way is totally wrong and it must be defined another. One cannot argue definitions, but one can analyze them, so you have to ask – whence the insistence that racism have the structural-oppression definition rather than the original and more commonly used one? Why couldn’t people who want to talk about structural oppression make up their own word, thus solving the confusion? Even if they insisted on the word “racism” for their new concept, why not describe the state of affairs as it is: “The word racism can mean many things to many people, and I suppose a group of black people chasing a white kid down the street waving knives and yelling ‘KILL WHITEY’ qualifies by most people’s definition, but I prefer to idiosyncratically define it my own way, so just remember that when you’re reading stuff I write”? Or why not admit that this entire dispute is pointless and you should try to avoid being mean to people no matter what word you call the meanness by?

And how come this happens with every social justice word? How come the intertubes are clogged with pages arguing that blacks cannot be racist, that women cannot have privilege, that there is no such thing as misandry, that you should be ashamed for even thinking the word cisphobia? Who the heck cares? This would never happen in any other field. No doctor ever feels the need to declare that if we talk about antibacterial drugs we should call bacterial toxins “antihumanial drugs”. And if one did, the other doctors wouldn’t say YOU TAKE THAT BACK YOU PIECE OF GARBAGE ONLY HUMANS CAN HAVE DRUGS THIS IS A FALSE EQUIVALENCE BECAUSE BACTERIA HAVE INFECTED HUMANS FOR HUNDREDS OF YEARS BUT HUMANS CANNOT INFECT BACTERIA, they would just be mildly surprised at the nonstandard terminology and continue with their normal lives. The degree to which substantive arguments have been replaced by arguments over what words we are allowed to use against which people is, as far as I know, completely unique to social justice. Why?

IV.

And so we return to my claim from earlier:

I think there is a strain of the social justice movement which is entirely about abusing the ability to tar people with extremely dangerous labels that they are not allowed to deny, in order to further their political goals.

If racism school dot tumblr dot com and the rest of the social justice community are right, “racism” and “privilege” and all the others are innocent and totally non-insulting words that simply point out some things that many people are doing and should try to avoid.

If I am right, “racism” and “privilege” and all the others are exactly what everyone loudly insists they are not – weapons – and weapons all the more powerful for the fact that you are not allowed to describe them as such or try to defend against them. The social justice movement is the mad scientist sitting at the control panel ready to direct them at whomever she chooses. Get hit, and you are marked as a terrible person who has no right to have an opinion and who deserves the same utter ruin and universal scorn as Donald Sterling. Appease the mad scientist by doing everything she wants, and you will be passed over in favor of the poor shmuck to your right and live to see another day. Because the power of the social justice movement derives from their control over these weapons, their highest priority should be to protect them, refine them, and most of all prevent them from falling into enemy hands.

If racism school dot tumblr dot com is right, people’s response to words like “racism” and “privilege” should be accepting them as a useful part of communication that can if needed also be done with other words. No one need worry too much about their definitions except insofar as it is unclear what someone meant to say. No one need worry about whether the words are used to describe them personally, except insofar as their use reveals states of the world which are independent of the words used.

If I am right, then people’s response to these words should be a frantic game of hot potato where they attack like a cornered animal against anyone who tries to use the words on them, desperately try to throw them at somebody else instead, and dispute the definitions like their lives depend on it.

And I know that social justice people like to mock straight white men for behaving in exactly that way, but man, we’re just following your lead here.

Suppose the government puts a certain drug in the water supply, saying it makes people kinder and more aware of other people’s problems and has no detrimental effects whatsoever. A couple of conspiracy nuts say it makes your fingers fall off one by one, but the government says that’s ridiculous, it’s just about being more sensitive to other people’s problems which of course no one can object to. However, government employees are all observed drinking bottled water exclusively, and if anyone suggests that government employees might also want to take the completely innocuous drug and become kinder, they freak out and call you a terrorist and a shitlord and say they hope you die. If by chance you manage to slip a little bit of tap water into a government employee’s drink, and he finds out about it, he runs around shrieking like a banshee and occasionally yelling “AAAAAAH! MY FINGERS! MY PRECIOUS FINGERS!”

At some point you might start to wonder whether the government was being entirely honest with you.

This is the current state of my relationship with social justice.

List Of The Passages I Highlighted In My Copy Of “The Two-Income Trap”

- but which didn’t fit naturally into the review.

Today’s bankrupt families are deeper in debt than their counterparts just twenty years earlier, and their overall financial picture – assets and debts – is worse. In 1981, the median family filing for bankruptcy owed 80% of total annual income in credit card and other nonmortgage debts; by 2001, that figure had nearly doubled to 150% of annual income.

One of the better parts of the book was its busting the myth that people use bankruptcy as an “easy way out” or that they’re declaring it willy-nilly. People are waiting much longer and trying much harder to avoid it now than a generation ago.

Many commenters seem concerned that the families filing for bankruptcy are not sufficiently contrite. Democratic Senator Patricia Murray from Washington argues that the Senate should make it a priority “to recapture the stigma associated with a bankruptcy filing”. The idea that they do not feel bad enough about their bankruptcy filings would have come as a shock to most of the families who filed…In our research, several mothers were willing to talk with us only on the condition that we not use the word “bankruptcy” during the telephone interview for fear that a child might pick up the extension phone and hear the dreaded word. Some said that just hearing the word still makes them cry, and they asked us to refer simply to “the event”. More than 80 percent of the families we interviewed reported that they would be “embarrassed” or “very embarrassed” if their families, friends, or neighbors learned of their bankruptcy.

Another quote along the same lines.

The odds that a worker will suffer an involuntary job loss have increased by 28% since the 1970s. Growing job insecurity has been hard on single-income families, who now face a 28% higher chance that the breadwinner will lose his job. But for today’s dual-income family, the numbers are doubly grim, as each spouse faces a higher likelihood of a job layoff. We estimate that in a single year, roughly 6.3% of dual-income families – one out of every sixteen – will receive a pink slip. That means that a family today with both husband and wife in the workforce is approximately two and a half times more likely to face a job loss than a single-income family of a generation ago.

I am too young to have strong opinions on How Things Have Changed. But I remember that back in the 90s, there were a lot of articles about The Layoffs Crisis and how layoffs were A Sign Of The Decline Of America. And now the understanding that people get laid off or downsized a lot is such an accepted part of everyday life that it seems weird that it was once a news item of approximately the same concern level as global warming or immigration. The people ten years younger than I am are going to have no idea that there was once hand-wringing over it, or that it is even the sort of thing over which hands could be wrung.
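The numbers in that quote hang together better than I expected. Here is a minimal sketch of the arithmetic. The independence assumption is mine, not the book's, and the variable names are made up for illustration: if each spouse's layoff risk is independent, the 6.3% dual-income figure implies a per-worker risk, and undoing the 28% increase gives the single-earner risk of a generation ago.

```python
import math

# Figures quoted from the book.
dual_income_rate = 0.063   # chance a dual-income family gets a pink slip in a year
increase = 0.28            # rise in each worker's layoff risk over a generation

# ASSUMPTION (mine, not the book's): the two spouses' layoffs are independent,
# so P(at least one layoff) = 1 - (1 - p)^2. Solve that for the per-worker risk p.
per_worker_today = 1 - math.sqrt(1 - dual_income_rate)

# Undo the 28% increase to estimate the single earner's risk a generation ago.
per_worker_then = per_worker_today / (1 + increase)

# Compare today's dual-income family with yesterday's single-income family.
ratio = dual_income_rate / per_worker_then

print(f"per-worker risk today: {per_worker_today:.1%}")
print(f"single-earner risk a generation ago: {per_worker_then:.1%}")
print(f"dual-income today vs single-income then: {ratio:.2f}x")
```

Under that assumption the ratio comes out at about two and a half, which matches the quote. If spouses' layoffs are correlated (same employer, same local economy), the implied per-worker risk would be different.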

Make no mistake. Financial distress is a problem for both men and women. But we do not want to leave the impression that these phenomena are entirely gender-neutral. They are not. Mothers are 35% more likely than childless homeowners to lose their homes, three times more likely than men without children to go bankrupt, and seven times more likely to head up the family after a divorce.

So a while ago I recommended everyone be extraordinarily paranoid about feminist statistics. People gave me a lot of flak over that, and Jeff K wrote a blog post where he did some tests and said he found that feminist statistics were no worse than anyone else’s. I acknowledge the plausibility of his viewpoint – and yet extraordinary paranoia about feminist statistics has, at least for me, been the gift that keeps on giving. I suspect Jeff and everyone else reading that paragraph did what I very nearly did – assume it supported the conclusion it said it was supporting. But since I am extraordinarily paranoid, I made sure to re-read it and double-check. And so I noticed that it strongly implies there is a difference between women and men – but then it very deliberately avoids making the comparison. Mothers (ie women with children) are more likely than people without children (of either gender) to lose their home. Mothers are more likely than men without children to go bankrupt. There is no comparison between mothers and women without children, nor between mothers and fathers. I expect that if such a comparison showed any difference at all, Warren would have made it, rather than strongly imply she was doing so but in fact making distractor comparisons. The only place where women may be compared to an appropriate category of men – and even here it is left very vague – is in the last statistic, about being more likely to head the family after a divorce. But this is just the point that women get the children more often after divorce – which is not exactly the sort of gender disparity I feel we were promised.

Research shows that on average, a husband is three times more likely than a wife to take primary responsibility for managing the family’s money. But as a couple sinks into financial turmoil, this responsibility tends to shift. As families fall behind on their bills, it is wives who roll up their sleeves and do what must be done…Among couples who seek credit counseling or file for bankruptcy, the split over who was responsible for dealing with the bills was exactly reversed from that of secure families: three-quarters of the wives were exclusively responsible for trying to extract their families from their financial quagmire.

Okay, this one doesn’t even require extraordinary paranoia to start noticing alternative interpretations.

In the early 1970s, not only did most Americans believe that the public schools were functioning reasonably well, a sizable majority of adults thought that public education had improved since they were kids. Today, only a small minority of Americans share this optimistic view. Instead, the majority now believes that schools have gotten significantly worse. Fully half of all Americans are dissatisfied with America’s public education system, a deep concern shared by black and white parents alike.

I seriously doubt public schools became much worse in any interesting way, since as far as I can tell most educational philosophies kind of work about equally well. I wonder how much this has to do with the media spreading panic.

A group of solidly middle-class Americans – our nation’s police officers – illustrate the point. A recent study showed that the average police officer could not afford a median priced home in two-thirds of the nation’s metropolitan areas on the officer’s income alone. The same is true for elementary school teachers. Nor is this phenomenon limited to high cost cities like New York and San Francisco. Without a working spouse, the family of a police officer or teacher is forced to rent an apartment or buy in a marginal neighborhood even in more modestly priced cities such as Nashville, Kansas City, and Charlotte.

Stay-at-home parents are now difficult except for the well-off.

Single mothers who have been to college are actually more likely to end up bankrupt than their less educated sisters – nearly 60% more likely.

The floor is open to anyone who would like to come up with a just-so story to explain this.

In 2001, freshman Senator Hillary Clinton voted in favor of [a bill making it harder to file for bankruptcy, which she had previously opposed violently]. Had the bill been transformed to get rid of all those awful provisions that had so concerned First Lady Hillary Clinton? No. The bill was essentially the same, but Clinton was not. As First Lady, she had been persuaded that the bill was bad for families, and she was willing to fight for her beliefs. As New York’s newest senator, however, it seems that Hillary Clinton could not afford such a principled position. Campaigns cost money, and that money wasn’t coming from families in financial trouble. Senator Clinton received $140,000 in campaign contributions from banking industry executives in a single year, making her one of the top two recipients in the Senate. Big banks were now part of Senator Clinton’s constituency. She wanted their support, and they wanted hers – including a vote in favor of [what she had previously called] “that awful bill”.

Warren’s vicious attack on Hillary Clinton was a highlight, considering the current political situation. Apparently she gave a presentation on the problems with a bankruptcy bill to Clinton when she was First Lady, Clinton was very impressed and worked really hard to fight it, and then when she became Senator she turned around and supported it. You can bet this is coming up in the next Democratic primary.

To give a sense of just how expensive subprime mortgages are, consider this: In 2001, when standard mortgage loans were in the 6.5% range, Citibank’s average mortgage rate (which included both subprime and traditional mortgages) was 15.6%. To put that in perspective, a family buying a $175,000 home with a subprime loan at 15.6% would pay an extra $420,000 during the 30-year life of the mortgage – that is, over and above the payments due on a prime mortgage. Had the family gotten a traditional mortgage instead, they would have been able to put two children through college, purchase half a dozen new cars, and put enough aside for a comfortable retirement.

I will always reblog clever ways of putting numbers in perspective. Also, whoa. Something to remember when the news talks about how Americans can’t afford to save for retirement anymore. [EDIT: Eric finds some evidence this is exaggerated]
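For anyone who wants to check the $420,000 figure, here is a rough sketch using the standard fixed-rate amortization formula. The assumptions are mine: the full $175,000 is financed with no down payment, both loans are 30-year fixed-rate with monthly compounding, and there are no fees or refinancing. The book does not say how it got its number.

```python
def monthly_payment(principal, annual_rate, years=30):
    """Fixed-rate amortized payment: P * r / (1 - (1 + r)^-n)."""
    r = annual_rate / 12      # monthly interest rate
    n = years * 12            # number of monthly payments
    return principal * r / (1 - (1 + r) ** -n)

home_price = 175_000          # ASSUMPTION: the whole price is borrowed

prime = monthly_payment(home_price, 0.065)      # standard 2001 rate from the quote
subprime = monthly_payment(home_price, 0.156)   # Citibank average rate from the quote

# Extra paid over the life of the loan, in nominal (not inflation-adjusted) dollars.
extra_total = (subprime - prime) * 30 * 12

print(f"prime payment:    ${prime:,.0f}/month")
print(f"subprime payment: ${subprime:,.0f}/month")
print(f"extra over 30 years: ${extra_total:,.0f}")
```

Under these assumptions the extra cost lands close to the quoted $420,000 (slightly above it). Note that this is a nominal total; discounting future payments would shrink it, which may be part of why the figure can look exaggerated.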

At Citibank, for example, researchers have concluded that at least 40% of those who were sold ruinous subprime mortgages would have qualified for prime-rate loans…A study by the Department of Housing and Urban Development revealed that one in nine middle-income families and one in fourteen upper-income families who refinanced a home mortgage ended up with a high-fee high-interest subprime mortgage. For many of these families there is no trade-off between access to credit and the cost of credit. They had their pockets picked, plain and simple.

This is part of what I meant when I praised Warren for being able to back up her accusations of market failure. Apparently there is not enough awareness of options for the market in mortgages to be well-functioning.

Most Americans guard their credit ratings jealously, living with a slightly prickly sensation that they could be cut off if they fell behind or forgot to pay a bill. What they don’t realize is that when a borrower makes a partial payment, when he misses a bill, and when his credit rating drops, he actually gets more offers for credit. He is not just down on his luck, behind on his bills, and short on cash, he has now joined the ranks of an elite group – The Lending Industry’s Most Profitable Customers…within six months of filing for bankruptcy, 84% of families had already received unsolicited offers for new credit.

As several commenters have pointed out, this means something different than I originally thought – having a good credit rating may still be important to get low interest rates. But the fear that you will never get credit if your rating is poor is misplaced.

What does bankruptcy have to do with abortion? In Washington, a great deal. Over the past several years, pro-choice groups had scored significant court victories against a few prominent abortion clinic protesters by obtaining money judgments against them, only to see those victories turn to dust when the protesters declared bankruptcy and discharged their debts.

In a strange twist of politics, the credit industry’s version of the bankruptcy bill had been supported by Senator Charles Schumer, of New York, who had garnered strong support among women’s groups for his pro-choice politics. Ever responsive to his constituents, Senator Schumer inserted a provision into the bankruptcy bill that would make it more difficult for abortion clinic protesters to discharge judgments entered against them if they were sued for their protest activities, much in the same way drunk drivers and embezzlers cannot use bankruptcy to discharge judgments against themselves. Eager to appeal to women voters, the Senate had accepted the amendment in 2001. But in 2002, when the bankruptcy bill went back to the House with the abortion amendment in it, a coalition of right-to-life representatives refused to go along. They brought the bill to a standstill.

Desperate to get the bill passed, the banking lobby went back to the Senate, pressuring Senator Schumer to remove the controversial abortion provision. The industry ran attack ads against him in his home state, demanding that he support the bankruptcy bill — and claiming that he was costing every American family $550 a year. (The attack on Senator Schumer was particularly ironic, since he had received more campaign contributions from the credit industry than any other Senator, just nosing out fellow New Yorker Hillary Clinton.) But by this point, the pro-choice women’s groups were also mobilized, and they held firm, supporting Senator Schumer and threatening to withhold support from any elected official who moved to take the provision out of the bankruptcy bill. In one of those rare defining moments, Senator Schumer had to choose between big business and pro-choice women, both of whom had supported his campaign. He chose women, and the amendment remained in the bill.

Ultimately, two strange bedfellows — a small group of socially conservative Republicans and a handful of progressive Democrats — gathered enough momentum to defeat the bankruptcy bill against the best-financed lobbying campaign of the 107th Congress.

I feel like if I had to send a message to aliens to tell them everything they needed to know about US politics, it would be this story.

Ozy vs. Scott on Charity Baskets

I have invited Ozy to post to Slate Star Codex. I ended up disagreeing with their first post, so I’m going to include it along with my rebuttal.

Ozy:

A man goes up to a stockbroker and says, “You guys are so stupid. You invest in more than one stock. But there’s only one stock that is going to pay off the most. Why don’t you just put all your money in the stock that is going to earn the most money, instead of putting it in a bunch of stocks?”

With my usual quick and timely response, I would like to point out the fallacy within this article on effective altruism. The author offers up several things that would, in an effective altruist world, not exist:

If we all followed such a ridiculous approach, what would happen to:

1. Domestic efforts to serve those in need?
2. Advanced research funding for many diseases?
3. Research on and efforts in creative and innovative new approaches to helping others that no one has ever tried before?
4. More local and smaller charitable endeavors?
5. Funding for the arts, and important cultural endeavors such as the preservation of historically important structures and archives?
6. Volunteerism for the general public, since most “worthy” efforts are overseas and require a professional degree to have what Friedman calls “deep expertise in niche areas”?
7. Careers in the nonprofit sector?

The answer to several of those is pretty obvious: people should work in the nonprofit sector if that’s their comparative advantage, who gives a @#$! about volunteerism or local charitable endeavors, arts funding comes out of people’s entertainment budgets the way it should, and resources are scarce and each donation to someone relatively well-off in the developed world trades off against resources from someone less well off. So far, so well-trammeled.

However, I think his points two and three are actually really interesting points. A lot of people seem to think of effective altruism as like the man who wants to invest in the best possible stock. However, in reality, just as a person who wants to maximize their returns invests in more than one stock, a society where everyone is an effective altruist would probably have a variety of different charities (although perhaps a narrower segment of charities), just as they do now.

To be clear, there are certain charities that are not effective at all and would probably not exist in a hypothetical effective altruist society. Make a Wish Foundation would probably not survive the conversion to a hypothetical effective altruist society (except, presumably, out of one’s entertainment budget). Nevertheless, the nonexistence of obviously ineffective charities doesn’t mean that we as a society would decide to have fewer charities, any more than not buying lottery tickets means that you are only allowed to invest in one stock.

(Note that I am using stocks as an analogy. Stocks and charity donations are unlike each other in a lot of ways. It isn’t a perfect metaphor. Also, I literally know nothing about stock investing.)

One of the reasons that people invest in more than one stock is uncertainty. Probably some stocks will go up and some stocks will go down. However, I, as an investor, don’t know which stocks will pay off more than other stocks. Therefore, I want to hedge my bets. Knowing that the market will go up in general, I choose to invest in a variety of different stocks, so that no matter what happens I keep some of my money.

A similar uncertainty applies to charities. For instance, it’s possible that Give Directly is run by crooks who steal all the donations. (As far as I know, Give Directly is an excellent organization and never steals anyone’s money.) If everyone has given to Give Directly, we’re screwed. If we have several different charities giving cash to people in the developing world, then it matters less that one of them is run by crooks. Similarly, we may be uncertain about whether malaria relief or schistosomiasis relief is the best bang for one’s charity buck. Given that it is impossible to eliminate all uncertainty, it’s best to direct some money towards both, so that in case malaria relief turns out to be a bust we haven’t wasted all our charitable budget.

In stocks, return is a function of risk. If there’s a chance of a large payoff, there’s an even larger chance of going bust and losing everything. If there’s not very much risk, you get payoffs that are barely larger than inflation. Therefore, you want a balanced investment strategy: have some high-risk investments that might make you rich, and some low-risk investments that have a less awesome payoff.

This also applies to charity donation, which is where we get to Berger and Penna’s concerns. Something like malaria relief is low-risk and relatively low-return. If you distribute malaria nets, it is pretty certain that people are going to have lower rates of malaria. However, there’s not much chance of getting a payoff higher than “people have lower rates of malaria, maybe no malaria at all,” which will save probably millions of lives. Compare this to, say, agronomy or disease research. With agronomy, there is a high chance that you will pour in millions of dollars and get nothing. Most agronomic research gets us, say, wheat that’s a little better at resisting weeds, or better understanding of the ideal growing conditions of the chickpea. However, there’s the slim chance that you’ll have another Green Revolution and save literally billions of lives. As effective altruists, we want to invest in both high-risk high-return and low-risk low-return charities.

Another important example of a high-risk high-return charity is a new charity, which I think is important enough that I’m going to talk about it separately. What happens if someone has a brilliant new idea about how to help people in the developing world? There’s potentially a high payoff if they can beat the current most effective charity; but new ideas for effective charities are probably not going to pay off, if for no other reason than ‘most new ideas are terrible.’ It is really important that we invest in new ideas.

What happens to a low-risk high-return charitable investment? Well, it is clearly the most effective place to donate and becomes our new baseline, and the same trilemma survives. Other charities are either comparable, and thus either higher return but higher risk or lower return but lower risk, or incontrovertibly better and the new baseline.

Please note that I’m not saying the individual should donate multiple places. Probably any individual only has time to investigate one family of charities and – for that matter – gets the most warm fuzzies from only one charity. I think that most people should probably only donate to one charity, because they can be certain they’re donating to the most effective charity they can. But what that charity is is different for different people. And, no, a hypothetical effective altruist society won’t totally lack scientific research.

Scott:

I think I disagree with this. That is, I’m sure I disagree with what I think it says, and I think it says what I think it says. I think it confuses two important issues involving marginal utility – call them disaster aversion and low-hanging fruit – and that once we separate them out we can see that diversifying isn’t necessary in quite the way Ozy thinks.

Disaster aversion is why we try to diversify our investments in the stock market. Although there’s a bit of money maximization going on – more money would always be nice – there’s also an incentive to pass the bar of “able to retire comfortably” and a big incentive to avoid going totally broke. This incentive works differently in charity.

Suppose you offer me a 66% chance of dectupling my current salary, and a 33% chance of reducing my current salary to zero (further suppose I have no savings and there is no safety net). Although from a money-maximizing point of view this is a good deal, in reality I’m unlikely to take it. It would be cool to be ten times richer, but the 33% chance of going totally broke and starving to death isn’t worth it.

Now suppose there is a fatal tropical disease that infects 100,000 people each year. Right now the medical system is able to save 10,000 of those 100,000; 90,000 get no care and die. You offer me a 66% chance of dectupling the effectiveness of the medical system, with a 33% chance of reducing the effectiveness of the system to zero. In this case, it seems clear that the best choice is to take the offer – the expected value is saving 66,000 lives, 56,000 more than at present.

The stock market example and the tropical disease example are different because while your first dollar matters much more to you than your 100,000th dollar, the first life saved doesn’t matter any more than the 100,000th. We can come up with strained exceptions – for example, if the disease kills so many people that civilization collapses, it might be important to save enough people to carry on society – but this is not often a concern in real-life charitable giving.
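The contrast between the two gambles can be sketched in a few lines. This is only an illustration. Log utility is my stand-in for "your first dollar matters much more than your 100,000th" (the argument does not depend on that exact curve), the helper names are hypothetical, and the 66%/33% odds are the rounded figures from the examples, used as-is.

```python
import math

P_WIN, P_LOSE = 0.66, 0.33   # rounded odds from the two examples

def expected(value_if_win, value_if_lose):
    """Probability-weighted average of the two outcomes."""
    return P_WIN * value_if_win + P_LOSE * value_if_lose

def log_utility(salary):
    # ASSUMPTION: log utility as a simple model of diminishing returns to money.
    # A salary of zero (no savings, no safety net) is treated as infinitely bad.
    return math.log(salary) if salary > 0 else float("-inf")

# Lives: each life counts the same, so we just compare expected counts.
lives_now = 10_000
lives_gamble = expected(10 * lives_now, 0)

# Salary: measured in multiples of the current salary, compared in utility terms.
salary_now = 1.0
utility_now = log_utility(salary_now)
utility_gamble = expected(log_utility(10 * salary_now), log_utility(0))

print("take the disease gamble:", lives_gamble > lives_now)
print("take the salary gamble: ", utility_gamble > utility_now)
```

The same odds give opposite answers: the disease gamble wins on expected lives, while the salary gamble loses as soon as going broke is weighted heavily enough.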

By low-hanging fruit, I mean that some charities are important up to a certain point, after which they become superseded by other charities. For example, suppose there is a charity researching a cure for Disease A, and another one researching a cure for Disease B. It may be that one of the two diseases is very simple, and even a few thousand dollars worth of research would be enough to discover an excellent cure. If we invest all our money in Disease A simply because it seems to be the better candidate, the one billionth dollar invested in Disease A will be less valuable than the first dollar invested in Disease B, since that first dollar might go to hire a mediocre biologist who immediately spots that the disease is so simple even a mediocre biologist could cure it.

This is also true with more active charities. For example, the first bed net goes to the person who needs bed nets more than anyone else in the entire world. The hundred millionth bed net goes to somebody who maaaaaybe can find some use for a bed net somewhere. It’s very plausible that buying the first bed net is the most effective thing you can do with your dollar, but buying the hundred millionth bed net is less effective than lots of other things.

In this case, at any one time there is only one best charity to donate to, but this charity changes very quickly. In a completely charity-naive world, Disease A might be the best charity, but after Disease A has received one million dollars it might switch to Disease B until it gets one million dollars, and then back to Disease A for a while, and then over to bed nets, and so on.

We can turn this into a complicated game theory problem where everyone donates simultaneously without knowledge of the other people’s donations, and in this case I think the solution might be to seek universalizability and donate to charities in exactly the proportion you hope everyone else donates – which would indeed be a certain amount to Disease A, a certain amount to Disease B, and a certain amount to bed nets, in the hope of picking all the low-hanging fruit before you subsidize the less efficient high-hanging-fruit-picking.

But in reality it’s not a complicated game theory problem. You can go on the Internet and find more or less what the budget of every charity is. That means that for you, at this point in time, there is only one most efficient charity. Unless you are Bill Gates, it is unlikely that the money you donate will be so much that it pushes your charity out of the low-hanging fruit category and makes another one more effective, so at the time you are donating there is one best charity and you should give your entire donation to it.
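To make the low-hanging-fruit point concrete, here is a toy sketch. Everything in it is invented for illustration: the three charities, their starting values, and the diminishing-returns curve are hypothetical, not estimates of anything real. Each chunk of money goes to whichever charity currently has the highest marginal value.

```python
def marginal_value(base, scale, funded):
    """Hypothetical diminishing-returns curve: value of the next dollar."""
    return base / (1.0 + funded / scale)

def allocate(charities, budget, step):
    """Greedy allocation: give each `step` dollars to the current best charity."""
    funded = {name: 0 for name in charities}   # ASSUMPTION: nobody has donated yet
    remaining = budget
    while remaining > 0:
        chunk = min(step, remaining)
        best = max(charities,
                   key=lambda name: marginal_value(*charities[name], funded[name]))
        funded[best] += chunk
        remaining -= chunk
    return funded

# name -> (value of the first dollar, dollars needed to halve that value); all made up
charities = {
    "disease_a": (10.0, 1_000_000),
    "disease_b": (8.0, 1_000_000),
    "bed_nets": (6.0, 5_000_000),
}

small_donor = allocate(charities, budget=1_000, step=100)
huge_donor = allocate(charities, budget=5_000_000, step=100_000)

print("small donor:", small_donor)
print("huge donor: ", huge_donor)
```

With these made-up curves the small donor's money never moves the ranking, so all of it goes to one charity, while the Bill-Gates-sized donor exhausts the low-hanging fruit and ends up splitting across all three.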

Granted, people are not able to directly perceive utility and will probably err on exactly what this charity is. But I think the pattern of errors will be closer to the ideal if everyone is trying to donate to the charity they consider highest marginal value at this particular time rather than if everyone is trying to diversify.

The reasons for diversifying in the stock market are based on individual investors’ desire not to go broke and don’t really apply here.


How Common Are Science Failures?

After a brief spurt of debate over the claim that “97% of relevant published papers support anthropogenic climate change”, I think the picture has mostly settled to an agreement that – although we can contest the methodology of that particular study – there are multiple lines of evidence that the number is somewhere in the nineties.

So if any doubt at all is to remain about climate change, it has to come from the worry that sometimes entire scientific fields can get things near-unanimously wrong, especially for political or conformity-related reasons.

In fact, I’d go so far as to say that if we are not climatologists ourselves, our prior on climate change should be based upon how frequently entire scientific fields get things terribly wrong for political or conformity-related reasons.

Skeptics mock the claim that science was wrong before, but skeptics mock everything. A better plan might be to try to quantify the frequency of scientific failures so we can see how good (or bad) the chances are for any given field.

Before we investigate, we should define our reference class properly. I think a scientific mistake only counts as a reason for doubting climate change (or any other commonly-accepted scientific paradigm) if:

1. It was made sometime in the recent past. Aristotle was wrong about all sorts of things, and so were those doctors who thought everything had to do with black bile, but the scientific community back then was a lot less rigorous than our own. Let’s say it counts if it’s after 1900.

2. It was part of a really important theory, one of the fundamental paradigms of an entire field. I’m sure some tiny group of biologists have been wrong about how many chromosomes a shrew has, but that’s probably an easier mistake to wander into than all of climatology screwing up simultaneously.

3. It was a stubborn resistance to the truth, rather than just a failure to have come up with the correct theory immediately. People were geocentrists before they were heliocentrists, but this wasn’t because the field of astronomy became overly politicized and self-assured, it was because (aside from one ancient Greek guy nobody really read) heliocentrism wasn’t invented until the 1500s, and after that it took people a couple of generations to catch on. In the same way, Newton’s theory of gravity wasn’t quite as good as Einstein’s, but this would not shame physicists in the same way climate change being wrong would shame climatologists. Let’s say that in order to count, the correct theory has to be very well known (the correct theory is allowed to be “this phenomenon doesn’t exist at all and you are wasting your time”) and there is a large group of people mostly outside the mainstream scientific establishment pushing it (for approximately correct reasons) whom scientists just refuse to listen to.

4. We now know that the past scientific establishment was definitely, definitely wrong and everyone agrees about this and it is not seriously in doubt. This criterion isn’t to be fair to the climatologists, this is to be fair to me when I have to read the comments to this post and get a bunch of “Nutritionists have yet to sign on to my pet theory of diet, that proves some scientific fields are hopelessly corrupt!”

Do any such scientific failures exist?

If we want to play this game on Easy Mode, our first target will be Lysenkoism, the completely bonkers theory of agriculture and genetics adopted by the Soviet Union. A low-level agricultural biologist, Lysenko, came up with questionable ways of increasing agricultural output through something kind of like Lamarckian evolution. The Soviet government wanted to inspire people in the middle of a famine, didn’t really like real scientists because they seemed kind of bourgeois, and wanted to discredit genetics because heritability seemed contrary to the idea of New Soviet Man. So they promoted Lysenko enough times that everyone got the message that Lysenkoism was the road to getting good positions. All the careerists switched over to the new paradigm, and the holdouts who continued to believe in genetics were denounced as fascists. According to Wikipedia, “in 1948, genetics was officially declared “a bourgeois pseudoscience”; all geneticists were fired from their jobs (some were also arrested), and all genetic research was discontinued.”

About twenty years later the Soviets quietly came to their senses and covered up the whole thing.

I would argue that Stalinist Russia, where the government was very clearly intervening in science and killing the people it didn’t like, isn’t a fair test case for a theory today. But climate change opponents would probably respond that the liberal world order is unfairly promoting scientists who support climate change and persecuting those who oppose it. And Lysenkoism at least proves that is the sort of thing which can in theory sometimes happen. So let’s grumble a little but give it to them.

Now we turn the dial up to Hard Mode. Are there any cases of failure on a similar level within a scientific community in a country not actively being ruled by Stalin?

I can think of two: Freudian psychoanalysis and behaviorist psychology.

Freudian psychoanalysis needs no introduction. It dominated psychiatry – not at all a small field – from about 1930 to 1980. As far as anyone can tell, the entire gigantic edifice has no redeeming qualities. I mean, it correctly describes the existence of a subconscious, and it may have some insightful things to say on childhood trauma, but as far as a decent model of the brain or of psychological treatment goes, it was a giant mistake.

I got a little better idea of just how big a mistake while doing some research for the Anti-Reactionary FAQ. I wanted to see how homosexuals were viewed back in the 1950s and ran across two New York Times articles about them (1, 2). It’s really creepy to see them explaining how instead of holding on to folk beliefs about how homosexuals are normal people just like you or me, people need to start listening to the psychoanalytic experts, who know the real story behind why some people are homosexual. The interviews with the experts in the article are a little surreal.

Psychoanalysis wasn’t an honest mistake. The field already had a perfectly good alternative – denouncing the whole thing as bunk – and sensible non-psychoanalysts seemed to do exactly that. On the other hand, the more you got “educated” about psychiatry in psychoanalytic institutions, and the more you wanted to become a psychiatrist yourself, the more you got biased into thinking psychoanalysis was obviously correct and dismissing the doubters as science denialists or whatever it was they said back then.

So this seems like a genuine example of a scientific field failing.

Behaviorism in psychology was…well, this part will be controversial. A weak version is “psychologists should not study thoughts or emotions because these are unknowable by scientific methods; instead they should limit themselves to behaviors”. A strong version is “thoughts and emotions don’t exist; they are post hoc explanations invented by people to rationalize their behaviors”. People are going to tell me that real psychologists only believed the weak version, but having read more than a little 1950s psychology, I’m going to tell them they’re wrong. I think a lot of people believed the strong version and that in fact it was the dominant paradigm in the field.

And of course common people said this was stupid, of course we have thoughts and emotions, and the experts just said that kind of drivel was exactly what common people would think. Then came the cognitive revolution and people realized thoughts and emotions were actually kind of easy to study. And then we got MRI machines and are now a good chunk of the way to seeing them.

So this too I will count as a scientific failure.

But – and this seems important – I can’t think of any others.

Suppose there are about fifty scientific fields approximately as important as genetics or psychiatry or psychology. And suppose within the past century, each of them had room for about five paradigms as important as psychoanalysis or behaviorism or Lysenkoism.

That would mean there are about 250 possibilities for science failure, of which three were actually science failures – for a failure rate of 1.2%.

This doesn’t seem much more encouraging for the anti-global-warming cause than the 3% of papers that support them.

I think I’m being pretty fair here – after all, Lysenkoism was limited to one extremely-screwed-up country, and people are going to yell that behaviorism wasn’t as bad as I made it sound. And two of the three failures are in psychology, a social science much fuzzier than climatology where we can expect far more errors. A cynic might say if we include psychology we might as well go all the way and include economics, sociology, and anthropology, raising our error count to over nine thousand.

But if we want to be even fairer, we can admit that there are probably some science failures that haven’t been detected yet. I can think of three that I very strongly suspect are in that category, although I won’t tell you what they are so as to not distract from the meta-level debate. That brings us to 2.4%. Admit that maybe I’ve only caught half of the impending science failures out there, and we get to 3.6%. Still not much of an improvement for the anti-AGW crowd over having 3% of the literature.
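The arithmetic in the last few paragraphs can be checked with a minimal Python sketch. The constant and function names are made up for illustration; the numbers are the guesses given in the text, not measured data.

```python
# Back-of-envelope failure rates, using the guessed numbers from the text.
FIELDS = 50               # fields roughly as important as genetics or psychiatry
PARADIGMS_PER_FIELD = 5   # major paradigms per field over the past century
KNOWN_FAILURES = 3        # Lysenkoism, psychoanalysis, behaviorism
SUSPECTED_FAILURES = 3    # failures suspected but not yet detected


def failure_rate(failures, opportunities):
    """Return failures as a percentage of opportunities."""
    return 100 * failures / opportunities


opportunities = FIELDS * PARADIGMS_PER_FIELD
known = failure_rate(KNOWN_FAILURES, opportunities)
with_suspected = failure_rate(KNOWN_FAILURES + SUSPECTED_FAILURES, opportunities)
# If the three suspected failures are only half of the undetected ones,
# there are six undetected failures in total.
with_doubled = failure_rate(KNOWN_FAILURES + 2 * SUSPECTED_FAILURES, opportunities)

print(f"{known:.1f}% {with_suspected:.1f}% {with_doubled:.1f}%")
```

Changing the guesses for fields or paradigms per field moves the percentages proportionally, which is worth keeping in mind given how rough those two inputs are.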

Unless of course I am missing a whole load of well-known science failures which you will remind me about in the comments.

[Edit: Wow, people are really bad at following criteria 3 and 4, even going so far as to post the exact examples I said not to. Don't let that be you.]


Medicine, As Not Seen On TV

Since I was twelve years old, my life has taken place in a series of Four Year Intervals.

Four years of high school. Four years of college. Four years of medical school. Four years of residency. Four times four, nice and symbolic.

This comes to mind now because I finished my first year of residency today.

I went into it raised on a steady diet of medical TV dramas like Scrubs and House, the legends passed down by other doctors in my family, and the ideas inculcated into me in medical school. It turned out to be nothing like any of those.

I’ve written a few posts about my experiences at work: The Hospital Orientation, I Aten’t Dead, Who By Very Slow Decay, and Evening Doc. I’ve tried to avoid writing anything more specific in order to protect patient confidentiality and my confidentiality.

But I thought this would be a good time to record – for my future self as much as for anyone else – what surprised me in my first year of medical practice.

To start with, forget about diagnostic mysteries. If you’ve ever seen House or anything else remotely like it, you imagine doctors as constantly presented with weird and wonderful symptoms, then racing against the clock to figure out what rare and deadly disease it is.

In real life, patients are more like the elderly lady I got last month. She had three hospital admissions for urinary tract infections in the past two years. Now she comes in with urinary symptoms. Before I even know the patient exists, the emergency room doctor has run a urine test which reveals that it’s a urinary tract infection. He has helpfully started her on the correct antibiotic for urinary tract infections. WHAT COULD THIS DIAGNOSTIC MYSTERY POSSIBLY BE?

Yeah, it was a urinary tract infection.

Or the guy who comes in shaking and sweating. I ask him what happened. He says he has been drinking alcohol for thirty years, and two days ago he tried to stop cold turkey. Have you ever had these sorts of symptoms before? Yes, every time I go off alcohol I get them. Does anything relieve the symptoms? Yes, drinking more alcohol. SOMEBODY PAGE DOCTOR HOUSE TO FIGURE OUT WHAT’S GOING ON!

Yeah, it was alcohol withdrawal.

Not all the patients I got were like this. But probably ninety-five percent of them were. Most people come into hospital for flare-ups of chronic problems they have had for, at minimum, ten years. Most of the time they have been to their primary care doctor first, who has made the diagnosis and sent the patient to the hospital for treatment. Or if not, they go to the emergency room, where the emergency room doctors do the same standard blood test they do on everybody and which usually gives you a really good idea what’s up. Oh, you’re feeling sick and tired and thirsty and nauseous? Hmm, your blood glucose is five hundred. Are you a diabetic? Did you take your insulin? Why didn’t you take your insulin? “Being on vacation” is not a good reason to stop taking your insulin! Do you promise to take your insulin in the future? Okay, well let’s admit you to the hospital and send you to Dr. Alexander so he can clear up this massive medical mystery we have on our hands.

But okay, five percent of cases we’re not entirely sure what’s going on. Now we can page Dr. House, right?

Wellll, in reality we “stabilize” them. A lot of the time “stabilize” means “put them in a bed and give them IV fluids and they get better on their own”. Sometimes the problem looks vaguely infectious and so we give empiric antibiotics, where empiric means “let’s give them an antibiotic that works for lots of stuff, and maybe it’ll work for this”. Sometimes the problem looks vaguely autoimmune and we give them steroids.

It’s pretty funny, because in medical school you spend a lot of time learning about maybe two dozen very rare autoimmune diseases, and how to differentiate Wegener’s granulomatosis from Takayasu arteritis, and the very subtle differences in the aetiology of each. And in real life, my attending says “Huh, this looks vaguely autoimmune, let’s throw steroids at it.” And it always works.

Now I understand that when the patient leaves hospital, they go to a rheumatologist or other specialist, and the specialist probably does lots of complicated tests and then comes up with a treatment regimen perfectly suited to that patient. But at the level I’m working at, it’s more “Hey, it responded to steroids! I guess it really was autoimmune! Or maybe the patient just got better on her own. Or something. Anyway, who cares, patient’s better, let’s discharge before something goes wrong.”

Because something else always goes wrong. You may be wondering: if doctors don’t spend their time solving diagnostic mysteries, what do they do in all those long hours they work? The answer is: deal with the avalanche of disasters that inevitably begin the second a patient walks through the door into a hospital.

I want to make it very clear I’m not criticizing my own hospital here. They make an amazing effort to do everything possible to avoid dangerous complications. All the hospitals I’ve worked at do. And all of them are death-traps. God just has a particular hatred for hospital patients, which He expresses by inflicting random diseases upon them for so long as they make the mistake of staying within the four walls and ceiling of a hospital building.

Like, you can be a perfectly healthy person, who lives forty years without anything worse than a sniffle. And then one day you’re playing sports, and you break your leg and you think “What’s the worst that can happen, I’ll spend a day or two in the hospital?” and by the time you come out you’ve got two artificial legs and a transplanted kidney and a rare bunyavirus from the African tropics and you have to inject yourself with insulin every three hours or else you die.

There are some good reasons for this. Obviously hospitals are full of sick people, which means the potential for contagious infections is high. People in hospitals are always getting lines stuck into them and surgeries performed and otherwise having foreign objects stuck in the body, and of course that’s a risk factor for all kinds of stuff. People in hospitals are often taking medications, which often have side effects. People in hospitals are often having tests, which sometimes involve injecting large amounts of radioactive material into the body and hoping it doesn’t fry anything important.

Then there are reasons you never expect until someone teaches you about them. If you don’t move your legs enough – maybe because you’re lying in a hospital bed all day – the blood in your legs settles and clots, and then the blood clots travel to your lungs, and then you can’t get any oxygen and potentially die. If you don’t fidget enough – maybe because you’re lying in a hospital bed unconscious – the constant pressure on a single patch of skin produces an ulcer, which gets infected and you potentially die. If you take five different recreational drugs every day, and your dealer doesn’t visit you in the hospital, then you go into withdrawal, and if you don’t want to admit what’s going on to your doctor maybe they miss it and – yeah, you potentially die.

But probably the biggest reason – and one you never think of – is that the hospital is where they’re finally doing tests on you, which means all those diseases that were lying dormant before and which you put down to normal old age finally get detected. You come in for a kidney stone, but your doctor does a blood test and finds you have diabetes. Also your calcium is a little off, we’re going to need to give you calcium pills and set up an appointment to get your parathyroid checked. And also when they did the CT of the kidneys they found a suspicious-looking mass in the colon, so you’re going to have to get that checked out. Uh, the gastroenterologist pulled the joystick controlling the colonoscope a little too hard and now you have a perforated colon, you need surgery. Uh, the surgeon put on her gloves the wrong way, now the surgical site is infected, guess you need antibiotics. Uh, guess you’re allergic to that antibiotic, let’s use a different one. Wow, allergic to four antibiotics in a row, guess this isn’t your day!

While Dr. House is diagnosing Chikungunya fever, the rest of us are treating the person who came in with a nosebleed (final diagnosis: blew nose too hard) but now has a DVT, hyperkalaemia, Sundowner’s syndrome, and a line infection.

Well, sort of treating.

John Searle came up with this really interesting philosophy-of-consciousness thought experiment. Suppose that a man were put in a room with a bunch of books, each of which contained a set of rules about Chinese characters. Sometimes, a paper with Chinese characters would come in through a slot in the door. The man would apply the rules in his book, which told him to write certain Chinese characters if certain conditions about the characters on the paper held true, and slip the output back through the slot in the door. The man does this faithfully, although he doesn’t know any Chinese and has no idea what any of it is saying.

On the other side of the door is a Chinese person. In her mind, she’s writing questions to the man, and he is responding back in fluent Chinese. She thinks they’re having a very productive conversation, and is starting to get a crush on him.

And the question is, in what sense can the man in the room be said to “understand” Chinese? If the answer is “not at all”, then in what sense can the brain – which presumably takes inputs from the environment, applies certain algorithms to them, and then sends forth appropriate outputs – be said to understand anything?

Daniel Dennett and various other materialist philosophers have a response to this challenge, which is that the man does not understand Chinese, but the man, his books, and the room can be conceptualized as an emergent system that does possess the property of Chinese-understanding and which may or may not be conscious.

I bring this up, because I understand what’s going on with patient care about as well as the man understands Chinese. I feel like maybe the hospital is an emergent system that has the property of patient-healing, but I’d be surprised if any one part of it does.

Suppose I see an unusual result on my patient. I don’t know what it means, so I mention it to a specialist. The specialist, who doesn’t know anything about the patient beyond what I’ve told him, says to order a technetium scan. He has no idea what a technetium scan is or how it is performed, except that it’s the proper thing to do in this situation. A nurse is called to bring the patient to the scanner, but has no idea why. The scanning technician, who has only a vague idea why the scan is being done, does the scan and spits out a number, which ends up with me. I bring it to the specialist, who gives me a diagnosis and tells me to ask another specialist what the right medicine for that is. I ask the other specialist – who has only the sketchiest idea of the events leading up to the diagnosis – about the correct medicine, and she gives me a name and tells me to ask the pharmacist how to dose it. The pharmacist – who has only the vague outline of an idea who the patient is, what test he got, or what the diagnosis is – doses the medication. Then a nurse, who has no idea about any of this, gives the medication to the patient. Somehow, the system works and the patient improves.

The patient thinks “My doctor must be very smart”. Meantime, the girl outside that room in the thought-experiment is thinking “This man must be a brilliant Confucian scholar.”

Part of being an intern is adjusting to all of this, losing some of your delusions of heroism, getting used to the fact that you’re not going to be Dr. House, that you are at best going to be a very well-functioning gear in a vast machine that does often tedious but always valuable work.

Well, other people are. I plan to go into outpatient.

Starting tomorrow, I abandon this exciting world of urinary tract infections and broken legs and go into psychiatry full time. I’m looking forward to it, especially since psychiatry is a little slower-paced and more focused. But this year was meant to teach me some appreciation for the wider world of medicine.

And boy have I got it.

[Good luck to SSC commenters Athrelon and Laura and everyone else starting an internship or residency tomorrow, and congratulations to everyone finishing one up]

Invisible Women

A commenter asked in yesterday’s discussion on The Two-Income Trap: why did the entry of women into the workforce produce so little effect on GDP? Here’s the graph of US women’s workforce participation:

And here’s the graph of US GDP.

In about 50 years – 1935 to 1985 – we went from 20% of women in the workforce to 60% of women in the workforce. Assuming a bit under 100% of men in the workforce, that’s an increase of almost 50% over the expected trend if number of women in the workforce had stayed constant.

I’ve already admitted I find the fixedness of the GDP trend bizarre. But how on Earth do you unexpectedly raise the number of people in the workforce by 50% and still stick to exactly the same GDP trend? It would be like the US annexed Mexico one day but the GDP didn’t change a bit. Are we imagining here that if women hadn’t entered the workforce, GDP would have suddenly deviated downward from the trend, and totally by coincidence women rushed in to save the day? And then when we ran out of interested women to add to the work force, again by coincidence the GDP stabilized back to its trend line?

Possible solutions:

1. The GDP data is totally false. I am suspicious of that GDP data anyway. Maybe some joker in the Bureau of Labor Statistics just graphed out an exponential function and reported that as our GDP to see if anyone would notice.

2. The GDP data is low resolution. So low-resolution that even a change of 50% is invisible if stretched over a long enough time period.

3. Women contributed through unpaid labor in the home, and their paid labor substituted for that but didn’t add to it. But as far as I know GDP only counts paid-for goods. Not only should women’s labor in the home not have counted, but GDP should overcount the benefits of putting women to work because paid daycare and so on appear as valuable new services.

4. Somehow in total contradiction to usual economic theory, all gains made by women came out of the pockets of men, leaving the same growth as would have happened anyway.

This last one segues into a question asked by another commenter – did women entering the workforce drive down male wages?

This seems a lot like the question “do immigrants entering the workforce drive down native-born wages?” to which economists tend to answer “no” with more or fewer caveats. But the economists’ explanation for the immigrant effect is that in addition to producing, immigrants also consume, increasing demand.

Women were presumably already consuming. Back when they were housewives, they still needed houses, food, clothes, entertainment, et cetera. Their entrance into the workforce may create slightly more demand – for business clothes and office supplies, for example – but nothing like the demand created by immigrants entering the country.

If women increase the labor supply by 50% but don’t change the demand for labor, that seems like it should make wages go way down. Even if women are primarily in different jobs than men, men’s wages should still go down via a substitution effect (that is, even if women all become elementary school teachers, then a man who was planning to be an elementary school teacher before might become a fireman instead, raising the labor supply for firemen and pushing down fireman wages).

But the only study I can find to investigate this says it didn’t happen. And male wages don’t seem to have dropped dramatically starting 1950 or so. They do seem to have stagnated dramatically starting 1970 or so, but I feel like the income inequality explanation for that is on pretty solid ground. Unless you want to argue that the reason for income inequality is that between both sexes there’s now such a large reserve army of labor that the capitalists can get away with paying the workers very little. But I feel like economists would have told us if this were going on.

I hear a lot of conspiracy theories about women in the workforce. On the left, women in the workforce are being exploited and kept down by a sinister patriarchy. On the right, women in the workforce are a satanic plot to weaken our moral fiber.

But so far I’ve never heard the conspiracy theory that women never actually entered the workforce, that all the working women you see around you are animatronic robots or carefully crafted stage illusions.

Maybe this is the surest sign that the conspiracy is working.

[EDIT: Commenter Pseudo-Erasmus presents a very plausible theory that total number of hours worked stayed about the same due to men entering the work force later, retiring earlier, and working shorter weeks]

Book Review: The Two-Income Trap

A long time ago I wrote a kinda-tongue-in-cheek defense of keeping modafinil – a relatively safe and effective stimulant – illegal. My argument was that if everybody can use stimulants to work harder and sleep less without side effects, then people who work very hard and don’t sleep will become the new norm. All the economic gains produced will go into bidding wars over positional goods, and people will end up about as happy – and with about as much stuff – as they have right now. Except the workday will be sixteen hours, the few people who can’t tolerate the stimulants will be at a profound disadvantage, and when the side effects reveal themselves twenty years down the line, everyone will be too financially invested in the system to stop.

In other words, in a sufficiently screwed-up system, doubling everyone’s productivity is a net loss. The gains get eaten up by proportional increases in the prices of positional goods, and you’re left with nothing except complete dependence on a shaky advantage that could disappear at any time.

I don’t know exactly how serious I was. But Elizabeth Warren makes almost the exact same argument in The Two-Income Trap, and I’m pretty sure she’s very serious. At least, she used it as a platform that got her elected to the US Senate, which is a kind of serious.

So on the advice of Alyssa Vance, I decided to take a look.

I.

Warren’s not talking about stimulants. She’s talking about the effect of an extra family income – usually moving from a system where the husband works outside the house and the wife stays at home, to a system where both parents work outside the house. Like a stimulant that removes the need for sleep, this can be expected to double economic productivity and family income.

In practice it doesn’t, because wives usually earn less than their husbands, but it comes pretty close. The average family income in the 1970s was around $40,000. The average family income in the 2000s was around $70,000 (all numbers in the book and in this post can be considered already adjusted for inflation). The husband’s income didn’t change much during this time, so the gain was due mostly to the wife getting an extra $30,000.

If families now have twice the income of families in the 1970s – who themselves were usually pretty financially secure and happy – then people should be really secure and rich now, right? But Warren meticulously collects statistics showing that the opposite is true. Home foreclosures have more than tripled in the past generation –

[Sorry, I feel at this point I should mention that my edition of the book was published in 2004, so all of these statistics about how awful home foreclosures are and everything are before the housing bubble burst and before the Great Recession. All of these statistics were when we were supposedly in a boom economy. You can assume that now they're much, much worse.]

– Sorry, where were we? Oh right. Home foreclosures have tripled in the last generation. Car repossessions doubled in the five years before the book was published. Bankruptcies have approximately quintupled since 1980. Over the same period, credit card debt has gone from 4% of income to 12%, and average savings have gone from 10% of income to negative.

Seventy percent of Americans say they have so much debt burden that “it is making their home lives unhappy”. In 2004, for the first time, “get out of debt” passed “lose weight” for Most Popular New Year’s Resolution.

So, Warren argues, the common-sense conclusion that a modern family making $70,000 is nearly twice as well-off as a traditional family making $40,000 clearly doesn’t hold. Why not?

II.

One thing that finally got me writing this up was a post on Bleeding Heart Libertarians which, like all posts on Bleeding Heart Libertarians and in accordance with the philosophy of the same name, was about how although libertarianism is commonly thought of as a heartless philosophy it can actually be reconciled with the care/harm-based ethic of deep compassion for the weak and needy.

Wait, sorry, actually it was about how we should cancel Social Security and let old people starve to death on the streets:

The baby boomers spent their entire lives buying new cars they didn’t need, buying houses that were too big, taking extra vacations, splurging on eating out, and the like. They enjoyed a higher standard of living than they could really afford. Why? Because they figured that when they retired, they could just use their voting power to force younger generations to pay for their retirement. These selfish narcissists pretty much want to steal as much as they can from their children. So, while I, Jasper, and my good twin brother Jason put tens of thousands of dollars into index funds each year, thereby forgoing fancier cars, vacations, and the like, the selfish, narcissistic baby boomers laugh gleefully, knowing that they’ll find a way to eat our nest eggs.

Jason is of course a sensitive soul and feels bad for these boomers. Not me. I say let them die. They knew what they were doing, and they spent their entire adult lives making the wrong choice over and over and over again. Does starving on the streets seem too inhumane? No problem. You’ve read Logan’s Run, right? Good idea, but wrong age limit.

This claim is pretty common. If true, it would explain the phenomenon cited above – that even with twice as much money, the Boomer generation is much less financially stable than their parents’ generation. But in Chapter 2 of Two-Income Trap, “The Over-Consumption Myth”, Warren tears it apart.

The Boomers “spent their entire lives buying new cars they didn’t need”? Warren, page 47:

When we analyzed unpublished data by the Bureau of Labor Statistics, we found that the average amount a family of four spends per car is twenty percent less than it was a generation ago. [Families spend $4000 more on automobiles in general, but instead of luxuries they are spending it on] something a bit more prosaic – a second car. Once an unheard-of luxury, a second car has become a necessity. With Mom in the workforce, that second car became the only means for running errands, earning a second income, and getting by in the far-flung suburbs.

In other words, it sounds like a family with two working parents requires two cars as a sound money-making strategy, but that Boomers compensate by spending less per car than past generations.

The Boomers “splurge on eating out”? Warren again:

Today’s family of four is actually spending 22 percent less on food (at home and restaurant eating combined) than its counterpart of a generation ago.

The Boomers “buy houses that are too big?” Warren:

The size and amenities of the average middle-class family home have increased only modestly. The median owner-occupied home grew from 5.7 rooms in 1975 to 6.1 rooms in the late 1990s – an increase of only half a room in more than two decades…the data showed that most often that extra room was a second bathroom or third bedroom.

The BHL article doesn’t mention appliances, but in case you were worried, moderns spend 44% less on appliances than their parents’ generation, which is partly compensated for by a 23% increase in home entertainment (probably things like DVD players). Warren says that:

This same balancing act holds true in other areas. The average family spends more on airline travel than it did a generation ago, but less on dry cleaning. More on telephone services, but less on tobacco. More on pets, but less on carpets. And when we add it all up, increases in one category are offset by decreases in another. In other words, there seems to be about as much frivolous spending today as there was a generation ago…Sure, there are some families who buy too much stuff, but there is no evidence of any epidemic in overspending – certainly nothing that could explain a 255% increase in the foreclosure rate, a 430% increase in the bankruptcy rolls, and a 570% increase in credit card debt. A growing number of families are in terrible financial trouble, but no matter how many times the accusation is hurled, Prada and HBO are not the reason.

Curiouser and curiouser. Today’s families earn twice as much, spend the same amount on luxuries, yet are much less financially secure.

III.

So as to not keep anyone in suspense: the problem is nice suburban houses in good school districts.

Around a vague period of time centering on the 1970s, a couple of things happened.

First, the cities became viewed, rightly or wrongly, as terribly unsafe ghettos full of drugs and gangs and violence. As far as I can tell, this is a pretty accurate description of the 70s, although things have gotten a little better since then. Families didn’t want their children living in terribly unsafe ghettos full of drugs and gangs and violence, so they moved to the suburbs. Warren gives the testimony of a suburban mother:

We were close to The Corner and I was scared for my sons. I didn’t want them to grow up there. I wanted something away from this neighborhood to get my boys out to better schools and a safer place. The first night in [my new] house, I just walked around in the dark and was so grateful…at this house, it was so nice and quiet. My sons could go outdoors and they didn’t need to be afraid. I thought that if I could do this for them, get them to a better place, what a wonderful gift to give my boys. I mean, this place was three thousand times better. It is safe with a huge front yard and a backyard and a driveway. It is wonderful. I had wanted this my whole life.

Second, education started to be really, really important. As Warren puts it:

A generation or so ago, Americans were more likely to believe that there were many avenues for a young person to make his way into the middle class, including paths that didn’t require a degree. I recall my parents encouraging me to attend college, since my grades were high and they hoped I might become a teacher one day. But they were equally pleased when my eldest brother joined the Air Force, my middle brother entered a skilled trade, and my youngest brother became a pilot – even though all three of the boys had given up on college. My parents’ views were pretty typical a generation or two ago. Education was valued, but no one in our neighborhood would have claimed it was the single most important determinant of a young person’s success.

Warren is a Harvard professor. Think about that for a second. How many Harvard-professor-producing-type families can you think of today who are also happy with three of their children getting non-college-degree jobs? As Warren puts it in what might be my favorite passage from the whole book:

97% of Americans agree a college degree is “absolutely necessary” or “helpful” compared with a scant 3% claiming that a degree is “not that important”. According to one recent poll, 6% of our fellow citizens believe the Apollo moon landings were faked. In other words, Americans are twice as likely to believe that man never walked on the moon as they are to believe that a college degree doesn’t matter!

Certain school districts are known to be vastly superior to other school districts in terms of test scores, college admissions, et cetera. Usually these are school districts inhabited by rich people with very high property taxes and therefore very high levels of per-pupil spending in schools – although we’ll get back to that eventually.

These school districts are positional goods. Not everyone can be in the best school district. Only the people willing to spend the most money on their houses can be in the best school district. But rightly or wrongly, people believe that being in the best school district is vital for their children to succeed and become Harvard professors, as opposed to gang members or drug addicts or menial laborers. As Warren puts it, good education is the ticket to the middle class. And being in the lower class is too horrible to contemplate.

People want the best for their children [citation needed]. They’re not going to say “Well, we aren’t as rich as those other people, so we should probably live in a crappy school district with other people of our approximate wealth level”. They’re going to leave no stone unturned. And there are two big stones available for modern middle-class families: working-motherhood and debt.

If your family earns $70,000 and the other family earns $40,000, you have $30,000 extra to convince the banks to give you a really big mortgage so you can buy a much nicer house and get your kid into Oak Willow River View Hills Elementary, while their kid has to go to City Public School #431 and get beaten up by scary gang members every recess.

On the other hand, this is everybody’s cunning plan, so what you end up with is all houses costing a lot more, everyone working two jobs without any extra money, everyone burdened with massive debt, and everyone living exactly where they would have anyway.

Warren lists some points in support of her hypothesis:

A study conducted in Fresno found that, for similar homes, school quality was the most important determinant of neighborhood prices – more important than racial composition of the neighborhood, commuter distance, crime rate, or proximity to a hazardous waste site. A study in suburban Boston showed the impact of school boundary lines. Two homes located less than half a mile apart and similar in nearly every aspect will command significantly different prices if they are in different elementary school zones. Schools that scored just 5% better on fourth-grade math and reading tests added a premium of nearly $4,000 to nearby homes, even though these homes were virtually the same in terms of neighborhood character, school spending, racial composition, tax burden, and crime rate.

A lot of the causal claims here are very complicated and iffy at best, but here are two numbers that cut through a lot of the debate: between 1984 and 2001, the median home value of the average childless couple increased 26%; the median home value of the average couple with children shot up 78%. So families are spending a lot more on houses nowadays and the disparity seems to be heavily concentrated in families with children. Combine that with the observation that houses only have 0.4 more rooms today, and you get a pretty good argument that families with children are competing much more intensely on house location.

IV.

When Warren does a very unofficial Fermi-estimate style breakdown of what is happening to the extra $30,000 that modern two-income families earn over traditional one-income families, she thinks they are paying about $4,000 more on their house, $4,000 more on child care, $3,000 more on a second car, $1,000 more on health insurance, $5,000 more on education (preschool + college), and $13,000 more on taxes.

The taxes are not a result of higher tax rates nowadays, just a result of the family making more money and so having to give more money – plus maybe being in a higher tax bracket. The health insurance isn’t surprising either to anyone who’s been paying attention. And the $4,000 extra on the house is a big part of what she’s been talking about the whole time.

The $4,000 on child care, $3,000 on the extra car, and $13,000 on taxes are the results of the second income. Mom needs a car to get to work, the children need care now that Mom’s not home to look after them, and not only does Mom get taxed but Dad may move into a higher bracket. That means that of the $30,000 Mom takes home, $20,000 gets spent on costs relating to Mom having a job – meaning that Mom’s $30,000 job only brings in $10,000 in extra money.
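The arithmetic in that paragraph can be checked with a short sketch. The dollar figures are the ones quoted above from Warren's rough breakdown; the variable names are mine, chosen only for illustration.

```python
# Warren's rough annual figures as quoted in this review (dollars).
second_income = 30_000

# Costs that exist only because there is a second earner.
work_related_costs = {
    "child care": 4_000,
    "second car": 3_000,
    "extra taxes": 13_000,
}

total_cost = sum(work_related_costs.values())
net_gain = second_income - total_cost

print(f"Cost of having the job: ${total_cost:,}")
print(f"Left over from the second income: ${net_gain:,}")
```

The remaining $10,000 is what then gets split between the house, education, and health insurance in her breakdown.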

The $5,000 on education is a bit more complicated. In Warren’s example family, it’s spent on preschool. She points out how a generation ago, practically no one went to preschool, whereas nowadays it is viewed as another one of those important legs up (“If little Madison doesn’t get into the best preschool, she’ll never be able to make it into the science magnet school, which means she’ll be unprepared for high school, which means Harvard goes out the window”). Warren points out that today two-thirds of American children attend preschool, compared to four percent in the mid-1960s. Once again, parents are told if they want the best for their kids they need to compete for good preschools:

The laws of supply and demand take hold, eliminating the pressure for preschool programs to keep prices low. A full-day program in a preschool offered by the Chicago public school district costs $6,500 a year – more than the cost of a year’s tuition at the University of Illinois. High? Yes, but that hasn’t deterred parents. At one Chicago public school, there are ninety-five kids on a waiting list for twenty slots.

It’s a little bit sleight-of-hand-y to put that in the family budget as Warren does – preschool only takes up two years of a child’s life, for a total of four years per two-child family. But I forgive her because college expenses are higher and also need to be budgeted for. Also, she’s saying her $4,000 child care estimate is for one child, which means that once the second child is out of preschool she’ll need to be in child care as well, for an insignificant price drop.

So I think Warren partially supports her points. The second income goes partially to increased house costs due to bidding wars, partially to increased education costs due to bidding wars, and partially to supporting the ability to have a second income. In her (admittedly slightly cooked) model, the family’s discretionary income – what it has left to spend on variable expenses like food and luxury goods – actually decreased from the 1970s one-income family to the present, $17,834 to $17,045.

V.

In my essay on stimulants, I suggested that the benefits of the stimulants would be wasted on positional goods, leaving only the side effects. In the same way, Warren says the benefits of the second income are lost, but the side effects remain.

The most important side effect she talks about is the loss of flexibility.

One nice thing about having a non-working mother is that she can, on relatively short notice, become a working mother. This is especially true in the Old Economy where even people without much college education could get okay jobs.

In the old model, financially healthy families subsisted on one income, and financially unhealthy families put the mother to work to get back on their feet. The most common disasters were the husband getting fired or a family member becoming sick. If the husband got fired, then even if he could get a job relatively soon afterwards it might be at lower pay until he could work himself back up the totem pole. Suppose he loses his $40,000 a year job and can only find a $30,000 a year job. Luckily, as we already established, a wife’s second income can contribute $10,000 to the family. So she goes to work, they have as much money as they did before, and they are able to pay off their debts and continue to have a good quality of life.

Even if the wife doesn’t go back to work, having a flexible person with lots of free time is a huge benefit. If Grandma gets very sick, the wife has a lot of time available to take care of her – whereas now, if Grandma gets sick, either one parent has to quit their job to take care of her (meaning that standard of living goes way down and the family is at risk of not being able to pay debts it took out when their prospects looked much higher) or Grandma gets sent to a nursing home, which is very expensive and also risks unpaid debts or loss of standard of living.

Last of all, it means that getting a nice suburban home is more important than ever. If in the old days children spent most of their time with their mothers, it might be possible for the mother to pass down important values like education and hard work to her children. When mothers have very limited time with their kids, schools and peer groups take over a lot of the socialization role. For example, a mother with a very young son might talk to him, read to him, take him to children’s museums, et cetera, providing the crucial intellectual stimulation that children need at an early age to develop their full brainpower. If the mother works full-time, then it becomes really imperative to get the son into preschool to make sure he’s not just sitting around staring at a wall and losing brain cells. If the mother isn’t around much when the child is ten, it becomes a lot more important to be certain he’s in a good elementary school that’s teaching him the right values. If you can’t watch your kid to make sure he’s not doing drugs, it’s more important his school be drug-free. And so on. I don’t know to what degree any of these social psychological hypotheses are true, but the important thing is that people think they are and so the competition for nice neighborhoods and nice schools intensifies.

The last loss of flexibility Warren talks about is divorce. Something like a third of couples with children can expect to get divorced. Consider a scenario where a working single mother gets the house and custody over the children. If the house took two incomes to afford, she’s not going to be able to afford to keep her house. Suggestions that the father be forced to pay more child support don’t work – unless he pays 100% of his earnings to her, she’s not going to have as much money as the couple did when they bought the house – and they deliberately spent every cent they could on the mortgage because if they didn’t they would be outcompeted by people who did and their kids would end up in gritty urban school districts and never get into Harvard.

So Warren says that the reason so many families go bankrupt or get into debt is because the extra income doesn’t make a difference, but the loss of flexibility does. Everything has been sunk into the home for fear of getting outcompeted. And that means when someone loses their job – and Warren calculates that in a two-income family, this will happen to one parent or the other about once every sixteen years on average – or costs go up even a little, there is no buffer room and the only solution is to go deeper into debt. That just adds another unpayable cost – interest – and means the whole thing can only end in bankruptcy.

In another of my favorite passages, Warren notes that if the myth of over-consumption were true – if the guy in Bleeding Heart Libertarians were exactly right – there would be no problem. In fact, she encourages families to overconsume as the road to financial health. She says families should save, but if they can’t save, they should spend their money on restaurants, vacations, jewelry – anything but large fixed monthly costs like houses, cars, schools, et cetera. That way, when something goes wrong, they can easily just stop taking the vacations and be back to financial health. It’s only when money is trapped in mortgage payments that can’t be gotten rid of that things can get as bad as they are.

VI.

There’s a chapter on debt. It’s really cute. She’s all like “Did you know there are things called subprime mortgages? And that some people think banks might give them out too easily? I sure hope this doesn’t cause something bad to happen.”

I am pretty sure no modern reader needs this chapter, but it sure increases her credibility.

VII.

Oh, right, I’m supposed to have an opinion.

Let’s start with the negatives. I don’t think she does a great job of proving her housing-school-positional-goods theory. When she talks about school district effects on housing prices, she comes up with numbers like “a 5% difference on test scores adds $4,000 to housing costs.” Okay. That means, assuming linearity, that a 50% difference on test scores – which is way more than we could possibly expect schools to produce – would only add $40,000 to house costs. When house prices for the middle class are routinely around $200,000 to $300,000, that’s just not enough to be causing the destruction of the American family.

Likewise, studies that look for effects of school district on house price – usually by looking at otherwise identical houses on either side of a school district line – generally find modest effects.

The whole area is really hard to research. Suppose Neighborhood A has lots of minorities, low house prices, and bad schools. Neighborhood B has few minorities, high house prices, and good schools.

You can tell a story where Neighborhood B’s good schools raise land value, which prevents crime and pushes out minorities. Or you could tell a story where Neighborhood B’s high land values push out minorities and increase property taxes which improve the schools. Or you can tell a story where Neighborhood A’s many minorities cause racist homebuyers to stay out, depressing land values, and also minorities tend to have worse school performance. Except in real life there are like twenty factors like this rather than three. Although lots of different studies try to control for confounders, that’s always hard and requires a lot of assumptions that might not necessarily be true.

There’s another problem, which is that the usual measure of school quality – standardized test scores – is not necessarily the one families are going to be looking at. Suppose only a few very smart people know where to look for standardized test scores. Maybe everyone else tries to guess at how good schools are. Maybe those people assume that schools with higher percent minorities are worse. Maybe they assume that schools in prettier neighborhoods with higher land values are better. In that case, studies could find all they wanted that test scores don’t correlate with home prices, because what’s actually happening is that high home prices are causing belief in school superiority which is causing higher home prices.

But a bigger problem here is that the average family only spends $4,000/year more on housing than they did a generation ago. Warren can talk all she likes about how that forces families to adopt a second job, but it’s really not a very big share of what the second job’s meager extra income is being spent on. The average husband earns $3000 more at his own job nowadays, which means that it would be possible in theory for him to soak up pretty much all of the extra housing cost. To say the wife gets a $30,000 extra job just to soak up $1,000 in extra mortgage money seems like a stretch, even though Warren does a good job of pointing out how many extra burdens this places on people. But when you add positional education costs to the mix – preschool and college – it becomes a little more believable.

I guess it’s just hard making the numbers add up. Suppose you have two kids, but they’re not in preschool – or that you’re indifferent to preschooling your kids versus having the mother take care of them. Then the costs of the mother getting her $30,000 job are $24,000 – $13,000 in extra taxes, $8,000 in child care, and $3,000 in a second car. Are mothers really so desperate they’ll work full-time for the extra $6,000? Doesn’t this whole model break down once the mother gets a raise and starts making $40,000?
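Here is a minimal sketch of that no-preschool scenario, plus the raise question. The cost figures come from the paragraph above. The marginal tax rate on the raise is purely my assumption, picked for illustration and not a number from Warren, and I also assume child care and car costs stay put when she gets the raise.

```python
def net_gain(second_income, costs):
    """Return what is left of the second income after work-related costs."""
    return second_income - sum(costs.values())


# Two kids, both in child care instead of preschool (figures from the text).
costs = {
    "extra taxes": 13_000,
    "child care for two": 8_000,
    "second car": 3_000,
}
base = net_gain(30_000, costs)

# ASSUMPTION: the extra $10,000 from a raise is taxed at 50% and nothing
# else changes. This rate is hypothetical, not taken from the book.
ASSUMED_MARGINAL_RATE = 0.5
raise_amount = 10_000
after_raise = base + raise_amount * (1 - ASSUMED_MARGINAL_RATE)

print(f"Net gain at $30,000: ${base:,}")
print(f"Net gain at $40,000 (assumed rate): ${after_raise:,.0f}")
```

Under that assumed rate the raise nearly doubles what the job actually brings home, which is why the model looks so sensitive to the exact salary.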

VIII.

How about the good?

The good is that Warren backs all her points up with excellent statistics, is very good at explaining complicated economic things, and has exactly the right level of contempt for everyone in politics.

Her view on politics is very very close to my heart. My impression is that she thinks of it as noise. It’s not good, it’s not evil, it’s something that you have to adjust for. Like, “Well, this would be a good policy but we could never pass it because the Left would throw a fit, this other thing is a good policy but we could never pass it because the Right would throw a fit, but I’m pretty sure this third thing would also help and not get anybody too enraged.” For example:

The politics that surrounded women’s collective decision to migrate into the workforce are a study in misdirection. On the left, the women’s movement was battling for equal pay and equal opportunity, and any suggestion that the family might be better off with Mother at home was discounted as reactionary chauvinism. On the right, conservative commentators accused working mothers of everything from child abandonment to defying the laws of nature. The atmosphere was far too charged for any rational assessment of the financial consequences of sending both spouses into the workforce. The massive miscalculation ensued because both sides of the political spectrum discounted the financial value of the stay-at-home mother. There was no room in either worldview for the capable, resourceful mother who might spend her days devoted to the roles of wife and mother but who could, if necessary, dive headlong into the workforce to support her family. No one saw the stay-at-home mom as the family’s safety net.

(in case you’re wondering, she doesn’t recommend women leaving the workforce. She says families where both parents want to work should keep one of the two incomes in reserve by either saving it or spending it on non-fixed luxury items. She admits that this is unfair because they will have problems getting into the best school districts, but says it is the safest solution until the wider societal problems are fixed.)

As a result of her disdain for established partisan groups, she manages to totally transcend politics. I noticed that when the Bleeding Heart Libertarians article got up on Xenosystems, one commenter protested:

The accusations of excess are no doubt sound but I always pause when someone mentions the housing excess of the boomer generation. They bought giant houses in suburbia, but how much of that was due to the lack of civilization in the city limits? If there was a sane enforcement of laws and no public schools or at least public schools where you didn’t fear for the safety of your children would they have bought so many giant houses?

In other words, the commentariat of one of the larger reactionary blogs is more or less on the same page as the Democratic Senator being pushed by the liberal wing of her party to run for President.

Her proposed solutions are also all over the map. Yes, she pushes for taxpayer-funded universal preschool, which should make liberals pretty happy. But she also pushes for school vouchers, which she hopes will decouple school quality from housing prices and let people live wherever they want and still be able to get an acceptable education for their children. She even has a states’-rights-style solution to one problem – she points out that banks used to be kept under control very well by state laws until the Supreme Court legalized free interstate commerce between banks, which meant all of them moved to the states with the fewest regulations and could not be kept under any control at all. In order to rein in banks again, all we need is for Congress or the courts to grant those powers back to the states.

And I will say one more thing in Senator Warren’s favor. She often suggests non-free-market solutions, like regulating something or banning something or proposing the government spend money on something. Every time she does this, she says very clearly something like “I understand the free-market arguments against this, and why in general we would want to use the market to take care of these sorts of problems, but this is a case where there is a likely market failure because of reasons X, Y, and Z. I recognize there is a burden of proof on someone saying something is a market failure, so I will now proceed to meet that burden of proof with a lot of statistics.”

People talk about dogmatic libertarians, but honestly this is all I ever wanted from anybody. Just an “oh, by the way, I have reasons for what I’m saying and they’re not just coming from a total failure to have ever grasped freshman economics.” I know it seems unfair to make people say it explicitly each time. But given the overwhelming number of people who say these things exactly because they never grasped freshman economics, it’s a welcome breath of fresh air.

I am sure if Warren ends up running for President, we will end up getting those ads where someone repeats “MOST LIBERAL SENATOR OF ALL TIME” on a black-and-white background, followed by saucy rumors that she once had a fling with Karl Marx.

But I for one intend not to believe them.

IX.

But aside from doing some legal work to solve the bankruptcy crisis, we need some science work as well. The question is: are good school districts really that important?

I can’t find great research on this at the school district level. The closest I can find is the teacher value-added research, which finds things like “At age 28, a 1 SD increase in teacher quality in a single grade raises annual earnings by about 1% on average”. I can’t find good data on how this adds up – for example, do twelve great teachers in a row increase earnings 12% (linear addition)? Do you need one great teacher to inspire you for life, and after that it doesn’t matter whether or not you have more (ie sublinear addition)? Or can multiple great teachers build on one another’s successes by not having to constantly go back and review things the students should’ve learned before (superlinear addition)?

I don’t think it matters, because it doesn’t look like there are very big value-added score differences between teachers at rich and poor schools.

What about district-level issues like superintendents? According to the Brookings Institution report, difference in school district competency explained only 1.1% of variance in student test scores. Difference in schools explained another 1.7%. Teachers explained 6.7%. The remaining 90.4% was explained by demographic factors (class, race, parents’ education level) and individual variation among students.

Teachers are kind of a crapshoot – as we saw before, going to a better school district doesn’t increase your chances of getting a good one much. So the sorts of things you can easily affect by choosing what school district to live in are 2.8% of your kid’s total variation.
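To make the 2.8% explicit, here is a quick tally of the variance shares quoted above. The labels are mine. The four numbers add up to 99.9 rather than 100, presumably because of rounding in the reported figures.

```python
# Share of variance in student test scores, in percent, as quoted above.
shares = {
    "district": 1.1,
    "school": 1.7,
    "teacher": 6.7,
    "demographics and individual variation": 90.4,
}

# The part a family can plausibly pick by choosing where to live.
choosable = round(shares["district"] + shares["school"], 1)

# Sanity check on the quoted figures.
total = round(sum(shares.values()), 1)

print(f"Affected by choice of district: {choosable}%")
print(f"All quoted shares together: {total}%")
```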

The research on preschool is so complicated it would take ten posts of this size to get through it. It seems strongly beneficial for low-income children and of controversial benefit for higher-income children. I will try to route around the controversy like so: home-schooled children do much better on every measure of academic achievement than school-schooled children. Preschool is basically teaching kids to share and playing fun games with them. If the alternative to sending your kid to preschool is that they stay home with you and you teach them to share and play fun games with them, you are home-preschooling your child and can expect them to do much better than school-preschooled children. And if the reason there’s no parent at home with the child is that both parents need to work in order to earn enough money to send the kids to a good preschool…well, that’s just a little bit circular.

So I think that in addition to various legal and policy changes, there needs to be more of a scientific effort to confirm (or disconfirm) these suspicions and, if they turn out to be true, publicize them to a society that clearly believes the opposite.

I know that talking about genetics and IQ too much makes people mad. And a lot of people have asked me – why do we have to do this? It’s going to offend a lot of people, and give a lot of unsavory people a lot of ammunition, so even if we shouldn’t ban research entirely, why not exercise the virtue of silence and let the whole thing stay in a few obscure journals?

And one of many answers to this is – suppose you see some school districts in rich neighborhoods, and all of the children in those schools can do calculus and read James Joyce and get great high-paying jobs. And next door is another school district, in a poor neighborhood, serving poor kids, and those kids are struggling.

If you’re not intimately familiar with behavioral genetics and IQ research, it is obvious that the rich-person school is much better and that’s why all the children of the rich people are doing so much better. And you will do anything, make any sacrifice, to get your kid into that rich person school, and so you work a back-breaking job and gamble your family’s financial security, all because you want your kid to have the same opportunities those rich kids do.

If you are intimately familiar with behavioral genetics and IQ research, a separate possible explanation leaps to mind: the rich people made their money by things like going to college, which means they probably have higher cognitive ability on average than the poor people, and cognitive ability is 50% genetic so they pass that on to their kids, and so it’s no surprise at all to see the rich person school having smarter students. That doesn’t prove that if your child switches from the poor person school to the rich person school, she will switch from average-poor-school-outcomes to average-rich-school-outcomes, and it doesn’t even provide any evidence whatsoever that it will make her do even a smidgeon better. So maybe you should, like, not sacrifice your life for it.

I’m not saying the behavioral-genetics-informed view is correct here. That’s going to require a lot more research. But I’m saying if you at least agree it’s something we’re allowed to talk about, maybe it will pan out and do nice things like save you from the horrible zero-sum competition destroying your country’s middle class.

Because if it could be confirmed that preschool attendance and expensive school districts had low impact – or even a merely moderate amount of impact – on success for middle- to high- income children, then even in the absence of legal changes that would relax the pressure on everyone to spend more money than they have to get into the best preschools and best school districts.

X.

Overall I recommend this book. I think the conclusion comes on a little too strong but that it sheds a lot of light on a lot of trends and throws important statistics at you such that you read them. Equally importantly, it sheds a lot of light – in a positive way! – on somebody who’s becoming an important national figure. The chapter about her meeting with Hillary Clinton and the subsequent break between the two of them seems likely to take on a lot more meaning in the years ahead.

What I really want is Elizabeth Warren vs. Rand Paul 2016. Imagine a Presidential race where both candidates have very different but very consistent philosophies, and you’d be pretty proud to see your country run by either. Wouldn’t that be a change?