Links 12/19

[Epistemic status: I haven’t independently verified each link. On average, commenters will end up spotting evidence that around two or three of the links in each links post are wrong or misleading. I correct these as I see them, but can’t guarantee I will have caught them all by the time you read this.]

You’ve probably seen the Russian city flag with the bear splitting the atom. But this is just the tip of the great Russian animal flags iceberg.

All archaeologists agree that the Roman artifacts dug up around Tucson, Arizona, are a hoax. Everyone agrees that there was no Roman colony of “Calalus” in North America that did battle with the Toltec Indians before finally being defeated in the 9th century AD. But who buried dozens of carefully-forged “crosses, swords, religious/ceremonial paraphernalia containing Hebrew and Latin inscriptions, pictures of temples, leaders portraits, angels, and a dinosaur inscribed on the lead blade of a sword” around Tucson during the 1920s to make it look like there was?

New study by growth mindset proponents finds an effect size of d = 0.11, highest in medium-achieving schools with challenge-supporting peer norms when the moon is in Scorpio. Even if true, it should cast doubt on previous studies, since 0.11 is neither a human-observable effect size nor commensurate with the much larger effects reported in earlier small-study growth mindset work.

Before genetic engineering, there was atomic gardening, the 1950s practice of planting seeds in a circle around a radiation source and hoping some of them picked up beneficial mutations. The process produced the Rio Star grapefruit, among other things.

Despite the apparent renewal in interest, only about 1% of Americans think the gap between rich and poor is the most important issue – although it looks like it’s hard to get people to agree on what is a major problem in general.

What is continuous AI takeoff? What would a discontinuity in AI takeoff mean?

One of the engineers who worked on the Viking Mars landers continues to make the case that they discovered life on Mars in the 1970s and everyone is just ignoring this for no reason.

Table Of Organic Compounds And Their Smells. Smell is…a lot more logically organized in a dimensional space than I thought. And in case you have the same question I do: “ethereal” = “smells like ether”.

You might already be following the Navy UFO thing: over the past few years, the Navy has encouraged its pilots to come forward with UFO accounts, signal-boosted the reports, and sponsored UFO research organizations, as if they’re trying to stoke interest for some reason. Now the plot gets weirder: a Navy scientist has filed a patent for a quantum superconductor antigravity drive capable of UFO-like feats of impossible aeronautics. When the Patent Office rejected it as outlandish, the Chief Technical Officer of naval aviation personally wrote the Patent Office saying it was totally possible and a matter of national security, after which the Patent Office relented and granted the patent. The patent thanks UFO researchers in the acknowledgements, includes a picture of a UFO recently sighted by Navy pilots, and does everything short of print in capital letters ‘THIS COMES FROM A UFO’. Scientists who were asked to comment say the proposed drive is “babble” and none of the supposed science checks out at all. Has the Navy fallen victim to conspiracy-peddlers, are they deliberately trying to stoke conspiracy theories for some reason, or what?

Related: Army Partners With Former Blink-182 Founder To Study Alien Technology.

Principles For The Application Of Human Intelligence: Can decision-making by human intelligences introduce bias? Can HI be racist? “Until…human debiasing techniques reach the efficiency of our regular auditing, review, and modification of algorithms, we should not implement these human decision systems.”

BMJ: Failing to complete a prescribed antibiotic course does not contribute to antibiotic resistance.

Along with all of WeWork’s many other red flags, did you know they used kabbalah in decision-making? I would add that the name “Adam Neumann” has kabbalistic implications all by itself, regardless of what decision-making procedures he uses.

A while ago I linked an article about a supposedly disastrous trial of genetically engineered mosquitos in Brazil. This was wrong. The media misunderstood the incident, blew it out of proportion, and it seems the journal may have screwed up the original paper somehow. In any case, the scientists who wrote the paper the whole thing was based around are so upset that they are asking for their own paper to be retracted.

California passes a law saying that freelance journalists may not write more than 35 stories per year, which many freelance journalists argue is not enough to survive on and would essentially destroy freelance journalism as a career option. The story seems to be that California wanted to ban Uber from classifying its drivers as freelancers, and the easiest way to do this was just to ban freelance work and carve out exceptions for any form of freelance work the state didn’t want to ban, and whoever was in charge of exception-making randomly chose the number “35” for freelance journalism. The lawmaker responsible has apologized to freelance journalists, but the cynical part of me isn’t sure what apology they can give beyond “we’re sorry our law ending people’s freedom to make contracts with flexible work schedules also affected popular people who can complain”. And if you think I sound angry, as always you should read @webdevmason’s takes (1, 2). Anyway, I think California journalists should feel lucky to be allowed 35 stories; most new housing in the state is limited to two.

Minced oaths: You probably knew “gosh” = “God” and “darn” = “damn”. But did you know “crikey” = “Christ kill me”, and maybe even “bloody” = “by our Lady”? As always, Aaron Smith-Teller takes it too far.

The Middle East is quickly becoming less religious.

Which occupations disproportionately support which Democratic presidential candidates? Mostly what you’d expect, if anything a little too on the nose. Mathematicians for Warren, talent agents for Harris, pizza delivery drivers for Yang, etc. Also, if you want to figure out who is “the candidate of the rich”, you will find all the data you need here.

Jason Crawford is taking his Roots Of Progress blog on the history of science full-time. Highly recommended – see eg his post on iron here. There’s also a subreddit.

Latest poll on how Americans view civility: 88% believe that “compromise and common ground should be the goal”, but 83% believe that “I’m tired of leaders compromising my values and ideals [and] want leaders who will stand up to the other side.”

A few weeks ago I posted about the bygone age when people used the Internet for endless arguments about atheism. If you’re sad you missed that era, good news! There’s still a little piece of it going strong over at

Pollution map: California wildfires vs. a totally normal day in China

Local Bay Area news: mass shooting at a party in an AirBnB house in Orinda, five dead. Orinda responds by banning AirBnB; AirBnB responds by banning parties. Seems to me like they’re just launching pointless attacks on coincidental features of this particular shooting instead of going after the real problem: houses.

Some pushback against Bryan Caplan’s Open Borders: Garrett Jones does an analysis where he shows that on Caplan’s own assumptions, average income of native-born US residents would fall by 40%, from $55,000 to $38,000. Caplan pushes back in a couple of ways. First, even under Jones’ assumptions, global GDP would almost double (because the natives being worse off is more than compensated by immigrants being better off). Second, a bunch of complicated statistical issues with Jones’ analysis. Third, pointing to South Africa, where the end of apartheid did not lower white incomes at all (!), showing that, even in multiracial countries where a richer race/class is outnumbered by a more politically powerful poorer race/class, this doesn’t seem to hurt the richer race/class (at least so far). There’s more at the link. See also the discussion of Open Borders at r/TheMotte.

Did you know: Brazil has more homicides than America + China + Russia + the EU + the rest of the Anglosphere combined?

Last month I linked the BernieBlindness subreddit so people could speculate whether weird media failures to include Bernie Sanders in lists of candidates were mistake or conspiracy. Here’s an even more impressive list of weird media failures to include Andrew Yang. Since I don’t think anyone feels especially threatened by Yang, I count this as strong evidence that the media is just too dumb to consistently remember who all the candidates are.

I can’t believe we’ve been rationalists for over a decade now and nobody proposed just doing a scientific study to see whether the Democrats or the Republicans are better. Apparent answer: when studied through careful causation-detecting economic techniques, having a state switch from Democratic to Republican control, or vice versa, has almost no effect on various outcomes of interest like unemployment, crime, or school attendance. This is true even when you limit it to the most extreme cases (state goes from unified Democratic control to unified Republican control and stays that way for many years). Not really sure what to think of this.

LW: autopsy of last year’s self-driving Uber crash. Hindsight is 20/20, and I usually hesitate to critique people smarter than I am who are attempting something insanely difficult – but this still seems completely inexcusable and shockingly incompetent.

Texas plane crash was gender reveal party gone wrong; this comes hot on the heels of gender reveal parties being linked to a pipe bomb death and alligator abuse.

A team including Joseph Henrich (author of Secret Of Our Success) publishes a giant paper making the case that Westerners’ psychological differences from the global norm (more individualist, more trusting, less bound by tradition) date back to kinship structures enforced by the early Catholic Church (many of you will have first heard this theory from Twitter user @hbdchick, who’s been using it to explain everything for the past half-decade). There’s been a big (and sometimes nasty) pushback from less-quantitatively-oriented historians; see The Scholar’s Stage for a great play-by-play and a spirited defense of Henrich.

I’ve always wondered how long it takes to make a really good painting; some seem so intricate that I imagine an artist working full-time for a year just to get it right. Turns out I am very off and a skilled artist can make impressive-looking paintings in a few hours.

The Libertarian Party of Kentucky ran a third-party candidate who split the vote and helped a Democrat get elected Kentucky governor this year; here is their statement on the results.

Last month, one of the world’s leading Napoleonic historians was rescued from an icy river, only to have relief turn to horror when he was discovered to be wearing a backpack full of severed human arms. Then things got weird.

Alexey Guzey spent 130 hours chronicling errors in the first chapter of Dr. Matthew Walker’s hit book Why We Sleep, and is suitably upset by it. It seems to be paying off with high-volume sites like Andrew Gelman and Hacker News taking note. No response yet from Walker, but I agree with Gelman’s suggestion that Joe Rogan (who helped popularize Walker) should invite Alexey on his show to talk about it. See also this comment on the subreddit critiquing some of Guzey’s points, with ensuing discussion – the one about not using correlation to infer causation in all-cause mortality stats is a very important point, here and always.

Mark Zuckerberg started Facebook in college and cemented a cultural association between young people and entrepreneurship. But according to the American Institute for Economic Research, this association is wrong: the average successful entrepreneur is 45 when they found their company, the youngest entrepreneurs are the least successful, and a 50-year-old’s company is almost twice as likely to succeed as a 30-year-old’s.

A few weeks ago I reviewed an NYT article on incentives; since then some real economists including David Henderson (Part 2 here) and Bryan Caplan have weighed in with their own thoughts.

If you’re wondering what socialists want, this article on How To Build Socialist Institutions gives a pretty good rundown of moderate socialist proposals (eg nationalize things that have successfully been nationalized in other countries and times, switch various things to co-ops).

Myths about WWI: contrary to the portrayal that officers sat in comfortable tents as they sent enlisted men to certain death, officers were about 50% more likely to die than ordinary soldiers.

From the best of new Less Wrong: Design Principles Of Biological Circuits. I was especially impressed by this passage: “The body uses an integral feedback mechanism to achieve robust exact adaptation of glucose levels, with the count of pancreatic beta cells serving as the state variable: when glucose is too low, the cells (slowly) die off, and when glucose is too high, the cells (slowly) proliferate…mutant cells which mismeasure the glucose concentration could proliferate and take over the tissue. One defense against this problem is for the beta cells to die when they measure very high glucose levels (instead of proliferating very quickly). This handles most mutations, but it also means that sufficiently high glucose levels can trigger an unstable feedback loop: beta cells die, which reduces insulin, which means higher glucose “price” and less glucose usage throughout the body, which pushes glucose levels even higher. That’s type-2 diabetes.” Any experts reading who can confirm if this is true?

“Do you think we’re prepared for the big reveal that the last century and a half of history has been orchestrated by an immortal Andrew Johnson with space-radiation-related superpowers and a grudge?”

New research paper claims that “deaths of despair” are caused by white people being angry at the loss of their white privilege. This should immediately prompt another round of “spot the statistical malpractice people are using to provide scientific cover for the dominant narrative”, but in this case Clay Routledge has already done our work: the paper is just a rehash of the finding that Trump did unusually well in areas hit by the opioid epidemic and deaths of despair. The paper uses Trump support as a proxy for racism, tries to adjust out a few confounders, declares the whole thing probably causal, and so reframes this as “racism must be causing deaths of despair”.

The Department of Homeland Security opened a fake university in Michigan. They convinced immigrant students (who had legitimate student visas) to enroll, used their openly-DHS persona to assure students the university was legitimate – then arrested those students for visa fraud for attending a fake university. DHS claims that since the university was fake (ie it had no real classes or professors), they were running a sting on immigrants willing to attend a fake university in order to keep their visas. But students say they weren’t told it was a fake university without classes or professors when they signed up, and some students who transferred out once they figured out it was fake were arrested anyway. I’ve been reading about efforts to abolish the DHS recently, and the people involved stress they don’t mean that nobody should ever enforce immigration laws. They mean that the DHS, specifically, as an organization, has a screwed-up culture, and that dissolving it and leaving immigration enforcement to various other departments the way it was before 2001 would work better. This university thing seems like Exhibit A.

Man wields narwhal tusk to thwart terrorist’s murder spree is now a thing that has happened.

Kurt Vonnegut’s ice-9 is science fiction, but the same process – a new crystalline form arising in a substance, spreading unstoppably, and destroying everything that relied on the old form – happened in real life to the AIDS drug ritonavir (tumblr post, paper).

Zero HP Lovecraft: God-Shaped Hole

Posted in Uncategorized | Tagged | 918 Comments

Open Thread 142

This is the bi-weekly visible open thread (there are also hidden open threads twice a week you can reach through the Open Thread tab on the top of the page). Post about anything you want, but please try to avoid hot-button political and social topics. You can also talk at the SSC subreddit or the SSC Discord server – and also check out the SSC Podcast. Also:

1. Okay, now the adversarial collaboration contest is actually over. Thanks to everyone who sent me their entries. I’m going to start posting them sometime over the next few weeks. If you haven’t sent it yet, I may still post yours if you can get it in before I’m done posting everyone else’s, so hurry up!

2. New mistake on the Mistakes Page. I had previously argued technological progress wasn’t slowing down; based on the work of Tyler Cowen and Ben Southwood I now think it is; my previous position was mistaken.

3. Some good responses to my posts on therapy from various therapists and therapy patients; until I get around to collecting them all in one place, I’ll just link @QiaochuYuan on Twitter.

Posted in Uncategorized | Tagged | 406 Comments

SSC Meetups Everywhere Retrospective

Slate Star Codex has regular weekly-to-monthly meetups in a bunch of cities around the world. Earlier this autumn, we held a Meetups Everywhere event, hoping to promote and expand these groups. We collected information on existing meetups, got volunteers to create new meetups in cities that didn’t have them already, and posted times and dates prominently on the blog.

During late September and early October, I traveled around the US to attend as many meetups as I could. I hoped my presence would draw more people; I also wanted to learn more about meetups and the community and how best to guide them. Buck Shlegeris and a few other Bay Area effective altruists came along to meet people, talk to them about effective altruism, and potentially nudge them into the recruiting pipeline for EA organizations.

Lots of people asked me how my trip was. In a word: exhausting. I got to meet a lot of people for about three minutes each. There were a lot of really fascinating people with knowledge of a bewildering variety of subjects, but I didn’t get to pick their minds anywhere near as thoroughly as I would have liked. I’m sorry if I talked to you for three minutes, you told me about some amazing project you were working on to clone neuroscientists or eradicate bees or convert atmospheric CO2 into vegan meat substitutes, and I mumbled something and walked away. You are all great and I wish I could have spent more time with you.

I finally got to put faces to many of the names I’ve interacted with through the years. For example, Bryan Caplan is exactly how you would expect, in every way. Also, in front of his office, he has a unique painting, which he apparently got by asking a Mexican street artist to paint an homage to Lord of the Rings. The artist had never heard of it before, but Bryan described it to him very enthusiastically, and the completely bonkers result is hanging in front of his office. This is probably a metaphor for something.

Philadelphia hosted their meetup in a beautiful room that looked like a Roman temple, and had miniature cheesesteaks for everybody. Chicago held theirs in a gym; appropriate, given this blog’s focus on BRUTE STRENGTH. Berkeley’s was in a group house with posters representing the Twelve Virtues Of Rationality hanging along the staircase. In Fairbanks, a person who had never read the blog showed up to get a story and an autograph for his brother who did. In New York, someone brought the best bread I have ever had, maybe the best bread anyone has ever had, I am so serious about this. In Boston, the organizers set up a prediction market to determine how many attendees they needed to plan for; they still ended up being off by a factor of two. This is also probably a metaphor for something. If only they had used more BRUTE STRENGTH!

Along the way, I got to see America. Most of it I saw from an airplane window, but I still saw it. In Portland, I ate from a makeshift food court formed by a bunch of really good food trucks congregating in the same empty lot; one of them just sold like a dozen different kinds of french fries. In Texas, I rode with an Uber driver whose day job is driving mechanical bulls to parties that need mechanical bulls, and who Ubers people around while he waits for the party to finish. In Washington DC, I tried to see the White House, only to be thwarted by the construction of a new security fence; they say that before you change the world you must change your own home, and it seems like our Wall-Builder-In-Chief takes this seriously. In Delaware, I stood on the spot where the Swedes first landed in America and declared it to be the colony of New Sweden; probably there are alternate timelines out there who could appreciate this more than I did. In New Jersey, I confirmed that the Pine Barrens are, in fact, really creepy.

People gave me things. You are all so nice, but you also seem to think I am about ten times more classy and fashionable than I really am. One person gave me a beautiful record of their audiobook – a real, honest-to-goodness vinyl record – as if I had any idea what to do with it. A reader in Philadelphia gave me a beautiful glossy magazine about Philadelphia culture, which I stared at intently for twenty minutes. Many people gave me beautifully-bound copies of my own work, which was so incredibly thoughtful that I feel bad that I will have to hide them in a closet so nobody sees them and thinks I am the kind of narcissist who makes beautifully-bound copies of my own work. The Charter Cities Institute people gave me a very nice Charter Cities Institute bag (although I assume that if I ever take it outside in Berkeley, someone will punch me and it will start a National Conversation). I am still really grateful to all of you.

But you already know how great you are. Let’s get to the statistics.

Mingyuan, the Official SSC Meetup Coordinator, sent out a survey to get information on the meetups we weren’t able to visit, and determined that we had somewhere between 81 and 111 meetups around the world. I’m sorry I can’t be more precise. 111 meetups were supposed to happen, 81 organizers reported back to Mingyuan that their meetups happened, and I’m not sure what happened to the other 30. Although most activity was concentrated in the Anglosphere, there were meetups as far away as Bangalore (9 people), Tel Aviv (25 people), Oslo (9 people), and Seoul (4 people). Medellin, Colombia reports a one-person meetup; I am sorry it sounds like you did not have a good time. Montreal, Canada, reports a zero-person meetup, which sounds very computer-sciency, kind of like a heap of zero grains of sand.

Here’s the histogram of attendance, binned by fives. About twenty meetups had 0-5 people, thirty had 5-10, and the remaining thirty had more than 10. The best-attended meetups were Boston (140), NYC (120), and Berkeley (105). Total meetup attendance around the world was almost 1500 people!

Did the event fulfill its goal of bringing more people to meetups? Many organizers had only a vague idea how many people usually attended their meetups, and many said their city didn’t have a usual meetup group at all. But as best I can tell, about 2.3x as many people attended each city’s Meetups Everywhere event as attended its average previous meetup. Breaking it down by tour status, meetups on my tour had much higher attendance (6.1x usual), but even meetups off my tour had somewhat higher attendance (1.6x usual).

Did the event succeed in bringing some people into meetup groups who might stay around later? I suggested meetup organizers bring a signup sheet that people could sign to get on a mailing list for future meetups. My data on this is sparse, because people took the survey question overly literally and wrote things like “I didn’t have a signup sheet, I just asked people for their emails” and then didn’t tell me how many emails they got. But for the 40 meetups where I have data, new signups averaged 77% of previous regular attendance; that is, a meetup group that usually drew 100 people gained 77 new names on its mailing list. Breaking it down by tour status, meetups on my tour gained 170%, other meetups gained 58%.

This seems implausibly large; did one event nearly double the attendance of SSC meetup groups around the world? I don’t know how many people who signed up for the mailing list will really start attending regularly. But I will probably survey the organizers again next year, and they might be able to help me figure out how many people stayed around.

In total, 1,476 people attended SSC meetups, and 339 people added their name to mailing lists (the ratio here doesn’t match the previous numbers because most organizers didn’t have a mailing list or didn’t report mailing list data, and the ratios above only counted those who did).
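For the curious, the summary statistics above can be sketched as a short script. The per-meetup rows below are hypothetical (only Boston’s and NYC’s headline attendance figures appear in this post; the usual-attendance and signup numbers are made up for illustration, since the underlying survey data isn’t public):

```python
# Hypothetical survey rows: (city, attendance, usual_attendance, new_signups).
# None marks a meetup that didn't report that figure.
rows = [
    ("Boston", 140, 60, 45),   # attendance figure from the post; rest invented
    ("NYC", 120, 70, 30),      # attendance figure from the post; rest invented
    ("Oslo", 9, 6, 4),         # entirely invented
    ("Seoul", 4, None, None),  # entirely invented; reported attendance only
]

# Total attendance counts every meetup, whether or not it reported baselines.
total_attendance = sum(att for _, att, _, _ in rows)

# Attendance boost: event turnout vs. each group's usual turnout,
# averaged only over meetups that reported a usual figure.
boosts = [att / usual for _, att, usual, _ in rows if usual]
avg_boost = sum(boosts) / len(boosts)

# Signup gain: new mailing-list signups as a fraction of usual attendance,
# again restricted to meetups that reported both numbers.
gains = [s / usual for _, _, usual, s in rows if usual and s is not None]
avg_gain = sum(gains) / len(gains)

print(total_attendance, round(avg_boost, 2), round(avg_gain, 2))  # 273 1.85 0.62
```

This also shows why the headline totals (1,476 attendees, 339 signups) need not match the averaged ratios: the ratios are computed only over meetups that reported the relevant baseline, while the totals count everyone.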

So much for the numbers. What did I learn?

I don’t want to generalize too much – I deliberately went to the biggest meetups, and things that work for a group of 100 people might not apply to a group of 2 people. So take all of this with a grain of salt, but:

1. Tables and chairs kill big meetups. Some people tried to hold meetups at a restaurant or a park with picnic tables or something. Everyone would sit down at the table, talk to the 3-4 people in their immediate neighborhood, and that would be that. Eventually I figured out that I need to force everyone out of the picnic tables and into the rest of the park. This caused a phase shift from solid to gas, with people milling about, talking to everyone, finding the conversations that most interested them.

2. The welcomeness sentence is really important. In the meetup descriptions on the blog, I included a sentence like “Please feel free to come even if you feel awkward about it, even if you’re not ‘the typical SSC reader’, even if you’re worried people won’t like you, etc.” It sounds silly, but I had so many people come up to me saying the only reason they came was because of that sentence. It happened again and again and again. Anybody planning any kind of meetup about anything should strongly consider including a sentence like that (as long as it’s true). Maybe there are other simple hacks like this waiting to be discovered.

3. Group houses are important community nuclei. Obvious in retrospect, but it was pretty stark seeing the level of community in cities that did have rationalist group houses vs. the ones that didn’t, even if there were good meetup groups in both. This also came out in listening to some people mourn the loss of the main group house in their city and talk about all the great things they were no longer able to do.

I was thinking of this last one because a lot of the meetups felt kind of superficial. Everyone shows up, talks about their favorite SSC post or what their job is or what kind of interesting thing they read recently, and then they go home. Lots of people seemed to enjoy that, I enjoyed it, but seeing the kind of really great rationalist communities in the Bay Area or Seattle gave me a sense that more is possible. I don’t know, maybe it’s not possible in cities with only 10 or 20 interested people; maybe only places like the Bay Area and Seattle have enough people, and everywhere it’s possible it’s already happening. But group houses seem to be a big part of it.

I was also struck by the number of female meetup organizers; the female:male ratio on the meetup organizer survey is almost twice that on the SSC survey in general. When there were cities that didn’t have regular meetup groups, and I asked for a volunteer to set one up, it was usually a woman who volunteered.

This suggests to me that we’re not just performing at some kind of theoretical maximum for the number of people and interest in a given community; there’s a shortage of something (speculatively, social initiative) that (in this community) women are better than men at. I don’t know how to solve this (though integrating more with the EA community, which has more women, might help), but I think it’s an interesting problem.

And Buck has written his own retrospective of his EA work at the meetups here.

Posted in Uncategorized | Tagged | 116 Comments

Mental Mountains


Kaj Sotala has an outstanding review of Unlocking The Emotional Brain; I read the book, and Kaj’s review is better.

He begins:

UtEB’s premise is that much if not most of our behavior is driven by emotional learning. Intense emotions generate unconscious predictive models of how the world functions and what caused those emotions to occur. The brain then uses those models to guide our future behavior. Emotional issues and seemingly irrational behaviors are generated from implicit world-models (schemas) which have been formed in response to various external challenges. Each schema contains memories relating to times when the challenge has been encountered and mental structures describing both the problem and a solution to it.

So in one of the book’s example cases, a man named Richard sought help for trouble speaking up at work. He would have good ideas during meetings, but felt inexplicably afraid to voice them. During therapy, he described his narcissistic father, who was always mouthing off about everything. Everyone hated his father for being a fool who wouldn’t shut up. The therapist conjectured that young Richard observed this and formed a predictive model, something like “talking makes people hate you”. This was overly general: talking only makes people hate you if you talk incessantly about really stupid things. But when you’re a kid you don’t have much data, so you end up generalizing a lot from the few examples you have.

When Richard started therapy, he didn’t consciously understand any of this. He just felt emotions (anxiety) at the thought of voicing his opinion. The predictive model output the anxiety, using reasoning like “if you talk, people will hate you, and the prospect of being hated should make you anxious – therefore, anxiety”, but not any of the intermediate steps. The therapist helped Richard tease out the underlying model, and at the end of the session Richard agreed that his symptoms were related to his experience of his father. But knowing this changed nothing; Richard felt as anxious as ever.

Predictions like “speaking up leads to being hated” are special kinds of emotional memory. You can rationally understand that the prediction is no longer useful, but that doesn’t really help; the emotional memory is still there, guiding your unconscious predictions. What should the therapist do?

Here UtEB dives into the science on memory reconsolidation.

Scientists have known for a while that giving rats the protein synthesis inhibitor anisomycin prevents them from forming emotional memories. You can usually give a rat noise-phobia by pairing a certain noise with electric shocks, but this doesn’t work if the rats are on anisomycin first. Probably this means that some kind of protein synthesis is involved in memory. So far, so plausible.

A 2000 study found that anisomycin could also erase existing phobias in a very specific situation. You had to “activate” the phobia – get the rats thinking about it really hard, maybe by playing the scary noise all the time – and then give them the anisomycin. This suggested that when the memory got activated, it somehow “came loose”, and the brain needed to do some protein synthesis to put it back together again.

Thus the idea of memory reconsolidation: you form a consolidated memory, but every time you activate it, you need to reconsolidate it. If the reconsolidation fails, you lose the memory, or you get a slightly different memory, or something like that. If you could disrupt emotional memories like “speaking out makes you hated” while they’re still reconsolidating, maybe you could do something about this.
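The consolidate/activate/reconsolidate cycle can be sketched as a tiny state machine. This is purely illustrative – the class and names are invented for this post, not anything from the neuroscience literature – but it captures the logic: a memory only becomes editable (or erasable) during the window after activation.

```python
# Toy state machine for memory reconsolidation (illustrative only, not a
# neuroscience model). A consolidated memory becomes labile when activated;
# while labile it can be rewritten or lost, and it re-stabilizes when
# reconsolidation completes.

class EmotionalMemory:
    def __init__(self, content):
        self.content = content        # e.g. "talking makes people hate you"
        self.state = "consolidated"

    def activate(self):
        # Recalling the memory makes it temporarily labile.
        self.state = "labile"

    def reconsolidate(self, protein_synthesis_ok=True, new_content=None):
        if self.state != "labile":
            return  # memory wasn't activated, so nothing changes
        if not protein_synthesis_ok:
            # e.g. anisomycin: reconsolidation fails, the memory is lost
            self.content = None
        elif new_content is not None:
            # a contradictory experience held in mind during the window
            # can rewrite the memory instead of merely coexisting with it
            self.content = new_content
        self.state = "consolidated"

m = EmotionalMemory("talking makes people hate you")
m.activate()
m.reconsolidate(new_content="talking is usually fine")
print(m.content)  # -> talking is usually fine
```

The key structural point is in the first `if`: trying to edit a memory that was never activated is a no-op, which is why purely rational counterargument (delivered while the schema is dormant) changes nothing.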

Anisomycin is pretty toxic, so that’s out. Other protein synthesis inhibitors are also toxic – it turns out proteins are kind of important for life – so they’re out too. Electroconvulsive therapy actually seems to work pretty well for this – the shock disrupts protein formation very effectively (and the more I think about this, the more implications it seems to have). But we can’t do ECT on everybody who wants to be able to speak up at work more, so that’s also out. And the simplest solution – activating a memory and then reminding the patient that they don’t rationally believe it’s true – doesn’t seem to help; the emotional brain doesn’t speak Rationalese.

The authors of UtEB claim to have found a therapy-based method that works, which goes like this:

First, they tease out the exact predictive model and emotional memory behind the symptom (in Richard’s case, the narrative where his father talked too much and ended up universally hated, and so if Richard talks at all, he too will be universally hated). Then they try to get this as far into conscious awareness as possible (or, if you prefer, have consciousness dig as deep into the emotional schema as possible). They call this “the pro-symptom position” – giving the symptom as much room as possible to state its case without rejecting it. So for example, Richard’s therapist tried to get Richard to explain his unconscious pro-symptom reasoning as convincingly as possible: “My father was really into talking, and everybody hated him. This proves that if I speak up at work, people will hate me too.” She even asked Richard to put this statement on an index card, review it every day, and bask in its compellingness. She asked Richard to imagine getting up to speak, and feeling exactly how anxious it made him, while reviewing to himself that the anxiety felt justified given what happened with his father. The goal was to establish a wide, well-trod road from consciousness to the emotional memory.

Next, they try to find a lived and felt experience that contradicts the model. Again, Rationalese doesn’t work; the emotional brain will just ignore it. But it will listen to experiences. For Richard, this was a time when he was at a meeting, had a great idea, but didn’t speak up. A coworker had the same idea, mentioned it, and everyone agreed it was great, and congratulated the other person for having such an amazing idea that would transform their business. Again, there’s the same process of getting as deeply into that moment as possible, bringing the relevant feelings back again and again, creating as wide and smooth a road from consciousness to the experience as possible.

Finally, the therapist activates the disruptive emotional schema, and before it can reconsolidate, smashes it into the new experience. So Richard’s therapist makes use of the big wide road Richard built that let him fully experience his fear of speaking up, and asks Richard to get into that frame of mind (activate the fear-of-speaking schema). Then she asks him, while keeping the fear-of-speaking schema in mind, to remember the contradictory experience (coworker speaks up and is praised). Then the therapist vividly describes the juxtaposition while Richard tries to hold both in his mind at once.

And then Richard was instantly cured, and never had any problems speaking up at work again. His coworkers all applauded, and became psychotherapists that very day. An eagle named “Psychodynamic Approach” flew into the clinic and perched atop the APA logo and shed a single tear. Coherence Therapy: Practice Manual And Training Guide was read several times, and God Himself showed up and enacted PsyD prescribing across the country. All the cognitive-behavioralists died of schizophrenia and were thrown in the lake of fire for all eternity.

This is, after all, a therapy book.


I like UtEB because it reframes historical/purposeful accounts of symptoms as aspects of a predictive model. We already know the brain has an unconscious predictive model that it uses to figure out how to respond to various situations and which actions have which consequences. In retrospect, this framing perfectly fits the idea of traumatic experiences having outsized effects. Tack on a bit about how the model is more easily updated in childhood (because you’ve seen fewer other things, so your priors are weaker), and you’ve gone a lot of the way to traditional models of therapy.

But I also like it because it helps me think about the idea of separation/noncoherence in the brain. Richard had his schema about how speaking up makes people hate you. He also had lots of evidence that this wasn’t true, both rationally (his understanding that his symptoms were counterproductive) and experientially (his story about a coworker proposing an idea and being accepted). But the evidence failed to naturally propagate; it didn’t connect to the schema that it should have updated. Only after the therapist forced the connection did the information go through. Again, all of this should have been obvious – of course evidence doesn’t propagate through the brain, I was writing posts ten years ago about how even a person who knows ghosts don’t exist will be afraid to stay in an old supposedly-haunted mansion at night with the lights off. But UtEB’s framework helps snap some of this into place.

UtEB’s brain is a mountainous landscape, with fertile valleys separated by towering peaks. Some memories (or pieces of your predictive model, or whatever) live in each valley. But they can’t talk to each other. The passes are narrow and treacherous. They go on believing their own thing, unconstrained by conclusions reached elsewhere.

Consciousness is a capital city on a wide plain. When it needs the information stored in a particular valley, it sends messengers over the passes. These messengers are good enough, but they carry letters, not weighty tomes. Their bandwidth is atrocious; often they can only convey what the valley-dwellers think, and not why. And if a valley gets something wrong, lapses into heresy, as often as not the messengers can’t bring the kind of information that might change their mind.

Links between the capital and the valleys may be tenuous, but valley-to-valley trade is almost non-existent. You can have two valleys full of people working on the same problem, for years, and they will basically never talk.

Sometimes, when it’s very important, the king can order a road built. The passes get cleared out, high-bandwidth communication to a particular valley becomes possible. If he does this to two valleys at once, then they may even be able to share notes, each passing through the capital to reach the other. But it isn’t the norm. You have to really be trying.

This ended up a little more flowery than I expected, but I didn’t start thinking this way because it was poetic. I started thinking this way because of this:

Frequent SSC readers will recognize this as from Figure 1 of Friston and Carhart-Harris’ REBUS And The Anarchic Brain: Toward A Unified Model Of The Brain Action Of Psychedelics, which I review here. The paper describes it as “the curvature of the free-energy landscape that contains neuronal dynamics. Effectively, this can be thought of as a flattening of local minima, enabling neuronal dynamics to escape their basins of attraction and—when in flat minima—express long-range correlations and desynchronized activity.”

Moving back a step: the paper is trying to explain what psychedelics do to the brain. It theorizes that they weaken high-level priors (in this case, you can think of these as the tendency to fit everything to an existing narrative), allowing things to be seen more as they are:

A corollary of relaxing high-level priors or beliefs under psychedelics is that ascending prediction errors from lower levels of the system (that are ordinarily unable to update beliefs due to the top-down suppressive influence of heavily-weighted priors) can find freer register in conscious experience, by reaching and impressing on higher levels of the hierarchy. In this work, we propose that this straightforward model can account for the full breadth of subjective phenomena associated with the psychedelic experience.

These ascending prediction errors (ie noticing that you’re wrong about something) can then correct the high-level priors (ie change the narratives you tell about your life):

The ideal result of the process of belief relaxation and revision is a recalibration of the relevant beliefs so that they may better align or harmonize with other levels of the system and with bottom-up information—whether originating from within (e.g., via lower-level intrinsic systems and related interoception) or, at lower doses, outside the individual (i.e., via sensory input or extroception). Such functional harmony or realignment may look like a system better able to guide thought and behavior in an open, unguarded way (Watts et al., 2017; Carhart-Harris et al., 2018b).

This makes psychedelics a potent tool for psychotherapy:

Consistent with the model presented in this work, overweighted high-level priors can be all consuming, exerting excessive influence throughout the mind and brain’s (deep) hierarchy. The negative cognitive bias in depression is a good example of this (Beck, 1972), as are fixed delusions in psychosis (Sterzer et al., 2018).25 In this paper, we propose that psychedelics can be therapeutically effective, precisely because they target the high levels of the brain’s functional hierarchy, primarily affecting the precision weighting of high-level priors or beliefs. More specifically, we propose that psychedelics dose-dependently relax the precision weighting of high-level priors (instantiated by high-level cortex), and in so doing, open them up to an upsurge of previously suppressed bottom-up signaling (e.g., stemming from limbic circuitry). We further propose that this sensitization of high-level priors means that more information can impress on them, potentially inspiring shifts in perspective, felt as insight. One might ask whether relaxation followed by revision of high-level priors or beliefs via psychedelic therapy is easy to see with functional (and anatomic) brain imaging. We presume that it must be detectable, if the right questions are asked in the right way.

Am I imagining this, or are Friston + Carhart-Harris and Unlocking The Emotional Brain getting at the same thing?

Both start with a piece of a predictive model (= high-level prior) telling you something that doesn’t fit the current situation. Both also assume you have enough evidence to convince a rational person that the high-level prior is wrong, or doesn’t apply. But you don’t automatically smash the prior and the evidence together and perform an update. In UtEB’s model, the update doesn’t happen until you forge conscious links to both pieces of information and try to hold them in consciousness at the same time. In F+CH’s model, the update doesn’t happen until you take psychedelics which make the high-level prior lose some of its convincingness. UtEB is trying to laboriously build roads through mountains; F+CH are trying to cast a magic spell that makes the mountains temporarily vanish. Either way, you get communication between areas that couldn’t communicate before.
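The “precision weighting” language has a simple Bayesian core. For two Gaussian sources of information, the posterior mean is a precision-weighted average of prior and evidence, so a prior held with enormous precision is nearly immune to data; relax its precision, and the same evidence suddenly counts. A toy sketch (the numbers and the “danger level” encoding are invented for illustration):

```python
# Precision-weighted Gaussian update (standard Bayesian result, toy numbers).
# posterior_mean = (tau_prior * mu_prior + tau_evidence * x) / (tau_prior + tau_evidence)
# where tau = 1 / variance is the precision of each source.

def posterior_mean(mu_prior, tau_prior, x, tau_evidence):
    return (tau_prior * mu_prior + tau_evidence * x) / (tau_prior + tau_evidence)

# Prior: "speaking up is dangerous", encoded as danger level 1.0.
# Evidence (the coworker being praised) says the danger is ~0.0.
rigid   = posterior_mean(mu_prior=1.0, tau_prior=100.0, x=0.0, tau_evidence=1.0)
relaxed = posterior_mean(mu_prior=1.0, tau_prior=1.0,   x=0.0, tau_evidence=1.0)

print(round(rigid, 3))    # ~0.99: the evidence barely moves the belief
print(round(relaxed, 3))  # 0.5: with the prior's precision relaxed, it counts
```

On this reading, UtEB’s juxtaposition technique and psychedelics are two ways of turning `tau_prior` down long enough for `x` to register.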


Why would mental mountains exist? If we keep trying to get rid of them, through therapy or psychedelics, or whatever, then why not just avoid them in the first place?

Maybe generalization is just hard (thanks to MC for this idea). Suppose Goofus is mean to you. You learn Goofus is mean; if this is your first social experience, maybe you also learn that the world is mean and people have it out for you. Then one day you meet Gallant, who is nice to you. Hopefully the system generalizes to “Gallant is nice, Goofus is still mean, people in general can go either way”.

But suppose one time Gallant is just having a terrible day, and curses at you, and that time he happens to be wearing a red shirt. You don’t want to overfit and conclude “Gallant wearing a red shirt is mean, Gallant wearing a blue shirt is nice”. You want to conclude “Gallant is generally nice, but sometimes slips and is mean.”

But any algorithm that gets too good at resisting the temptation to separate out red-shirt-Gallant and blue-shirt-Gallant risks falling into the opposite failure mode where it doesn’t separate out Gallant and Goofus. It would just average them out, and conclude that people (including both Goofus and Gallant) are medium-niceness.

And suppose Gallant has brown eyes, and Goofus green eyes. You don’t want your algorithm to overgeneralize to “all brown-eyed people are nice, and all green-eyed people are mean”. But suppose the Huns attack you. You do want to generalize to “All Huns are dangerous, even though I can keep treating non-Huns as generally safe”. And you want to do this as quickly as possible, definitely before you meet any more Huns. And the quicker you are to generalize about Huns, the more likely you are to attribute false significance to Gallant’s eye color.
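One way to see the tradeoff concretely: a predictor that conditions on too many features overfits (red-shirt-Gallant vs. blue-shirt-Gallant become separate people), while one that conditions on too few just averages Goofus and Gallant together. A toy sketch with invented data:

```python
# Toy illustration of the generalization tradeoff: predicting "niceness"
# from past encounters, grouping them by different feature sets.
from collections import defaultdict

def make_predictor(history, features):
    """Average observed niceness over past encounters, grouped by `features`."""
    groups = defaultdict(list)
    for obs in history:
        key = tuple(obs[f] for f in features)
        groups[key].append(obs["nice"])
    def predict(obs):
        vals = groups.get(tuple(obs[f] for f in features))
        return sum(vals) / len(vals) if vals else 0.5  # no data: shrug
    return predict

history = [
    {"name": "Goofus",  "shirt": "blue", "nice": 0},
    {"name": "Gallant", "shirt": "blue", "nice": 1},
    {"name": "Gallant", "shirt": "red",  "nice": 0},  # Gallant's bad day
]

too_coarse = make_predictor(history, features=[])                 # pools everyone
too_fine   = make_predictor(history, features=["name", "shirt"])  # splits on shirt
just_right = make_predictor(history, features=["name"])

query = {"name": "Gallant", "shirt": "red"}
print(too_coarse(query))  # ~0.33: Goofus and Gallant averaged together
print(too_fine(query))    # 0.0: one bad day defines red-shirt-Gallant forever
print(just_right(query))  # 0.5: Gallant is sometimes nice, sometimes not
```

The hard part, which this sketch dodges entirely, is that the brain has to learn *which* features to condition on (name yes, shirt color no, Hun-ness very much yes) from a handful of examples.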

The end result is a predictive model which is a giant mess, made up of constant “This space here generalizes from this example, except this subregion, which generalizes from this other example, except over here, where it doesn’t, and definitely don’t ever try to apply any of those examples over here.” Somehow this all works shockingly well. For example, I spent a few years in Japan, and developed a good model for how to behave in Japanese culture. When I came back to the United States, I effortlessly dropped all of that and went back to having America-appropriate predictions and reflexive actions (except for an embarrassing habit of bowing whenever someone hands me an object, which I still haven’t totally eradicated).

In this model, mental mountains are just the context-dependence that tells me not to use my Japanese predictive model in America, and which prevents evidence that makes me update my Japanese model (like “I notice subways are always on time”) from contaminating my American model as well. Or which prevents things I learn about Gallant (like “always trust him”) from also contaminating my model of Goofus.

There’s actually a real-world equivalent of the “red-shirt-Gallant is bad, blue-shirt-Gallant is good” failure mode. It’s called “splitting”, and you can find it in any psychology textbook. Wikipedia defines it as “the failure in a person’s thinking to bring together the dichotomy of both positive and negative qualities of the self and others into a cohesive, realistic whole.”

In the classic example, a patient is in a mental hospital. He likes his doctor. He praises the doctor to all the other patients, says he’s going to nominate her for an award when he gets out.

Then the doctor offends the patient in some way – maybe refuses one of his requests. All of a sudden, the doctor is abusive, worse than Hitler, worse than Mengele. When he gets out he will report her to the authorities and sue her for everything she owns.

Then the doctor does something right, and it’s back to praise and love again.

The patient has failed to integrate his judgments about the doctor into a coherent whole, “doctor who sometimes does good things but other times does bad things”. It’s as if there’s two predictive models, one of Good Doctor and one of Bad Doctor, and even though both of them refer to the same real-world person, the patient can only use one at a time.

Splitting is most common in borderline personality disorder. The DSM criteria for borderline include splitting (there defined as “a pattern of unstable and intense interpersonal relationships characterized by alternating between extremes of idealization and devaluation”). They also include things like “markedly and persistently unstable self-image or sense of self”, and “affective instability due to a marked reactivity of mood”, which seem relevant here too.

Some therapists view borderline as a disorder of integration. Nobody is great at having all their different schemas talk to each other, but borderlines are atrocious at it. Their mountains are so high that even different thoughts about the same doctor can’t necessarily talk to each other and coordinate on a coherent position. The capital only has enough messengers to talk to one valley at a time. If tribesmen from the Anger Valley are advising the capital today, the patient becomes truly angry, a kind of anger that utterly refuses to listen to any counterevidence, an anger pure beyond your imagination. If they are happy, they are purely happy, and so on.

About 70% of people diagnosed with dissociative identity disorder (previously known as multiple personality disorder) have borderline personality disorder. The numbers are so high that some researchers are not even convinced that these are two different conditions; maybe DID is just one manifestation of borderline, or especially severe borderline. Considering borderline as a failure of integration, this makes sense; DID is total failure of integration. People in the furthest mountain valleys, frustrated by inability to communicate meaningfully with the capital, secede and set up their own alternative provincial government, pulling nearby valleys into their new coalition. I don’t want to overemphasize this; most popular perceptions of DID are overblown, and at least some cases seem to be at least partly iatrogenic. But if you are bad enough at integrating yourself, it seems to be the sort of thing that can happen.

In his review, Kaj relates this to Internal Family Systems, a weird form of therapy where you imagine your feelings as people/entities and have discussions with them. I’ve always been skeptical of this, because feelings are not, in fact, people/entities, and it’s unclear why you should expect them to answer you when you ask them questions. And in my attempts to self-test the therapy, indeed nobody responded to my questions and I was left feeling kind of silly. But Kaj says:

As many readers know, I have been writing a sequence of posts on multi-agent models of mind. In Building up to an Internal Family Systems model, I suggested that the human mind might contain something like subagents which try to ensure that past catastrophes do not repeat. In subagents, coherence, and akrasia in humans, I suggested that behaviors such as procrastination, indecision, and seemingly inconsistent behavior result from different subagents having disagreements over what to do.

As I already mentioned, my post on integrating disagreeing subagents took the model in the direction of interpreting disagreeing subagents as conflicting beliefs or models within a person’s brain. Subagents, trauma and rationality further suggested that the appearance of drastically different personalities within a single person might result from unintegrated memory networks, which resist integration due to various traumatic experiences.

This post has discussed UtEB’s model of conflicting emotional schemas in a way which further equates “subagents” with beliefs – in this case, the various schemas seem closely related to what e.g. Internal Family Systems calls “parts”. In many situations, it is probably fair to say that this is what subagents are.

This is a model I can get behind. My guess is that in different people, the degree to which mental mountains form a barrier will cause the disconnectedness of valleys to manifest as anything from “multiple personalities”, to IFS-findable “subagents”, to UtEB-style psychiatric symptoms, to “ordinary” beliefs that don’t cause overt problems but might not be very consistent with each other.


This last category forms the crucial problem of rationality.

One can imagine an alien species whose ability to find truth was a simple function of their education and IQ. Everyone who knows the right facts about the economy and is smart enough to put them together will agree on economic policy.

But we don’t work that way. Smart, well-educated people believe all kinds of things, even when they should know better. We call these people biased, a catch-all term meaning something that prevents them from having true beliefs they ought to be able to figure out. I believe most people who don’t believe in anthropogenic climate change are probably biased. Many of them are very smart. Many of them have read a lot on the subject (empirically, reading more about climate change will usually just make everyone more convinced of their current position, whatever it is). Many of them have enough evidence that they should know better. But they don’t.

(again, this is my opinion, sorry to those of you I’m offending. I’m sure you think the same of me. Please bear with me for the space of this example.)

Compare this to Richard, the example patient mentioned above. Richard had enough evidence to realize that companies don’t hate everyone who speaks up at meetings. But he still felt, on a deep level, like speaking up at meetings would get him in trouble. The evidence failed to connect to the emotional schema, the part of him that made the real decisions. Is this the same problem as the global warming case? Where there’s evidence, but it doesn’t connect to people’s real feelings?

(maybe not: Richard might be able to say “I know people won’t hate me for speaking, but for some reason I can’t make myself speak”, whereas I’ve never heard someone say “I know climate change is real, but for some reason I can’t make myself vote to prevent it.” I’m not sure how seriously to take this discrepancy.)

In Crisis of Faith, Eliezer Yudkowsky writes:

Many in this world retain beliefs whose flaws a ten-year-old could point out, if that ten-year-old were hearing the beliefs for the first time. These are not subtle errors we’re talking about. They would be child’s play for an unattached mind to relinquish, if the skepticism of a ten-year-old were applied without evasion…we change our minds less often than we think.

This should scare you down to the marrow of your bones. It means you can be a world-class scientist and conversant with Bayesian mathematics and still fail to reject a belief whose absurdity a fresh-eyed ten-year-old could see. It shows the invincible defensive position which a belief can create for itself, if it has long festered in your mind.

What does it take to defeat an error that has built itself a fortress?

He goes on to describe how hard this is, to discuss the “convulsive, wrenching effort to be rational” that he thinks this requires, the “all-out [war] against yourself”. Some of the techniques he mentions explicitly come from psychotherapy, others seem to share a convergent evolution with it.

The authors of UtEB stress that all forms of therapy involve their process of reconsolidating emotional memories one way or another, whether they know it or not. Eliezer’s work on crisis of faith feels like an ad hoc form of epistemic therapy, one with a similar goal.

Here, too, there is a suggestive psychedelic connection. I can’t count how many stories I’ve heard along the lines of “I was in a bad relationship, I kept telling myself that it was okay and making excuses, and then I took LSD and realized that it obviously wasn’t, and got out.” Certainly many people change religions and politics after a psychedelic experience, though it’s hard to tell exactly what part of the psychedelic experience does this, and enough people end up believing various forms of woo that I hesitate to say it’s all about getting more rational beliefs. But just going off anecdote, this sometimes works.

Rationalists wasted years worrying about various named biases, like the conjunction fallacy or the planning fallacy. But most of the problems we really care about aren’t any of those. They’re more like whatever makes the global warming skeptic fail to connect with all the evidence for global warming.

If the model in Unlocking The Emotional Brain is accurate, it offers a starting point for understanding this kind of bias, and maybe for figuring out ways to counteract it.

Book Review: All Therapy Books

[Related: CBT In The Water Supply, Scientific Freud, Book Review: Method Of Levels, Different Worlds]


All therapy books start with a claim that their form of therapy will change everything. Previous forms of therapy have required years or even decades to produce ambiguous results. Our form of therapy can produce total transformation in five to ten sessions! Previous forms of therapy have only helped ameliorate the stress of symptoms. Our form of therapy destroys symptoms at the root!

All therapy books bring up the Dodo Bird Verdict – the observation, confirmed in study after study, that all psychotherapies are about equally good, and the only things that matter are “nonspecific factors” like how much patients like their therapist. Some people might think this suggests our form of therapy will only be about as good as other forms. This, all therapy books agree, would be a foolish and perverse interpretation of these findings. The correct interpretation is that all previous forms of therapy must be equally wrong. The only reason they ever produce good results at all is because sometimes therapists accidentally stumble into using our form of therapy, without even knowing it. Since every form of therapy is about equally likely to stumble into using our form of therapy, every other form is equally good. But now that our form of therapy has been formalized and written up, there is no longer any need to stumble blindly! Everyone can just use our form of therapy all the time, for everything! Nobody has ever done a study of our form of therapy. But when they do, it’s going to be amazing! Nobody has even invented numbers high enough to express how big the effect size of our form of therapy is going to be!

Consider the case of Bob. Bob had some standard-issue psychological problem. He had been in and out of therapy for years, tried dozens of different medications, none of them had helped at all. Then he decided to try our form of therapy. In his first session, the therapist asked him “Have you ever considered that your problems might be because of [the kind of thing our form of therapy says all problems are because of]?” Bob started laughing and crying simultaneously, eventually breaking into a convulsive fit. After three minutes, he recovered and proceeded to tell a story of how [everything in his life was exactly in accordance with our form of therapy’s predictions] and he had always reacted by [doing exactly the kind of thing our form of therapy predicts that he would]. Now that all of this was out in consciousness, he no longer felt any desire to have psychological problems. In a followup session two weeks later, the therapist confirmed that he no longer had any psychological problems, and had become the CEO of a Fortune 500 company and a renowned pentathlete.

Not every case goes this smoothly. Consider the case of Sarah. Sarah also had some standard-issue psychological problem. She had also been in and out of therapy for years, tried dozens of different medications, none of them had helped at all. Then she decided to try our form of therapy. In her first session, the therapist asked her “Have you ever considered that your problems might be because of [the kind of thing our form of therapy says all problems are because of]?” Sarah said “No, I don’t think they are.” The therapist asked “Are you sure you’re not just repressing the fact that they totally definitely are, for sure?” As soon as Sarah heard this, she gasped, and her eyes seemed to light up with an inner fire. Then she proceeded to tell a story of how [everything in her life was exactly in accordance with our form of therapy’s predictions] and she had always reacted by [doing exactly the kind of thing our form of therapy predicts that she would], only she was repressing this because she was scared of how powerful she would be if she recovered. Now that all of this was out in consciousness, she no longer felt any desire to have psychological problems. In a followup session two weeks later, the therapist confirmed that she no longer had any psychological problems, and had become the hand-picked successor to the Dalai Lama and the mother of five healthy children.

Previous forms of therapy have failed because they were ungrounded. They were ridiculous mental castles built in the clouds by armchair speculators. But our form of therapy is based on hard science! For example, it probably acts on synapses or the hippocampus or something. Here are three neuroscience papers which vaguely remind us of our form of therapy. One day, neuroscience will catch up to us and realize that the principles of our form of therapy are the principles that govern the organization of the entire brain – if not all of multicellular life.


Maybe I’m being unfair here. I’m basing this off a small sample of therapy books (five textbooks I can think of, plus scattered papers on psychodynamic and psychedelic therapies), and only a subset are quite this bad.

But my basic confusion is this: I work in a clinic with about ten therapists. Some are better than others, but all of them are competent. I send my patients to them. In a few hundred patients I’ve worked with, zero have had the sudden, extraordinary, long-lasting change that the therapy books promise. Many have benefited a little. A few would say that, over the course of years, their lives have been turned around. But sudden complete transformations? Not so much.

Of course, this fits with the therapy books’ perspective. My colleagues practice normal therapy. Sometimes it’s from a boring old school like CBT; other times it’s “eclectic” or “supportive” or any of the other words we use to describe what we’re doing when we don’t know what we’re doing. So maybe there are two sets of therapies: boring old therapies that ordinary people practice, and exciting new therapies that people write glowing books about. And maybe the first set really don’t work (or work only a little), and the second set really is that good.

The problem is, the boring old therapies that everybody uses nowadays inspired equal excitement when they first arose. This is the point that I make in CBT In The Water Supply, and that Oliver Burkeman makes more cogently in Why CBT Is Falling Out Of Favor. Look at therapy books from the 1990s, and they were all about how CBT was a new miracle therapy that would cure your anxiety forever in a few sessions. From a cognitive therapy book:

[When I first learned about cognitive-behavioral therapy, I thought] depression and anxiety seemed far too serious and severe for such a simplistic approach. But when I tried these methods with some of my more difficult patients, my perceptions changed. Patients who’d felt hopeless, worthless, and desperate began to recover. At first, it was hard to believe that the techniques were working, but I could not deny the fact that when my patients learned to put the lie to their negative thoughts, they began to improve. Sometimes they recovered right before my eyes during sessions. Patients who’d felt demoralized and hopeless for years suddenly turned the corner on their problems. I can still recall an elderly French woman who’d been bitterly depressed for more than fifty years, with three nearly-successful suicide attempts, who started shouting “Joie de vivre! Joie de vivre!” (“joy of living”) one day in my office. These experiences made such a strong impact on me that I decided my calling was in clinical work rather than brain research. After considerable soul-searching, I decided to give up my research career and become a full-time clinician. Over the years, I’ve had more than 35,000 psychotherapy sessions with depressed and anxious patients, and I’m every bit as enthusiastic about CBT as when I first began learning about it.

But look at therapy books now, and they’re all people saying “Sure, CBT barely outperforms placebo…but what about this exciting new therapy which blows CBT out of the water?”

Studies reflect this decline:

…with the average studied effect size of CBT shrinking from 2.5 to 1.0 over the course of a generation. People have come up with various explanations for this. Maybe therapist quality is falling – when CBT was the hot new thing, you had to be a really plugged-in, up-to-date therapist to have heard about it and to make the effort to retrain in it, so only the best therapists practiced it, but now it’s the default therapy used by everyone who’s just clocking in. Maybe the placebo effect is fading – when people viewed it as an astounding miracle therapy, it got astounding miracle results, but now that it’s lost its luster nobody takes it seriously anymore. Maybe its ideas are spreading, so that patients come into their first session already aware of CBT insights and inoculated against them. Or maybe it’s like all science, where the first studies are done quickly by true believers, and the later studies are done carefully by the Cochrane Collaboration, and so the level of hype naturally goes down.

These explanations have different practical implications. If it’s all about therapist quality and placebo expectations, then you should go get the exciting new therapies described in therapy books, since their unusually-qualified therapists and unusually-high expectations will deliver you the miracle cure you’re looking for.

If it’s just that study quality gets better and better until we realize how crappy the exciting new therapies really are, you might as well get the boring old therapies. At least insurance probably covers them.

And they also have different philosophical implications. If it’s all about therapist quality and placebo expectations, then even if it’s hard to deliver high-quality therapy consistently at scale, it means high-quality therapy is a thing. It means that if enough factors go right at once, therapy can be the kind of powerful tool that cures someone’s life-long psychiatric issues in a few sessions with a high success rate. If this is true it would be fascinating. It would be like saying that bananas cure cancer, but only if they’re really fresh bananas. Even if there are practical issues in getting every cancer patient a banana that’s fresh enough, you still want to take a step back and think “Whoa, what’s up with this?”

I can only say that I’ve had a few patients try the exciting new therapies, and none of them have reported miracle cures. They’ve all maybe gotten a little better over long periods, same as the boring old therapies. This makes me think it’s more likely that early results from the exciting new therapies get oversold, not that some combination of therapist skill and excitement makes them go shockingly well. And the Efficient Market agrees with my low estimation, given that therapists aren’t rushing to learn these new strategies and patients aren’t rushing to use them.

But the therapy books still confuse me. They’re full of stories of incredible instant cures, with the authors assuring us that these are all real and typical of their experience. How can you get this from merely “stretching the truth”, as opposed to outright data falsification? Are therapy book authors blatantly lying? I try to have a really low prior on this sort of thing, but I’m not sure.

Therapy books are often written by the researcher who invented the therapy. I imagine if you invent a therapy yourself, then it perfectly fits your personality and communication style, you believe in it wholeheartedly, and you understand every piece of it from the ground up. You’re also probably a really exceptional and talented person who’s obsessed with psychotherapy and how to make it better. So maybe they get results nobody else can replicate?

But that still raises the philosophical implication of it being possible, for somebody, to consistently produce dramatic change through therapy. This still bothers me a lot.


Most therapy books share some assumptions, so deep as to be unspoken: current problems trace back, one way or another, to past traumas.

Different therapies take this in different directions. Some view problems as a passive residue of past traumas: for example you were abused as a child, that filled you with stress and rage, and now you take that out on other people and yourself. Others view them as maladaptive learning from past trauma: for example, you were abused as a child, that taught you that other people would hurt you if you opened up to them, so you never open up to anybody. I don’t know the official name for this, but let’s call it historicism: symptoms are the result of something that happened in a patient’s life history.

Some weak forms of historicism are obviously true. Many (though not all) phobias began with a clear incident where the patient was endangered by the phobic object; someone mauled by a dog as a child who then has cynophobia as an adult is hardly a medical mystery. Many (though not all) depressions are precipitated by some depressing event. And post-traumatic stress disorder has the historical perspective right there in the name; at the very least, going through trauma dysregulates something inside you. But it’s a long way from there to saying that a patient’s psychosomatic blindness is caused by persistent shame at having seen their parents having sex thirty years earlier, or something like that.

And some therapy books go beyond historicism into purposefulism: symptoms serve some quasi-logical purpose relating to the life history. I recently read a therapy book that included a case like this. Bob had a history of failing at work. He would go from job to job, making various mistakes and doing crappy work until he got fired. He went to a therapist for help. During the therapy, it came out that Bob’s abusive father had always pushed him really hard to succeed. The therapist suggested that maybe Bob failed at work to send a message to his father; ie to prove that his father’s abusive parenting had been a bad idea and would not make Bob successful. The therapist asked Bob to imagine confronting his father about this. After he worked through his anger at his father, Bob was able to succeed at work. In this story, the apparently dysfunctional symptom (failing at work) ended up having a legible purpose within Bob’s life history (it helped him send a message to his father). Only by teasing out the purpose and finding some other way to achieve it could the dysfunctional behavior be prevented.

A non-historical, non-purposeful account might argue that Bob failed at work because he was bad at work. Maybe he was bad at the specific jobs he was holding (in which case he should get more training). Maybe he was bad at social skills (in which case he should learn to communicate better). Maybe he had ADHD and kept getting distracted (in which case he should get treatment for ADHD). In any case, him being bad at work isn’t related to any past traumas or serving any hidden purposes. It’s just an unfortunate fact.

I am constantly worried by how many things we historically applied historical-purposeful reasoning to, totally confident at the time that our explanations made sense – things we now know are not historical-purposeful at all. Psychologists “knew” that autism was caused by distant mothers, and schizophrenia by overbearing mothers, right up until we discovered both conditions are about 80% genetic. And when they “knew” these things, they were able to come up with long lists of how exactly each individual patient fit the mold, and reported great progress by helping patients overcome their maternal attachment issues. Back when homosexuality was considered a disorder, historical-purposeful therapists would tell gay patients they must be so angry at their mother that they had sworn off all female companionship and switched to men instead as a way of sending her a giant “F–K YOU” message; while homosexuality is mostly not genetic, few people today think this is a plausible explanation.

I sometimes see if I can come up with these kinds of historical-purposeful accounts of my patients’ symptoms. These always fit into place freakishly well – so well that either the historical-purposeful perspective is completely true, or there is some very strong bias that makes it extra-convincing despite its falsehood. But we already know there’s some very strong bias that makes it extra-convincing despite its falsehood! That bias must have been at work in all the therapists who applied historical-purposeful narratives to autistics, schizophrenics, and gays! At some point I notice the road I’m on is littered with skulls and start wondering if I should reconsider.

All therapy books propose an answer: the proof is that the patients get better. But my patients do not get better. When I tell them the historical-purposeful accounts I have devised for their symptoms, they usually shrug and say it sounds plausible and they’ve thought along those lines before, but what are they going to do? When I try all the exciting new therapies on them, they just sort of nod, say that this sounds like an interesting perspective, and then go off and keep having symptoms. It’s very rude!

I’ve told this story before: when I was a teenager, I got really into pseudohistory for a while. What snapped me out of it wasn’t the sober historians, who totally went AWOL on their job of explaining why they were right and the whackos were wrong. It was that a bunch of mutually exclusive pseudohistories all sounded equally plausible: the Pyramids couldn’t have been built by Atlanteans and Lemurians and mole-people! At that point I was able to halt, melt, catch fire, and realize there was something really wrong with my reasoning processes, which I continue to worry about and work on twenty years later.

I bring this up because I’m going to be reviewing some specific psychotherapy books. Each of them on their own can be convincing. But they should be taken in the context of All Therapy Books, which as a category are pretty worrying.

More Intuition-Building On Non-Empirical Science: Three Stories

[Followup to: Building Intuitions On Non-Empirical Arguments In Science]


In your travels, you arrive at a distant land. The chemists there believe that when you mix an acid and a base, you get salt and water, and a star beyond the cosmological event horizon goes supernova. This is taught to every schoolchild as an important chemical fact.

You approach their chemists and protest: why include the part about the star going supernova? Why not just say an acid and a base make salt and water? The chemists find your question annoying: your new “supernova-less” chemistry makes exactly the same predictions as the standard model! You’re just splitting hairs! Angels dancing on pins! Stop wasting their time!

“But the part about supernovas doesn’t constrain expectation!” Yes, say the chemists, but removing it doesn’t constrain expectation either. You’re just spouting random armchair speculation that can never be proven one way or the other. What part of “stop wasting our time” did you not understand?

Moral of the story: It’s too glib to say “There is no difference between theories that produce identical predictions”. You actually care a lot about which of two theories that produce identical predictions is considered true.


Later in your travels, you come to another land. The paleontologists here believe the Devil planted dinosaur fossils to trick humans into doubting Creation.

You approach the paleontologists and argue the same point you argued with the chemists on your last stop – that if two theories make identical predictions, it’s still important to go with the simpler one.

To your surprise, the paleontologists know and agree. “Of course!” they tell you. “And in the dinosaur theory, there must have been, like, millions or even billions of dinosaurs. But the Devil theory explains everything with just one Devil.”

You argue that it doesn’t work that way, but the paleontologists insist that it does. After all, Occam says not to multiply entities beyond necessity. And if the dinosaur theory posits a billion dinosaurs, that’s 999,999,999 more entities than are necessary to explain all those bones.

Moral of the story: “Choose the simpler of two theories that make identical predictions” isn’t trivial. You actually have to understand some philosophy in order to figure out which of two theories is simpler.


You return home and curl up in front of the fire with a good book on quantum mechanics.

Renowned physicist Sean Carroll jumps out from behind you, and exclaims: “Don’t you realize that single-world interpretations of quantum mechanics make both the errors that you fought against abroad?”

You are startled. “This room is locked,” you tell him. “And how did you know what I was doing abroad? Wait a second. Are you secretly the Devil?”

“Untestable, therefore irrelevant!” says Carroll. You wonder if he has always had bright orange eyes. “But being indifferent between ‘wavefunction branches’ and ‘wavefunction branches, and then somewhere we can’t see it one branch mysteriously collapses’ is the same kind of error as being indifferent between ‘acid and base make salt’ and ‘acid and base make salt and water, and then somewhere we can’t see it a star mysteriously goes supernova’.”

He stomps his foot for emphasis, and something falls out of his pocket. Is that a dinosaur bone? He quickly reaches down and pockets it again.

“And,” he adds “preferring collapse interpretations to many-worlds because there are fewer universes – that’s like preferring the Devil theory to dinosaurs because it involves fewer entities. It’s optimizing over the wrong thing! You’re not literally trying to come up with a theory with as few entities as possible! You’re trying to come up with one that has as few extra moving parts as possible. The process that makes wavefunctions collapse is an extra assumption! Now if you’ll excuse me, I’ve got to go plant this” – he taps the bone “in a sedimentary rock formation in China”. He vanishes in a puff of smoke. Can all quantum physicists do that?

Moral of the story: Applying the two previous morals consistently lets you prefer the many worlds interpretation of quantum mechanics without having to worry about this being “untestable”.

Open Thread 141

This is the bi-weekly visible open thread (there are also hidden open threads twice a week you can reach through the Open Thread tab on the top of the page). Post about anything you want, but please try to avoid hot-button political and social topics. You can also talk at the SSC subreddit or the SSC Discord server – and also check out the SSC Podcast. Also:

Comments of the week:

– mtl1882 explains how 19th century “railway spine” was probably an early version of PTSD.

– John Schilling discusses the history of and politics surrounding “shell shock”.

– And from doesntliketocomment: “The unusual feature of the modern world is not that you can be exposed to trauma, it’s that you can be removed from it.”


Autism And Intelligence: Much More Than You Wanted To Know

[Thanks to Marco DG for proofreading and offering suggestions]


Several studies have shown a genetic link between autism and intelligence; genes that contribute to autism risk also contribute to high IQ. But studies show autistic people generally have lower intelligence than neurotypical controls, often much lower. What is going on?

First, the studies. This study from UK Biobank finds a genetic correlation between genetic risk for autism and educational attainment (r = 0.34), and between autism and verbal-numerical reasoning (r = 0.19). This study of three large birth cohorts finds a correlation between genetic risk for autism and cognitive ability (beta = 0.07). This study of 45,000 Danes finds that genetic risk for autism correlates at about 0.2 with both IQ and educational attainment. These are just three randomly-selected studies; there are too many to be worth listing.

The relatives of autistic people will usually have many of the genes for autism, but not be autistic themselves. If genes for autism (without autism itself) increase intelligence, we should expect these people to be unusually smart. This is what we find; see Table 4 here. Of 11 types of psychiatric condition, only autism was associated with increased intelligence among relatives. This intelligence is shifted towards technical subjects. About 13% of autistic children (in whatever social stratum this sample was drawn from) have fathers who are engineers, compared to only 5% of a group of (presumably well-matched?) control children – though see the discussion here for some debate over how seriously to take this; I am less sure this is accurate than most of the other statistics mentioned here.

Further (indirect) confirmation of the autism-IQ link comes from evolutionary investigations. If autism makes people less likely to reproduce, why would autism risk genes stick around in the human population? Polimanti and Gelernter (2017) find that autism risk genes aren’t just sticking around. They are being positively selected, ie increasing with every generation, presumably because people with the genes are having more children than people without them. This means autism risk genes must be doing something good. Like everyone else, they find autism risk genes are positively correlated with years of schooling completed, college completion, and IQ. They propose that the reason evolution favors autism genes is that they generally increase intelligence.

But as mentioned before, autistic people themselves on average have lower intelligence. One study found that 69% of autistic people had an IQ below 85 (the average IQ of a high school dropout). Only 3% of autistic people were found to have IQs above 115, even though 15% of the population should be at this level.
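For context, the general-population baselines implied by the standard IQ distribution (mean 100, SD 15) can be checked with a few lines of Python. This is an illustrative calculation of my own, not a figure from the study:

```python
from math import erf, sqrt

def iq_fraction_below(threshold, mean=100.0, sd=15.0):
    """Fraction of the general population below an IQ threshold,
    assuming IQ is normally distributed with mean 100 and SD 15."""
    return 0.5 * (1 + erf((threshold - mean) / (sd * sqrt(2))))

below_85 = iq_fraction_below(85)        # one SD below the mean
above_115 = 1 - iq_fraction_below(115)  # one SD above the mean

print(f"Below 85: {below_85:.0%}, above 115: {above_115:.0%}")
```

So roughly 16% of the general population falls below IQ 85 and roughly 16% above IQ 115 (consistent with the ~15% figure above), against the study’s 69% and 3% for autistic people.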

These numbers should be taken with very many grains of salt. First, IQ tests don’t do a great job of measuring autistic people. Their intelligence tends to be more imbalanced than neurotypicals’, so IQ tests (which rely on an assumption that most forms of intelligence are correlated) are less applicable. Second, even if the test itself is good, autistic people may be bad at test-taking for other reasons – for example, they don’t understand the directions, or they’re anxious about the social interaction required to answer an examiner’s questions. Third, and most important, there is a strong selection bias in the samples of autistic people. Many definitions of autism center around forms of poor functioning which are correlated with low intelligence. Even if the definition is good, people who function poorly are more likely to seek out (or be coerced into) psychiatric treatment, and so are more likely to be identified. In some sense, all “autism has such-and-such characteristics” studies are studying the way people like to define autism, and tell us nothing about any underlying disease process. I talk more about this in parts 2 and 3 here.

But even adjusting for these factors, the autism – low intelligence correlation seems too strong to dismiss. For one thing, the same studies that found that relatives of autistic patients had higher IQs find that the autistic patients themselves have much lower ones. The existence of a well-defined subset of low IQ people whose relatives have higher-than-predicted IQs is a surprising finding that cuts through the measurement difficulties and suggests that this is a real phenomenon.

So what is going on here?


At least part of the story is that there are at least three different causes of autism.

1. The “familial” genes mentioned above: common genes that increase IQ and that evolution positively selects for.

2. Rare “de novo mutations”, ie the autistic child gets a new mutation that their non-autistic parent doesn’t have. These mutations are often very bad, and are quickly selected out of the gene pool (because the people who have them don’t reproduce). But “quickly selected out of the gene pool” doesn’t help the individual person who got one of them, who tends to end up severely disabled. In a few cases, the parent gets the de novo mutation, but for whatever reason doesn’t develop autism, and then passes it on to their child, who does develop autism.

3. Non-genetic factors. The best-studied are probably obstetric complications, eg a baby gets stuck in the birth canal and can’t breathe for a long time. Pollution, infection, and trauma might also be in this basket.

These three buckets and a few other less important factors combine to determine autism risk for any individual. Combining information from a wide variety of studies, Gaugler et al estimate that about 52% of autism risk is attributable to ordinary “familial” genes, 3% to rare “de novo” mutations, 4% to complicated non-additive genetic interaction effects, and 41% “unaccounted”, which may be non-genetic factors or genetic factors we don’t understand and can’t measure. This study finds lower heritability than the usual estimates (which are around 80% to 90%); the authors are embarrassed by this, and in a later study suggest they might just have been bad at determining who in their sample did or didn’t have autism. While their exact numbers are doubtful, I think the overall finding that common familial genes are much more important than rare de novo mutations survives and is important.

Most cases of autism involve all three of these factors; that is, your overall autisticness is a combination of your familial genes, mutations, and environmental risk factors.

One way of resolving the autism-intelligence paradox is to say that familial genes for autism increase IQ, but de novo mutations and environmental insults decrease IQ. This is common-sensically true and matches previous research into all of these factors. So the only question is whether the size of the effect is enough to fully explain the data – or whether, even after adjusting out the degree to which autism is caused by mutations and environment, it still decreases IQ.

Ronemus et al (2014) evaluate this:

They find that even autistic people without de novo mutations have lower-than-average IQ. But they can only screen for de novo mutations they know about, and it could be that they just missed some.

Here’s another set of relevant graphs:

This one comes from Gardner et al (2019), which measures the cognitive ability of the fathers of autistic people and disaggregates those with and without intellectual disability. In Graph A, we see that if a child has autism (but not intellectual disability), their likelihood of having a father with any particular IQ (orange line) is almost the same as the likelihood of a neurotypical child having a father of that IQ (dotted line). Disguised in that “almost” is a very slight tendency for fathers to be unusually intelligent, plus a (statistically insignificant) tendency for them to be unusually unintelligent. For reasons that don’t entirely make sense to me, if instead we look at the likelihood of the father being a certain intelligence (bottom graph, where the dark line surrounded by a gray confidence cloud is autistic people’s fathers, and the dotted line is neurotypical people’s fathers), it becomes more obvious that more intelligent people are actually a little more likely to have autistic children (though less intelligent people are also more likely).

(remember that “no intellectual disability” just means “IQ over 70”, and so many of these not-intellectually-disabled people may still have low intelligence – I wish the paper had quantified this)

Graph B is the same thing, but with people who have autism with intellectual disability. Now there is a very strong effect towards their fathers being less intelligent than usual.

This confuses me a little. But for me the key point is that high-intelligence fathers show a trend (albeit not significant in this study) to be more likely than average to have children with autism and intellectual disability.

These questions interest me because I know a lot of people who are bright nerdy programmers married to other bright nerdy programmers, and sometimes they ask me if their children are at higher risk for autism. While their children are clearly at higher risk for autistic traits, I think they want to know whether they have higher risk for the most severe forms of the syndrome, including intellectual disability and poor functioning. If we take the Ronemus and Gardner studies seriously, the answer seems to be yes. The Gardner study seems to suggest it’s a very weakly elevated risk, maybe only 1.1x or 1.2x relative risk. But the Gardner study also ceilings off at the 90th percentile of intelligence, so at this point I’m not sure what to tell these people.


If Ronemus isn’t missing some obscure de novo mutations, then people who get autism solely by accumulation of common (usually IQ-promoting) variants still end up less intelligent than average. This should be surprising; why would too many intelligence-promoting variants cause a syndrome marked by low intelligence? And how come it’s so inconsistent, and many people have naturally high intelligence but aren’t autistic at all?

One possibility would be something like a tower-vs-foundation model. The tower of intelligence needs to be built upon some kind of mysterious foundation. The taller the tower, the stronger the foundation has to be. If the foundation isn’t strong enough for the tower, the system fails, you develop autism, and you get a collection of symptoms possibly including low intelligence. This would explain low-functioning autism from de novo mutations or obstetric trauma (the foundation is so weak that it fails no matter how short the tower is). It would explain the association of genes for intelligence with autism (holding foundation strength constant, the taller the tower, the more likely a failure). And it would also explain why there are many extremely intelligent people who don’t have autism at all (you can build arbitrarily tall towers if your foundation is strong enough).

I’ve only found one paper that takes this model completely seriously and begins speculating on the nature of the foundation. This is Crespi 2016, Autism As A Disorder Of High Intelligence. It draws on the VPR model of intelligence, where g (“general intelligence”) is divided into three subtraits, v (“verbal intelligence”), p (“perceptual intelligence”), and r (“mental rotation ability”) – despite the very specific names each of these represents ability at broad categories of cognitive tasks. Crespi suggests that autism is marked by an imbalance between P (as the tower) and V + R (as the foundation). In other words, if your perceptual intelligence is much higher than your other types of intelligence, you will end up autistic.

It doesn’t really present much evidence for this other than that autistic people seem to have high perceptual intelligence. Also, it doesn’t really look like autistic people are worse at mental rotation. Also, the Gardner paper has analyzed autistic patients’ fathers by subtype of intelligence, and there is a nonsignificant but pretty suggestive tendency for them to have higher-than-normal verbal intelligence; certainly no signs of high verbal intelligence preventing autism. I can’t tell if this is evidence against Crespi or whether since all intellectual abilities are correlated this is just the shadow of their high perceptual intelligence, and if we directly looked at perceptual-to-verbal ratio we would see it was lower than expected. Also also, Crespi is one of those scientists who constantly has much more interesting theories than anyone else (eg), and this makes me suspicious.

Overall I would be surprised if this were the real explanation for the autism-and-intelligence paradox, but it gets an A for effort.


1. The genes that increase risk of autism are disproportionately also genes that increase intelligence, and vice versa (~100% confidence)

2. People diagnosed with autism are less intelligent than average (~100% confidence, leaving aside definitional complications)

3. Some of this effect is because autism is caused both by normal genes and by de novo mutations and environmental insults, and the de novo mutations and environmental insults definitely decrease intelligence. Every autism case is caused by some combination of these three factors, and the more it is caused by normal genes, the more intelligence is likely to be preserved (~100% confidence)

4. This is not the whole story, and even cases of autism that are caused entirely or mostly by normal genetics are associated with unusually low IQ (80% confidence)

5. This can best be understood through a tower-versus-foundation model where higher intelligence that outstrips the ability of some mysterious foundation to support it will result in autism (25% confidence)

6. The specific way the model plays out may be through perceptual intelligence out of balance with verbal and rotational intelligence causing autism (3% confidence)

Book Review: The Body Keeps The Score


The Body Keeps The Score is a book about post-traumatic stress disorder.

The author, Bessel van der Kolk, helped discover the condition and lobby for its inclusion in the DSM, and the brief forays into that history are the best part of the book. Like so many things, PTSD feels self-evident once you know about it. But this took decades of conceptual work by people like van der Kolk, crystallizing some ideas and hacking away at others until they ended up with something legible to the Establishment. Before that there was nothing. It was absolutely shocking how much nothing there was. As soon as the APA officially recognized PTSD as a diagnosis in 1980, van der Kolk and his friends applied for a grant from the VA to study it. The grant was rejected on the grounds that (actual quote from the rejection letter) “it has never been shown that PTSD is relevant to the mission of the Veterans Administration”. So the first step in raising awareness of PTSD was – amazingly – convincing the US military that some people might get PTSD from combat.

After the military relented, the next step was convincing everyone else. PTSD was temporarily pigeonholed as “the thing veterans get when they come back from a war”. The next push was convincing people that civilian trauma could have similar effects. It was simple to extend the theory to sudden disasters like fires or violent crimes. But van der Kolk and his colleagues started noticing that a history of child abuse, and especially childhood sexual abuse, correlated with a lot of psychiatric problems later on.

Again, “child abuse is bad” sounds self-evident once you know it. But van der Kolk insists this is the result of hard work by a coalition of psychiatrists, psychologists, activists, and victims. When he first started raising awareness of the problem, nobody believed him. His grant proposal to study whether childhood trauma was associated with personality disorders got rejected too. He recalls that:

I was particularly struck by how many female patients spoke of being sexually abused as children. The standard textbook of psychiatry at the time stated that incest was extremely rare in the United States, occurring about once in every million women. Given that there were then only about one hundred million women in the United States, I wondered how forty-seven, almost half of them, had found their way to my office in the basement of the hospital.

Furthermore, the textbook said, “There is little agreement about the role of father-daughter incest as a source of serious subsequent psychopathology”…the textbook went on to practically endorse incest, explaining that “such incestuous activity diminishes the subject’s chance of psychosis, and allows for a better adjustment to the external world.”

Van der Kolk found that child abuse (sexual and otherwise) was both far more common and far more destructive than anybody else thought. He also found that it worked differently than regular PTSD. A soldier traumatized during war has already developed a sense of self, and has a concept of a safe homeland to return to if he makes it out alive; a child has neither, and has to deal with trauma again and again absent any trustworthy external support system. This is the same insight some researchers call “complex PTSD”; van der Kolk uses the term “developmental trauma disorder” and argues it is the real culprit behind many people currently diagnosed with ADHD, bipolar, intermittent explosive disorder, oppositional defiant disorder, etc. He rejects at least some of these diagnoses as “pseudoscience…impressive but meaningless labels”.

A group including van der Kolk tried to get developmental trauma disorder added to the DSM; the APA decided against it. He denounces this decision, which he thinks ignored several great studies that prove developmental trauma (ie child abuse) is much more important than anyone else thinks. I have a lot of opinions about this section.

First, I think van der Kolk downplays the importance of the APA’s philosophical commitment to categorizing by symptoms rather than cause. Consider four patients, Alice, Bob, Carol, and Dan. Alice has poor concentration caused by child abuse. Bob has poor concentration caused by bad genes. Carol throws tantrums because child abuse. Dan throws tantrums because bad genes. The current DSM would categorize Alice and Bob as ADHD, and Carol and Dan as intermittent explosive disorder. Van der Kolk would like to classify Alice and Carol as having Developmental Trauma Disorder, and Bob and Dan as…I don’t know. Bad Gene Disorder? Seems sketchy. When the APA decides not to do that, they’re not necessarily rejecting the seriousness of child abuse, only saying it’s not the kind of thing they build their categories around.

Second, van der Kolk really does not come across as a great source about the effects of development. He does not mention the possibility that links between parent behavior and child pathology might be genetic (ie a disordered parent is more likely to abuse their child, and to pass on genes for disordered behavior). In fact, he is weirdly and vocally ignorant about genetics in general, dismissing the entire field because “after thirty years and millions upon millions of dollars worth of research, we have failed to find consistent genetic patterns for schizophrenia – or for any psychiatric illness, for that matter”. When TBKtS was published in 2014, we already knew with certainty that schizophrenia was about 80% heritable, and at least 15 genes had been identified as especially likely to be involved; today we know of hundreds and can even make primitive polygenic predictors. The only gene he considers sympathetically is good old 5-HTTLPR, which he says proves that genes have different effects in children with vs. without abuse histories (like everything else about 5-HTTLPR, this has since been proven false). He shows a total lack of interest in behavioral genetics and the challenge it raises to his hypothesis.

This is a very pre-replication crisis book. I don’t hold this against the author; I don’t think anyone’s really proud of what they believed pre-replication crisis, but it’s undoubtedly a product of its time. Mirror neurons, candidate genes, left- vs right-brained people, etc all make dramatic appearances. Nothing (except the genetics parts) is inexcusable or even certainly wrong, but all of them together concern me. And several of the book’s key studies are contradicted by later, larger studies. Van der Kolk talks about how childhood trauma decreases IQ, but some pretty good studies say it doesn’t. Even the studies that have passed the test of time look a little weird. The Adverse Childhood Experiences study found that obesity and other seemingly nonpsychiatric diseases were linked to child abuse, and recent studies confirm this – but the controls for socioeconomic status are always insufficient, and there’s surprisingly little shared environmental component. I’m biased about this, everyone’s biased, but part of the book was meant to prove that child abuse mattered shockingly more than you thought it possibly could, and that part was wasted on me.


Fine, okay, drop that hobby horse, what does this book have to say about PTSD?

The book stresses the variety of responses to PTSD. Some people get anxious. Some people get angry. But a lot of people, whatever their other symptoms, also go completely numb. They are probably still “having” “emotions” “under” “the” “surface”, but they have no perception of them. Sometimes this mental deficit is accompanied by equally surprising bodily deficits. Van der Kolk describes a study on stereoagnosia in PTSD patients: if blindfolded and given a small object (like a key), they are unable to recognize it by feel, even though this task is easy for healthy people. Sometimes this gets even more extreme, like the case of a massage therapy patient who did not realize she was being massaged until the therapist said she had started.

The book is called The Body Keeps The Score, and it returns again and again to the idea of PTSD patients as disconnected from their bodies. The body sends a rich flow of information to the brain, which is part of what we mean when we say we “feel alive” or “feel like I’m in my body”. In PTSD, this flow gets interrupted. People feel “like nothing”. For example:

I don’t know what I feel, it’s like my head and body aren’t connected. I’m living in a tunnel, a fog, no matter what happens it’s the same reaction – numbness, nothing. Having a bubble bath and being burned or raped is the same feeling.

Or, borrowed from one of William James’ patients:

I have no human sensations. I am surrounded by all that can render life happy and agreeable, still to me the faculty of enjoyment and of feeling is wanting. Each of my senses, each part of my proper self, is as it were separated from me and can no longer afford me any feeling; this impossibility seems to depend upon a void which I feel in the front of my head, and to be due to the diminution of the sensibility over the whole surface of my body, for it seems to me that I never actually reach the objects that I touch. All this would be a small matter enough, but for its frightful result, which is that of the impossibility of any other kind of feeling and of any sort of enjoyment, although I experience a need and desire of them that render my life an incomprehensible torture.

One other new thing I learned about PTSD is the importance of immobilization. Van der Kolk thinks that traumas are much more likely to cause PTSD when the victim is somehow unable to respond to them. Enemy soldiers shooting at you and you are running away = less likelihood of trauma. Enemy soldiers shooting at you and you are hiding motionless behind a tree = more likelihood of trauma. Speculatively, the body senses that going into trauma mode hasn’t gotten you to take the right actions, and so trauma mode cannot end.

There’s some discussion of the neurobiology of all this, but it never really connects with the vividness of the anecdotes. A lot of stuff about how trauma causes the lizard brain to inappropriately activate in ways the rational brain can’t control, how your “smoke detector” can be set to overdrive, all backed up with the proper set of big words like “dorsolateral prefrontal cortex” – but none of it seemed to reach the point where I felt like I was making progress toward a gears-level explanation. I felt like the level on which I wanted an explanation of PTSD, and the level at which van der Kolk was explaining PTSD, never really connected; I can’t put it any better than that.

Why does PTSD exist? “The brain isn’t prepared to feel emotions as intense as…” Yes it is! Trauma is as old as living creatures; war, disaster, bullying, and rape far predate homo sapiens. Even if child abuse is rare in hunter-gatherer tribes (as some optimistic anthropologists claim), killing all the adults in a tribe and enslaving their children is pretty common, which cashes out to kids getting abused. Our evolutionary history should have prepared us incredibly well for all of this; the brain “getting stuck” in fear mode after a particularly bad trauma should be no more likely than the legs “getting stuck” in running mode after a particularly long chase.

And why would the body be so confused by the right action being “hide” or “accept the pain and abuse” rather than “run” or “fight”? The safest action has been “hide” or “accept the pain and the abuse” in a pretty good fraction of traumatic events since humanity came down from the trees.

And why should the consequences of this be the body going numb? Why not other things that seem more like the consequences of garden-variety acute or chronic stress?

I missed any answers that TBKtS might have contained to questions like these, and so a lot of its neurobiology ended up feeling more like a random collection of simplified facts than like real enlightenment.


But all of this would be excusable if TBKtS had answered the most important question: how do you treat PTSD? There are a wide variety of proposed methods, and I was looking forward to having an authority like van der Kolk sort through the evidence for and against each.

Instead, I felt like he rejected every conventional treatment on the grounds that they didn’t treat the root problem, then waxed rhapsodic about every single weird alternative treatment and how it was a perfect miracle cure that truly gave patients their lives back. I understand that he may just be presenting the alternative treatments that he found most effective, but something about the style here really turned me off.

There are a lot of alternative treatments for PTSD. Neurofeedback, where you attach yourself to a machine that reads your brain waves and try to explore the effect your thoughts have on brain wave production until you are consciously able to manipulate your neural states. Internal family systems, where a therapist guides you through discovering “parts” of yourself (think a weak version of multiple personalities), and you talk to them, and figure out what they want, and make bargains with them where they get what they want and so stop causing mental illness. Eye movement desensitization and reprocessing (alternative when the book was written, now basically establishment), where you move your eyes back and forth while talking about your trauma, and this seems to somehow help you process it better. Acupuncture. Massage. Yoga.

There was a thing called “PBSP psychomotor therapy”, where the therapist would create “tableaus” representing people’s traumas. They would enlist an actor to play the victim’s abusive father, then another actor to play an idealized version of their father who didn’t abuse them and was always there when they needed them, then have them recite formulaic lines that “played their part” in the remembered (or alternative hypothetical) versions of the patient’s trauma. Gradually they would progress from the real trauma to a version where things had worked out better, with the therapist discussing the patient’s reaction the whole time.

There was a chapter on community theater, where troubled youth who would otherwise be sent to jail were instead asked to put on a Shakespeare production. This encountered some early hitches:

We were shocked to discover that, in scenes where someone was in physical danger, the students always sided with the aggressors. Because they could not tolerate any sign of weakness in themselves, they could not accept it in others. They showed nothing but contempt for potential victims, yelling things like “Kill the bitch, she deserves it,” during a skit about dating violence.

At first some of the actors wanted to give up – it was simply too painful to see how mean these kids were – but they stuck it out, and I was amazed to see how they gradually got the students to experiment, however reluctantly, with new roles. Toward the end of the program, a few students were even volunteering for parts that involved showing vulnerability or fear.

The traumatic incidents in Shakespeare’s work helped them come to terms with their own difficult history:

As we’ve seen, the essence of trauma is feeling godforsaken, cut off from the human race. Theater involves a collective confrontation with the realities of the human condition. As Paul Griffin, discussing his theater program for foster care children, told me: “The stuff of tragedy in theater revolves around coping with betrayal, assault, and destruction. These kids have no trouble understanding what Lear, Othello, Macbeth, or Hamlet is all about.” In Tina Packer’s words: “Everything is about using the whole body and having other bodies resonate with your feelings, emotions, and thoughts.” Theater gives trauma survivors a chance to connect with one another by deeply experiencing their common humanity.

Each of these stories about an alternative therapy was, on its own, inspiring. But after chapter after chapter on these, plus other even weirder things, you start to wish there was at least one alternative therapy that Bessel van der Kolk didn’t like, or one conventional therapy that he did.

This is a very pre-replication-crisis book. In these more cynical days, we know that the first few studies on any technique – usually done in an atmosphere of frothy excitement, by the technique’s most fervent early adopters – are always highly positive. And later studies – done in an atmosphere of boredom, by large multi-center consortia – are almost always disappointing. Half the time van der Kolk is so excited about the miraculous life-changing potential of the latest alternative therapy that he doesn’t list studies at all. The other half of the time, the studies are there to support his enthusiasm. But can they be trusted?

Overall, so many bizarre methods seemed to work so well (with no examples of anything that didn’t work) that it was hard for me to figure out how this book should affect my treatment decisions. Find the closest person in a robe and wizard hat and send all of my trauma patients to them, because every alternative therapy works equally well as long as it’s weird? This might actually be a good lesson: there are a lot of things in psychiatry where, as long as people feel drawn in and “validated”, the treatment works. But I’m annoyed I have to ponder this kind of thing on my own rather than have the book take a step back and wonder about these kinds of questions.

[Update, written a few weeks after the rest of this post: maybe it is all wizardry. I recommended this book to a severely traumatized patient of mine, who had not benefited from years of conventional treatment, and who wanted to know more about their condition. The next week the patient came in, claiming to be completely cured, and displaying behaviors consistent with this. They did not use any of the techniques in this book, but said that reading the book helped them figure out an indescribable mental motion they could take to resolve their trauma, and that after taking this mental motion their problems were gone. I’m not sure what to think of this or how much I should revise the negative opinion of this book which I formed before this event.]

Maybe the most consistent lesson from this book’s tour of successful alternative therapies – keeping with the theme of the title – is that it’s important for PTSD patients to get back in touch with their bodies. Massage therapy, yoga, and acupuncture addressed this directly, usually creating gentle, comfortable sensations that patients could take note of to gradually relax the absolute firewall between bodily sensation and conscious processing. Some of the other methods – the community theater, maybe even the internal family systems – seemed like tricks to get people afraid of emotions back in touch with their emotions anyway: “Oh, you’re not going to be feeling your emotions, just emotions from Macbeth or Hamlet or this other personality living in your mind”. I don’t know how plausible this interpretation is.


Overall, I was not too impressed with this book. The highlight was van der Kolk’s personal reminiscences from the fight to get PTSD recognized as a real disease – but some of them were so over-the-top that I would have liked to triangulate them with a more objective history. The sections with the symptomatology and neurobiology of PTSD were helpful in exploring the boundaries of the syndrome, but didn’t make me feel like I really understood what was going on. The sections on the dangers of child abuse were a good knock-down of some hypothetical “child abuse isn’t really that bad” position, but I don’t know anyone who holds that position, and some of the research seemed questionable. And the section on treatment was so glowing about everything that it was hard to draw any specific conclusions.

Maybe a broader concern is that I seem to inhabit a different world than van der Kolk. All of his patients showed bizarre and florid sequelae from serious trauma. My patients seem to discuss their trauma with comparative equanimity, have only the usual psychiatric symptoms (depression, anxiety, etc), and experience little benefit from the weirder alternative therapies they try. Some of this might be van der Kolk being a better doctor than I am, or having sicker patients. But I’m concerned about this because van der Kolk seems pretty good at doing what he does, and I would like to be able to inhabit his world insofar as he’s able to get good results in it. But insofar as my goal is to become more like Bessel van der Kolk, I was surprised how little this book helped guide me along that journey.

I think my actual takeaway is to screen for trauma more carefully, especially in patients who seem anhedonic or numb, and to recommend they go to a trauma clinic. There are a lot of places like this (I sometimes send patients to this one in Berkeley), and they practice a lot of the weirder alternative therapies that van der Kolk mentions (in fact, van der Kolk seems to work at/lead a very similar type of institution in Massachusetts). Whether or not these work for everybody, I think everybody deserves a chance at them, and I should take them more seriously at least until I get a better sense of the terrain here myself.

Building Intuitions On Non-Empirical Arguments In Science


Aeon: Post-Empirical Science Is An Oxymoron, And It Is Dangerous:

There is no agreed criterion to distinguish science from pseudoscience, or just plain ordinary bullshit, opening the door to all manner of metaphysics masquerading as science. This is ‘post-empirical’ science, where truth no longer matters, and it is potentially very dangerous.

It’s not difficult to find recent examples. On 8 June 2019, the front cover of New Scientist magazine boldly declared that we’re ‘Inside the Mirrorverse’. Its editors bid us ‘Welcome to the parallel reality that’s hiding in plain sight’. […]

[Some physicists] claim that neutrons [are] flitting between parallel universes. They admit that the chances of proving this are ‘low’, or even ‘zero’, but it doesn’t really matter. When it comes to grabbing attention, inviting that all-important click, or purchase, speculative metaphysics wins hands down.

These theories are based on the notion that our Universe is not unique, that there exists a large number of other universes that somehow sit alongside or parallel to our own. For example, in the so-called Many-Worlds interpretation of quantum mechanics, there are universes containing our parallel selves, identical to us but for their different experiences of quantum physics. These theories are attractive to some few theoretical physicists and philosophers, but there is absolutely no empirical evidence for them. And, as it seems we can’t ever experience these other universes, there will never be any evidence for them. As Broussard explained, these theories are sufficiently slippery to duck any kind of challenge that experimentalists might try to throw at them, and there’s always someone happy to keep the idea alive.

Is this really science? The answer depends on what you think society needs from science. In our post-truth age of casual lies, fake news and alternative facts, society is under extraordinary pressure from those pushing potentially dangerous antiscientific propaganda – ranging from climate-change denial to the anti-vaxxer movement to homeopathic medicines. I, for one, prefer a science that is rational and based on evidence, a science that is concerned with theories and empirical facts, a science that promotes the search for truth, no matter how transient or contingent. I prefer a science that does not readily admit theories so vague and slippery that empirical tests are either impossible or they mean absolutely nothing at all.

As always, a single quote doesn’t do the argument justice, so go read the article. But I think this captures the basic argument: multiverse theories are bad, because they’re untestable, and untestable science is pseudoscience.

Many great people, both philosophers of science and practicing scientists, have already discussed the problems with this point of view. But none of them lay out their argument in quite the way that makes the most sense to me. I want to do that here, without claiming any originality or special expertise in the subject, to see if it helps convince anyone else.


Consider a classic example: modern paleontology does a good job at predicting dinosaur fossils. But the creationist explanation – Satan buried fake dinosaur fossils to mislead us – also predicts the same fossils (we assume Satan is good at disguising his existence, so that the lack of other strong evidence for Satan doesn’t contradict the theory). What principles help us realize that the Satan hypothesis is obviously stupid and the usual paleontological one more plausible?

One bad response: paleontology can better predict characteristics of dinosaur fossils, using arguments like “since plesiosaurs are aquatic, they will be found in areas that were underwater during the Mesozoic, but since tyrannosaurs are terrestrial, they will be found in areas that were on land”, and this makes it better than the Satan hypothesis, which can only retrodict these characteristics. But this isn’t quite true: since Satan is trying to fool us into believing the modern paleontology paradigm, he’ll hide the fossils in ways that conform to its predictions, so we will predict plesiosaur fossils will only be found at sea – otherwise the jig would be up!

A second bad response: “The hypothesis that all our findings were planted to deceive us bleeds into conspiracy theories and touches on the problem of skepticism. These things are inherently outside the realm of science.” But archaeological findings are very often deliberate hoaxes planted to deceive archaeologists, and in practice archaeologists consider and test that hypothesis the same way they consider and test every other hypothesis. Rule this out by fiat and we have to accept Piltdown Man, or at least claim that the people arguing against the veracity of Piltdown Man were doing something other than Science.

A third bad response: “Satan is supernatural and science is not allowed to consider supernatural explanations.” Fine then, replace Satan with an alien. I think this is a stupid distinction – if demons really did interfere in earthly affairs, then we could investigate their actions using the same methods we use to investigate every other process. But this would take a long time to argue well, so for now let’s just stick with the alien.

A fourth bad response: “There is no empirical test that distinguishes the Satan hypothesis from the paleontology hypothesis, therefore the Satan hypothesis is inherently unfalsifiable and therefore pseudoscientific.” But this can’t be right. After all, there’s no empirical test that distinguishes the paleontology hypothesis from the Satan hypothesis! If we call one of them pseudoscience based on their inseparability, we have to call the other one pseudoscience too!

A naive Popperian (which maybe nobody really is) would have to stop here, and say that we predict dinosaur fossils will have such-and-such characteristics, but that questions like what process drives this pattern – a long-dead ecosystem of actual dinosaurs, or the Devil planting dinosaur bones to deceive us – are mystical questions beyond the ability of Science to even conceivably solve.

I think the correct response is to say that both theories explain the data, and one cannot empirically test which theory is true, but the paleontology theory is more elegant (I am tempted to say “simpler”, but that might imply I have a rigorous mathematical definition of the form of simplicity involved, which I don’t). It requires fewer other weird things to be true. It involves fewer other hidden variables. It transforms our worldview less. It gets a cleaner shave with Occam’s Razor. This elegance is so important to us that it explains our vast preference for the first theory over the second.

A long tradition of philosophers of science have already written eloquently about this, summed up by Sean Carroll here:

What makes an explanation “the best”? Thomas Kuhn, after his influential book The Structure of Scientific Revolutions led many people to think of him as a relativist when it came to scientific claims, attempted to correct this misimpression by offering a list of criteria that scientists use in practice to judge one theory better than another one: accuracy, consistency, broad scope, simplicity, and fruitfulness. “Accuracy” (fitting the data) is one of these criteria, but by no means the sole one. Any working scientist can think of cases where each of these concepts has been invoked in favor of one theory or another. But there is no unambiguous algorithm according to which we can feed in these criteria, a list of theories, and a set of data, and expect the best theory to pop out. The way in which we judge scientific theories is inescapably reflective, messy, and human. That’s the reality of how science is actually done; it’s a matter of judgment, not of drawing bright lines between truth and falsity or science and non-science. Fortunately, in typical cases the accumulation of evidence eventually leaves only one viable theory in the eyes of most reasonable observers.

The dinosaur hypothesis and the Satan hypothesis both fit the data, but the dinosaur hypothesis wins hands-down on simplicity. As Carroll predicts, most reasonable observers are able to converge on the same solution here, despite the philosophical complexity.


I’m starting with this extreme case because its very extremity makes it easier to see the mechanism in action. But I think the same process applies to other cases that people really worry about.

Consider the riddle of the Sphinx. There’s pretty good archaeological evidence supporting the consensus position that it was built by Pharaoh Khafre. But there are a few holes in that story, and a few scattered artifacts suggest it was actually built by Pharaoh Khufu; a respectable minority of archaeologists believe this. And there are a few anomalies which, if taken wildly out of context, you can use to tell a story that it was built long before Egypt existed at all, maybe by Atlantis or aliens.

So there are three competing hypotheses. All of them are consistent with current evidence (even the Atlantis one, which was written after the current evidence was found and carefully adds enough epicycles not to blatantly contradict it). Perhaps one day evidence will come to light that supports one above the others; maybe in some unexcavated tomb, a hieroglyphic tablet says “I created the Sphinx, sincerely yours, Pharaoh Khufu”. But maybe this won’t happen. Maybe we already have all the Sphinx-related evidence we’re going to get. Maybe the information necessary to distinguish among these hypotheses has been utterly lost beyond any conceivable ability to reconstruct.

I don’t want to say “No hypothesis can be tested any further, so Science is useless to us here”, because then we’re forced to conclude stupid things like “Science has no opinion on whether the Sphinx was built by Khafre or Atlanteans,” whereas I think most scientists would actually have very strong opinions on that.

But what about the question of whether the Sphinx was built by Khafre or Khufu? This is a real open question with respectable archaeologists on both sides; what can we do about it?

I think the answer would have to be: the same thing we did with the Satan vs. paleontology question, only now it’s a lot harder. We try to figure out which theory requires fewer other weird things to be true, fewer hidden variables, less transformation of our worldview – which theory works better with Occam’s Razor. This is relatively easy in the Atlantis case, and hard but potentially possible in the Khafre vs. Khufu case.

(Bayesians can rephrase this to: given that we have a certain amount of evidence for each, can we quantify exactly how much evidence, and what our priors for each should be. It would end not with a decisive victory of one or the other, but with a probability distribution, maybe 80% chance it was Khafre, 20% chance it was Khufu)

I think this is a totally legitimate thing for Egyptologists to do, even if it never results in a particular testable claim that gets tested. If you don’t think it’s a legitimate thing for Egyptologists to do, I have trouble figuring out how you can justify Egyptologists rejecting the Atlantis theory.

(Again, Bayesians would start with a very low prior for Atlantis, and assess the evidence as very low, and end up with a probability distribution something like Khafre 80%, Khufu 19.999999%, Atlantis 0.000001%)
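The Bayesian bookkeeping described above can be sketched in a few lines. All the priors and likelihoods below are invented purely for illustration (the post gives no real numbers); only the shape of the calculation – multiply prior by likelihood, then normalize – is the point.

```python
# Toy Bayesian comparison of the three Sphinx hypotheses.
# Every number here is made up for illustration; only the mechanics matter.

priors = {"Khafre": 0.6, "Khufu": 0.4, "Atlantis": 1e-7}

# Roughly: how likely is the surviving evidence under each hypothesis?
likelihoods = {"Khafre": 0.5, "Khufu": 0.2, "Atlantis": 0.05}

# Bayes' rule: posterior is proportional to prior times likelihood.
unnormalized = {h: priors[h] * likelihoods[h] for h in priors}
total = sum(unnormalized.values())
posterior = {h: p / total for h, p in unnormalized.items()}

for h, p in posterior.items():
    print(f"{h}: {p:.8f}")
```

With these made-up inputs the posterior comes out in the same ballpark as the distribution in the text: Khafre around 80%, Khufu around 20%, Atlantis vanishingly small.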


How does this relate to things like multiverse theory? Before we get there, one more hokey example:

Suppose scientists measure the mass of one particle at 32.604 units, the mass of another related particle at 204.897 units, and the mass of a third related particle at 145173.870 units. For a while, this is just how things are – it seems to be an irreducible brute fact about the universe. Then some theorist notices that if you set the mass of the first particle as x, then the second is 2πx and the third is (4/3)πx^3. They theorize that perhaps the quantum field forms some sort of extradimensional sphere: the first particle represents the radius of a great circle of the sphere, the second the circumference of the great circle, and the third the volume of the sphere.

(please excuse the stupidity of my example, I don’t know enough about physics to come up with something that isn’t stupid, but I hope it will illustrate my point)

In fact, imagine that there are a hundred different particles, all with different masses, and all one hundred have masses that perfectly correspond to various mathematical properties of spheres.
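The arithmetic in this hokey example can be checked directly. Treating the first mass as the radius x, the other two masses should come out as 2πx and (4/3)πx³; the computed values land within a small fraction of a percent of the figures quoted above, which appear to be loosely rounded rather than exact.

```python
import math

x = 32.604  # first particle's mass, read as the radius of a great circle

circumference = 2 * math.pi * x        # predicted mass of the second particle
volume = (4 / 3) * math.pi * x ** 3    # predicted mass of the third particle

print(f"2*pi*x       = {circumference:.3f}   (quoted: 204.897)")
print(f"(4/3)*pi*x^3 = {volume:.3f}   (quoted: 145173.870)")
```

The mismatch between the computed and quoted values is well under 0.1%, which is exactly the kind of "close enough to be obviously non-coincidental" fit the thought experiment is trading on.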

Is the person who made this discovery doing Science? And should we consider their theory a useful contribution to physics?

I think the answer is clearly yes. But consider what this commits us to. Suppose the scientist came up with their Extradimensional Sphere hypothesis after learning the masses of the relevant particles, and so it has not predicted anything. Suppose the extradimensional sphere is outside normal space, curled up into some dimension we can’t possibly access or test without a particle accelerator the size of the moon. Suppose there are no undiscovered particles in this set that can be tested to see if they also reflect sphere-related parameters. This theory is exactly the kind of postempirical, metaphysical construct that the Aeon article savages.

But it’s really compelling. We have a hundred different particles, and this theory retrodicts the properties of each of them perfectly. And it’s so simple – just say the word “sphere” and the rest falls out naturally! You would have to be crazy not to think it was at least pretty plausible, or that the scientist who developed it had done some good work.

Nor do I think it seems right to say “The discovery that all of our unexplained variables perfectly match the parameters of a sphere is good, but the hypothesis that there really is a sphere is outside the bounds of Science.” That sounds too much like saying “It’s fine to say dinosaur bones have such-and-such characteristics, but we must never speculate about what kind of process produced them, or whether it involved actual dinosaurs”.


My understanding of the multiverse debate is that it works the same way. Scientists observe the behavior of particles, and find that a multiverse explains that behavior more simply and elegantly than not-a-multiverse.

One (doubtless exaggerated) way I’ve heard multiverse proponents explain their position is like this: in certain situations the math declares two contradictory answers – in the classic example, Schrödinger’s cat will be both alive and dead. But when we open the box, we see only a dead cat or an alive cat, not both. Multiverse opponents say “Some unknown force steps in at the last second and destroys one of the possibility branches”. Multiverse proponents say “No it doesn’t, both possibility branches happen exactly the way the math says, and we end up in one of them.”

Taking this exaggerated dumbed-down account as exactly right, this sounds about as hard as the dinosaurs-vs-Satan example, in terms of figuring out which is more Occam’s Razor compliant. I’m sure the reality is more nuanced, but I think it can be judged by the same process. Perhaps this is the kind of reasoning that only gets us to a 90% probability there is a multiverse, rather than a 99.999999% one. But I think determining that theories have 90% probability is a reasonable scientific thing to do.


At times, the Aeon article seems to flirt with admitting that something like this is necessary:

Such problems were judged by philosophers of science to be insurmountable, and Popper’s falsifiability criterion was abandoned (though, curiously, it still lives on in the minds of many practising scientists). But rather than seek an alternative, in 1983 the philosopher Larry Laudan declared that the demarcation problem is actually intractable, and must therefore be a pseudo-problem. He argued that the real distinction is between knowledge that is reliable or unreliable, irrespective of its provenance, and claimed that terms such as ‘pseudoscience’ and ‘unscientific’ have no real meaning.

But it always jumps back from the precipice:

So, if we can’t make use of falsifiability, what do we use instead? I don’t think we have any real alternative but to adopt what I might call the empirical criterion. Demarcation is not some kind of binary yes-or-no, right-or-wrong, black-or-white judgment. We have to admit shades of grey. Popper himself was ready to accept this, [saying]:

“The criterion of demarcation cannot be an absolutely sharp one but will itself have degrees. There will be well-testable theories, hardly testable theories, and non-testable theories. Those which are non-testable are of no interest to empirical scientists. They may be described as metaphysical.”

Here, ‘testability’ implies only that a theory either makes contact, or holds some promise of making contact, with empirical evidence. It makes no presumptions about what we might do in light of the evidence. If the evidence verifies the theory, that’s great – we celebrate and start looking for another test. If the evidence fails to support the theory, then we might ponder for a while or tinker with the auxiliary assumptions. Either way, there’s a tension between the metaphysical content of the theory and the empirical data – a tension between the ideas and the facts – which prevents the metaphysics from getting completely out of hand. In this way, the metaphysics is tamed or ‘naturalised’, and we have something to work with. This is science.

But as we’ve seen, many things we really want to include as science are not testable: our credence for real dinosaurs over Satan planting fossils, our credence for Khafre building the Sphinx over Khufu or Atlanteans, or elegant patterns that explain the features of the universe like the Extradimensional-Sphere Theory.

The Aeon article is aware of Carroll’s work – which, along with the paragraph quoted in Section II above, includes a lot of detailed Bayesian reasoning encompassing everything I’ve discussed. But the article dismisses it in a few sentences:

Sean Carroll, a vocal advocate for the Many-Worlds interpretation, prefers abduction, or what he calls ‘inference to the best explanation’, which leaves us with theories that are merely ‘parsimonious’, a matter of judgment, and ‘still might reasonably be true’. But whose judgment? In the absence of facts, what constitutes ‘the best explanation’?

Carroll seeks to dress his notion of inference in the cloth of respectability provided by something called Bayesian probability theory, happily overlooking its entirely subjective nature. It’s a short step from here to the theorist-turned-philosopher Richard Dawid’s efforts to justify the string theory programme in terms of ‘theoretically confirmed theory’ and ‘non-empirical theory assessment’. The ‘best explanation’ is then based on a choice between purely metaphysical constructs, without reference to empirical evidence, based on the application of a probability theory that can be readily engineered to suit personal prejudices.

“A choice between purely metaphysical constructs, without reference to empirical evidence” sounds pretty bad, until you realize he’s talking about the same reasoning we use to determine that real dinosaurs are more likely than Satan planting fossils.

I don’t want to go over the exact ways in which Bayesian methods are subjective (which I think are overestimated) vs. objective. I think it’s more fruitful to point out that your brain is already using Bayesian methods to interpret the photons striking your eyes into this sentence, to make snap decisions about what sense the words are used in, and to integrate them into your model of the world. If Bayesian methods are good enough to give you every single piece of evidence about the nature of the external world that you have ever encountered in your entire life, I say they’re good enough for science.
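Since the argument leans on Bayesian reasoning throughout, here is a minimal sketch of the dinosaurs-vs-Satan update in Python. Every number is invented purely for illustration; the point is only the mechanics of multiplying priors by likelihoods and renormalizing:

```python
# Toy Bayesian update over two rival hypotheses about fossils.
# All priors and likelihoods are made up for illustration only.

prior = {"real dinosaurs": 0.5, "Satan planted them": 0.5}

# Probability of each piece of evidence under each hypothesis
# (invented numbers): consistent strata, anatomy, isotope dates.
likelihood = {
    "real dinosaurs": [0.9, 0.8, 0.95],
    "Satan planted them": [0.5, 0.5, 0.5],  # a trickster predicts nothing specific
}

posterior = dict(prior)
for i in range(3):
    # Multiply in the likelihood of observation i, then renormalize.
    for h in posterior:
        posterior[h] *= likelihood[h][i]
    total = sum(posterior.values())
    for h in posterior:
        posterior[h] /= total

for h, p in posterior.items():
    print(f"{h}: {p:.3f}")  # real dinosaurs ≈ 0.845
```

Even starting from even odds, a hypothesis that actually predicts the evidence pulls ahead of one compatible with anything – which is the formal version of the Occam’s Razor intuition used above.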

Or if you don’t like that, you can use the explanation above, which barely uses the word “Bayes” at all and just describes everything in terms like “Occam’s Razor” and “you wouldn’t want to conclude something like that, would you?”

I know there are separate debates about whether this kind of reasoning-from-simplicity is actually good enough, when used by ordinary people, to consistently arrive at truth. Or whether it’s a productive way to conduct science that will give us good new theories, or a waste of everybody’s time. I sympathize with some of these concerns, though I am nowhere near scientifically educated enough to have an actual opinion on the questions at play.

But I think it’s important to argue that even before you describe the advantages and disadvantages of the complicated Bayesian math that lets you do this, something like this has to be done. The untestable is a fundamental part of science, impossible to remove. We can debate how to explain it. But denying it isn’t an option.