Genetic Testing and Self-Fulfilling Prophecies

Lineweaver et al. tested 144 elderly adults for the ApoE4 gene, which is known to be a major risk factor for Alzheimer’s. They told half of them their test results, kept the results secret from the other half, then waited. Eight months later, they asked people how they thought their memory was doing and gave everyone objective memory tests.

No one in the study population had Alzheimer’s yet, so everyone did okay on the memory test. But subjects who knew they had ApoE4 did significantly worse than subjects who did have ApoE4 but didn’t know it. The subjects who knew they didn’t have ApoE4 didn’t do any better on the memory test than other subjects who didn’t have ApoE4 but had not yet heard the good news, but they did give subjectively better ratings of their memory ability.

The medical community concludes from this that letting people know their genetic risks may be dangerous, and although I hate to admit it, they have a point.

Unfortunately, the study doesn’t give the methodological details I need to really understand the implications.

We know that the researchers waited eight months between giving the genetic test results and doing the memory tests. That’s good.

But was it the same researcher doing both the genes and the memory parts of the study? Was it in the same building? Did they start the memory tests by saying “Hi! I’m Dr. Lineweaver! You may remember me from such medical experiments as the one eight months ago in which you were told that you had a high risk of getting Alzheimer’s disease”?

Or did they sneakily pretend to be a separate study entirely and try to avoid mentioning the A-word throughout?

It would not surprise me if – having been primed with a reminder of their Alzheimer testing results – the subjects then performed worse on a memory test that was given immediately after in an obviously related context.

It would be much more surprising – though still not totally unbelievable – if subjects, having been told they had a high risk of Alzheimer’s, just went around for eight months having slightly worse memory, which was reflected in everything they did, including the memory test administered by the researchers.

New England Journal of Medicine compares the finding to “stereotype threat”, the phenomenon in which people can for example sometimes make women perform worse on math tests simply by telling them that it is a “test of their innate mathematical abilities” – something that women are stereotypically bad at.

The memory tests the researchers were giving are equivalent to the “innate mathematical abilities” condition in the stereotype threat research – a test clearly intended to measure how good their memory was in a very scientific way. The activities of daily living that require memory – keeping appointments, paying bills on time, et cetera – are the equivalent of the condition in stereotype threat experiments where researchers just give women a normal math test without introduction and stereotype threat is not seen.

So I see two ways in which we could get results like the ones in this study without any broader implications of ApoE4 testing harming the elderly in general.

First, being called to the same study in which the ApoE4 results were given could have primed their worries about Alzheimer’s and made them do especially badly on the study’s memory test compared to their usual memory.

Second, the study’s memory test could have been official-looking enough that it activated their stereotype of themselves as having innately poor memory, when, concordant with stereotype threat research, that stereotype doesn’t harm their everyday memory-requiring activities.

In either of these cases, the study would have some very limited implications, which the authors describe in an appropriately circumscribed way: “The patient’s knowledge of his or her genotype and risk of Alzheimer’s disease should be considered when evaluating cognition in the elderly.”

But this would not imply that genetically testing elderly people for ApoE4 is risky and can itself cause them to develop forgetfulness and other Alzheimer’s symptoms.

I worry that the medical community is going to miss this subtlety and start “raising awareness” of the possibility that genetic testing can cause harmful side effects, before finishing the hard task of discovering if that’s actually true.

31 Responses to Genetic Testing and Self-Fulfilling Prophecies

  1. Jason says:

    From your description, they never showed that genetics had anything to do with the results — they just tested whether hearing a medical prediction affects people. So we might just as well resolve to stop telling people that smoking causes cancer.

    If they wanted to know whether genetic predictions are bad for people to hear, the control should have been a fake test for some non-genetic predictor of memory. (“Your hippocratic enzyme count shows you’re at an increased risk for Alzheimer’s…”)

    They could also have lied to some of the participants about their test results, separating out whether the genes had anything to do with it, or merely the belief that they possess a risk factor of any sort.

    I predict that the fake predictor would have just as strong an effect as the real one; it’s placebos all the way down. So we can decide to stop telling patients anything about their future health (ban do-it-yourself blood pressure tests!), or not. But this gives us no reason to single out genetic testing for censorship.

    • Creutzer says:

      It’s conceivable that knowledge of genetics would cause stronger placebo (or rather, nocebo) effects because people perceive it as unchangeable. But of course you’re right that there is no reason to single out genetic testing before this has been checked.

  2. Charlie says:

    What kind of sweet name for a genetically-inherited-disease researcher is ‘Lineweaver,’ anyhow?

  3. Multiheaded says:

    Woah, dude! (I’m not intelligent enough to comment better on this.)

  4. Nestor says:

    I recall idly wondering sometime last year if the nocebo concept could be used in the context of the US’s notoriously litigation happy environment to stick it to insurance companies. There must be a ton of outcomes that are made worse by telling people they owe xxx.xxxx$ for the procedure… cue class action suit.

    • Multiheaded says:

      It’d also make for a typical Alternet headline:

      Why Capitalism is the Hidden Deadly Complication to Every Disease in America

    • Randy M says:

      Then again, maybe seeing a high bill would convince the patient the surgery was more efficacious?

  5. Q says:

    My subjective view is that I myself would be unnecessarily worried if I knew I had a genetic predisposition to Alzheimer’s. The self-fulfilling prophecy would work on me. I would be interested in some version of genetic testing where only highly preventable things would be reported. For instance, I would like to know about single-allele mutations with high penetrance, which my children could inherit, so that I could check my partner for them afterwards. Maybe I would even be OK knowing about a BRCA mutation, and possibly decide on a double mastectomy. But I am not sure what good the knowledge of an Alzheimer’s predisposition would be for me. I mean, are there any guidelines on what action to take once you have the mutation? Do they make much difference?

    • Kibber says:

      Just a data point: for this precise reason, 23andme was “locking” such information by default and required some deliberate action on the user’s part to unlock it. Those who don’t want to know don’t have to know.

  6. Chris says:

    New England Journal of Medicine compares the finding to “stereotype threat”, the phenomenon in which people can for example sometimes make women perform worse on math tests simply by telling them that it is a “test of their innate mathematical abilities” – something that women are stereotypically bad at.

    I can’t see the paywalled NEJM article, but this is disappointing behavior from NEJM. The studies that claimed to measure the existence of stereotype threat, and this study in particular, have been failing to reproduce for a long time now, such that the entire theory is significantly discredited — at this point, I think there is more evidence for the existence of the *opposite* of stereotype threat than the original effect itself. It’s a canonical example of the reproducibility crisis, as far as I can see.

    In summary: everything is broken and the NEJM is broken because they report about broken things without knowing that they’re broken, and this APOE4 study is also likely broken, and science doesn’t work and we should probably all just stick to engineering instead.

    • Scott Alexander says:

      I spent a while looking into stereotype threat a couple months ago, and the impression I got was that while there were a lot of studies that couldn’t find it, there were even more that could (I very carefully included the word “sometimes” in my description of it above).

      This is a good site to start with if you want to sift through the data; it does seem like it was set up by believers, but it does a good job presenting evidence from both sides (unless of course there is even more evidence that they don’t present, in which case I kind of give up).

      Insofar as I could find any pattern in the results, stereotype threat seems to happen quite often in toy settings like a laboratory where people are told they are going to get “special innate intelligence tests” and very rarely in the real world with tests that matter. Those toy settings seem very similar to this study and its clinical memory tests, which do not affect the taker in any way but are billed as measuring innate cognitive attributes. So I am more likely to accept stereotype threat as an explanation here than I would for certain other things.

      I have “write a review of the stereotype threat literature” on my list of things to do the next time I am so desperate for attention that I want people from both sides of the social justice wars yelling at me. In the meantime I am somewhat skeptical of existing review articles as they tend to be written by very biased people.

      • Douglas Knight says:

        Yes, there’s lots of data that they don’t present, at least on their “criticisms” page. The complaints they list are all about the real-world validity of the construct, and not about whether the experiments are reproducible. That matches the first three paragraphs of the criticism page Chris cited, but wikipedia goes on to make Chris’s complaints with citations.

        Also, there are rumors of huge file drawers of unpublishable non-replications.

        • Scott Alexander says:

          I don’t want rumors, I want funnel plots. The only one I know of says it’s symmetrical.

        • Douglas Knight says:

          If you don’t want rumors, ignore my last paragraph. But that’s no excuse for ignoring my first paragraph, which is to say, ignoring Chris’s comment.

        • Douglas Knight says:

          That’s an odd way to phrase it. Do you want funnel plots, or do you want people to tell you what to think about funnel plots? Sure, the authors tell you it’s symmetric, but I tell you it’s asymmetric.

        • Douglas Knight says:

          Oops…the authors link was supposed to go to the paper you cited, not to the paper wikipedia cites.

          And if you don’t like rumors of unpublished papers, why do you link to rumors of privately circulated manuscripts?

        • Scott Alexander says:

          I’m not ignoring your first paragraph. I’m familiar with two or three big studies/meta-analyses showing no (or minor) stereotype threat – Stoet and Geary, and those two on Gelman’s blog. I’m also familiar with several dozen studies and analyses that do show evidence of stereotype threat. This is about where I started.

          The authors say “relatively symmetrical”, which fits my naked-eye estimate, and say that higher sample size studies are more likely to get a positive result, which is usually a good sign.

        • Douglas Knight says:

          What is your definition of “minor” effect size?

        • Scott Alexander says:

          I guess it would be .2 or .3. Why?

        • Douglas Knight says:

          Because the funnel plot you cite sure looks like the studies with large sample size have effect sizes less than 0.2. Come on, look at it.

        • Scott Alexander says:

          Okay, it’s been a while since I’ve had to read a funnel plot, so let me see if I understand what you’re saying.

          Because the top of the funnel is close to zero, that means the most likely effect size value for stereotype threat is close to zero. Insofar as the funnel is slightly asymmetrical, skewed in the negative direction, that means there is probably some publication bias pushing results to be more negative. Insofar as the funnel isn’t too asymmetrical, that means there’s probably not too much publication bias. Is that right?

        • Douglas Knight says:

          Yes, looking at the top of the funnel plot amounts to looking only at large studies. Yes, insofar as those counterfactuals hold, that is how you should interpret them. But it is hugely skewed. The effect size for studies under 100 people is a full standard deviation different from the effect size for studies over 100 people.

        • Douglas Knight says:

          Also, the meta-analysis Chris cited compared published studies to unpublished ones and found massive publication bias. The study you cited is half unpublished studies, but I don’t see them making the comparison, or making a funnel for just published studies. In particular, the second largest study wasn’t published.

  7. St. Rev says:

    Isn’t there stronger evidence for a link between survival threat and cognitive performance? This may not be evidence of ‘remembering that you’re at risk to become stupid’ depressing performance, so much as ‘being reminded that you’re probably going to die in a particularly horrible way’ depressing performance.

    • St. Rev says:

      ETA: In other words, if you ran a similar experiment but with a gene for, say, cancer risk, would you see a different outcome?

      • Douglas Knight says:

        That’s a good hypothesis, but your proposed experiment is largely subsumed by Scott’s proposal to avoid making an acute reminder at all.

  8. I suspect there’s a similar nocebo effect when people attribute memory lapses to aging, but how would you test it?

  9. pwyll says:

    Scott, glad to hear you’ve already started delving into the question of whether stereotype threat exists. Here are a couple of links referring to studies that you’re probably already aware of, but just in case you’re not:
    All of these argue against its existence; unfortunately I’m not familiar enough with the topic to know what the best arguments *in favor* of the existence of stereotype threat are.

    • Scott Alexander says:

      Yes, I’ve seen those posts. Unfortunately, they all rely on word of mouth from people in the field and unpublished papers (the only link is to a presentation at a conference that doesn’t have a transcript). One of the researchers involved apparently submitted his result for publication, but it’s been years since then and no word of it.
