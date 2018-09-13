The collective intellect is change-blind. Knowledge gained seems so natural that we forget what it was like not to have it. Piaget says children gain long-term memory at age 4 and don’t learn abstract thought until ten; do you remember what it was like not to have abstract thought? We underestimate our intellectual progress because every sliver of knowledge acquired gets backpropagated unboundedly into the past.
For decades, people talked about “the gene for height”, “the gene for intelligence”, etc. Was the gene for intelligence on chromosome 6? Was it on the X chromosome? What happens if your baby doesn’t have the gene for intelligence? Can they still succeed?
Meanwhile, the responsible experts were saying traits might be determined by a two-digit number of genes. Human Genome Project leader Francis Collins estimated that there were “about twelve genes” for diabetes, and “all of them will be discovered in the next two years”. Quanta Magazine reminds us of a 1999 study which claimed that “perhaps more than fifteen genes” might contribute to autism. By the early 2000s, the American Psychological Association was a little more cautious, saying intelligence might be linked to “dozens – if not hundreds” of genes.
The most recent estimate for how many genes are involved in complex traits like height or intelligence is approximately “all of them” – by the latest count, about twenty thousand. From this side of the veil, it all seems so obvious. It’s hard to remember back a mere twenty or thirty years ago, when people earnestly awaited “the gene for depression”. It’s hard to remember the studies powered to find genes that increased height by an inch or two. It’s hard to remember all the crappy p-hacked results that okay, we found the gene for extraversion, here it is! It’s hard to remember all the editorials in The Guardian about how since nobody had found the gene for IQ yet, genes don’t matter, science is fake, and Galileo was a witch.
And even remembering those times, they seem incomprehensible. Like, really? Only a few visionaries considered the hypothesis that the most complex and subtle of human traits might depend on more than one protein? Only the boldest revolutionaries dared to ask whether maybe cystic fibrosis was not the best model for the entirety of human experience?
This side of the veil, instead of looking for the “gene for intelligence”, we try to find “polygenic scores”. Given a person’s entire genome, what function best predicts their intelligence? The most recent such effort uses over a thousand genes and is able to predict 10% of variability in educational attainment. This isn’t much, but it’s a heck of a lot better than anyone was able to do under the old “dozen genes” model, and it’s getting better every year in the way healthy paradigms are supposed to.
Genetics is interesting as an example of a science that overcame a diseased paradigm. For years, basically all candidate gene studies were fake. “How come we can’t find genes for anything?” was never as popular as “where’s my flying car?” as a symbol of how science never advances in the way we optimistically feel like it should. But it could have been.
And now it works. What lessons can we draw from this, for domains that still seem disappointing and intractable?
Turn-of-the-millennium behavioral genetics was intractable because it was more polycausal than anyone expected. Everything interesting was an excruciating interaction of a thousand different things. You had to know all those things to predict anything at all, so nobody predicted anything and all apparent predictions were fake.
Modern genetics is healthy and functional because it turns out that although genetics isn’t easy, it is simple. Yes, there are three billion base pairs in the human genome. But each of those base pairs is a nice, clean, discrete unit with one of four values. In a way, saying “everything has three billion possible causes” is a mercy; it’s placing an upper bound on how terrible genetics can be. The “secret” of genetics was that there was no “secret”. You just had to drop the optimistic assumption that there was any shortcut other than measuring all three billion different things, and get busy doing the measuring. The field was maximally perverse, but with enough advances in sequencing and computing, even the maximum possible level of perversity turned out to be within the limits of modern computing.
(this is an oversimplification: if it were really maximally perverse, chaos theory would be involved somehow. Maybe a better claim is that it hits the maximum perversity bound in one specific dimension)
One possible lesson here is that the sciences where progress is hard are the ones that have what seem like an unfair number of tiny interacting causes that determine everything. We should go from trying to discover “the” cause, to trying to find which factors we need to create the best polycausal model. And we should go from seeking a flash of genius that helps sweep away the complexity, to figuring out how to manage complexity that cannot be swept away.
Late-90s/early-00s psychiatry was a lot like late-90s/early-00s genetics. The public was talking about “the cause” of depression: serotonin. And the responsible experts were saying oh no, depression might be caused by as many as several different things.
Now the biopsychosocial model has caught on and everyone agrees that depression is complicated. I don’t know if we’re still at the “dozens of things” stage or the “hundreds of things” stage, but I don’t think anyone seriously thinks it’s fewer than a dozen. The structure of depression seems different from the structure of genetic traits in that one cause can still have a large effect; multiple sclerosis might explain less than 1% of the variance in depressedness, but there will be a small sample of depressives whose condition is almost entirely because of multiple sclerosis. But overall, I think the analogy to genetics is a good one.
If this is true, what can psychiatry (and maybe other low-rate-of-progress sciences) learn from genetics?
One possible lesson is: there are more causes than you think. Stop looking for “a cause” or “the ten causes” and start figuring out ways to deal with very numerous causes.
There are a bunch of studies that are basically like this one linking depression to zinc deficiency. They are good as far as they go, but it’s hard to really know what to do with them. It’s like finding one gene for intelligence. Okay, that explains 0.1% of the variability, now what?
We might imagine trying to combine all these findings into a polycausal score. Take millions of people, measure a hundred different variables – everything from their blood zinc levels, to the serotonin metabolites in their spinal fluid, to whether their mother loved them as a child – then do statistics on them and see how much of the variance in depression we can predict based on the inputs. “Do statistics on them” is a heck of a black box; genes are kind of pristine and causally unidirectional, but all of these psychological factors probably influence each other in a hundred different ways. In practice I think this would end up as a horribly expensive boondoggle that didn’t work at all. But in theory I think this is what a principled attempt to understand depression would look like.
(“understand depression” might be the wrong term here; it conflates being able to predict a construct with knowing what real-world phenomenon the construct refers to. We are much better at finding genes for intelligence than at understanding exactly what intelligence is, and whether it’s just a convenient statistical construct or a specific brain parameter. By analogy, we can imagine a Martian anthropologist who correctly groups “having a big house”, “driving a sports car”, and “wearing designer clothes” into a construct called “wealth”, and is able to accurately predict wealth from a model including variables like occupation, ethnicity, and educational attainment – but who doesn’t understand that wealth = having lots of money. I think it’s still unclear to what degree intelligence and depression have a simple real-world wealth-equals-lots-of-money style correspondence – though see here and here.)
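As a rough illustration of the polycausal score described above, here is a minimal Python sketch on simulated data. Everything in it is an assumption made for illustration: the 50 causes, their weights, the noise level, and especially the idea that the causes are independent and additive, which the post itself doubts for psychological factors. It borrows the polygenic-score recipe of estimating each variable's effect separately and summing the weighted inputs, then compares the combined score with the single strongest cause.

```python
import random


def mean(values):
    return sum(values) / len(values)


def slope(xs, ys):
    """Least-squares slope of ys on xs, fitted one predictor at a time."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


def r_squared(xs, ys):
    """Squared Pearson correlation: share of variance in ys predicted by xs."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy * sxy / (sxx * syy)


def simulate(n_people, n_causes, noise_sd, rng):
    """Hypothetical world: outcome = weighted sum of independent causes + noise."""
    true_weights = [rng.gauss(0, 1) for _ in range(n_causes)]
    people = [[rng.gauss(0, 1) for _ in range(n_causes)] for _ in range(n_people)]
    outcome = [
        sum(w * x for w, x in zip(true_weights, row)) + rng.gauss(0, noise_sd)
        for row in people
    ]
    return people, outcome


def fit_weights(people, outcome):
    """Estimate one weight per cause, each from its own univariate regression."""
    n_causes = len(people[0])
    return [slope([row[j] for row in people], outcome) for j in range(n_causes)]


def polycausal_score(weights, row):
    """The score itself: a weighted sum of one person's measured variables."""
    return sum(w * x for w, x in zip(weights, row))


rng = random.Random(0)
people, outcome = simulate(n_people=4000, n_causes=50, noise_sd=7.0, rng=rng)

# Fit on one half, evaluate on the other, so the score is not graded on
# the same people it was tuned on.
train_x, test_x = people[:2000], people[2000:]
train_y, test_y = outcome[:2000], outcome[2000:]

weights = fit_weights(train_x, train_y)
full_r2 = r_squared([polycausal_score(weights, row) for row in test_x], test_y)

# Compare against the single cause with the largest estimated weight.
best = max(range(len(weights)), key=lambda j: abs(weights[j]))
single_r2 = r_squared([row[best] for row in test_x], test_y)

print(f"all 50 causes combined: R^2 = {full_r2:.2f}")
print(f"strongest single cause: R^2 = {single_r2:.2f}")
```

In this toy setup the combined score predicts far more of the variance than any one input. The hard part in real life is exactly what the simulation assumes away: causes that influence each other, and the cost of measuring all of them in enough people.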
A more useful lesson might be skepticism about personalized medicine. Personalized medicine – the idea that I can read your genome and your blood test results and whatever and tell you what antidepressant (or supplement, or form of therapy) is right for you – has been a big idea over the past decade. And so far it’s mostly failed. A massively polycausal model would explain why. The average personalized medicine company gives you recommendations based on at most a few things – zinc levels, gut flora balance, etc. If there are dozens or hundreds of things, then you need the full massively polycausal model – which as mentioned before is computationally intractable at least without a lot more work.
(you can still have some personalized medicine. We don’t have to know the causes of depression to treat it. You might be depressed because your grandfather died, but Prozac can still make you feel better. So it’s possible that there’s a simple personalized monocausal way to check who eg responds better to Prozac vs. Lexapro, though the latest evidence isn’t really bullish about this. But this seems different from a true personalized medicine where we determine the root cause of your depression and fix it in a principled way.)
Even if we can’t get much out of this, I think it can be helpful just to ask which factors and sciences are oligocausal vs. massively polycausal. For example, what percent of variability in firm success are economists able to determine? Does most of the variability come from a few big things, like talented CEOs? Or does most of it come from a million tiny unmeasurable causes, like “how often does Lisa in Marketing get her reports in on time”?
Maybe this is really stupid – I’m neither a geneticist nor a statistician – but I imagine an alien society where science is centered around polycausal scores. Instead of publishing a paper claiming that lead causes crime, they publish a paper giving the latest polycausal score for predicting crime, and demonstrating that they can make it much more accurate by including lead as a variable. I don’t think you can do this in real life – you would need bigger Big Data than anybody wants to deal with. But like falsifiability and compressibility, I think it’s a useful thought experiment to keep in mind when imagining what science should be like.
I mostly endorse this.
—
What you call “polycausal models” is what we call “causal models,” and is mostly what we work on, in practice.
—
Looking for a single “gene for intelligence” is extra funny to me, because it’s a multi-cause, multi-effect problem. It’s looking for something that can’t exist, sort of by definition.
—
“Personalized medicine” is defined in different ways by different people. I can suggest some reading; not all work in this space uniformly deserves skepticism.
Sorry to be a bit off the main topic, but Piaget is very obviously wrong; I have many long-term memories from before 4 (as early as 2), and definitely had abstract thoughts before 10; I very distinctly remember thinking about things like the nature of space and time, among many other things. I’d like to see how they came to that conclusion, because whatever they based it on is obviously bogus. I guess it’s possible that I’m some kind of extreme rarity in this regard, but I seriously doubt that.
I have no intention to gaslight you, but I wonder how you’re so sure those memories are real. For years I thought I had scattered memories from my far youth, but recent research has made me question. Studies on false memories seem to suggest they’re easy to make, just by attempting to remember something you can’t. And I myself know of times I’ve gone years mistaking a dream for a memory.
Some of them were independently confirmed by other people, like I mentioned to my mom that I liked to watch minnows in a river near where I lived as a small child, and she said she remembers me doing so as well.
Others I remember my internal thought process, which was unusual enough to be very unlikely to be post-hoc, like I remember my dad telling me we were going to move because our current house was “getting too small for us”, and I thought that he meant that he was still growing and soon wouldn’t be able to fit under the ceiling.
“Some of them were independently confirmed by other people, like I mentioned to my mom that I liked to watch minnows in a river near where I lived as a small child, and she said she remembers me doing so as well.”
That’s not independent confirmation, that’s just the same mechanism again.
But I also remember some stuff from before my fourth birthday and I guess many people do. I’m pretty sure Piaget wasn’t super categorical about it.
I remember our neighbour visiting in our old apartment (we moved when I was four). He had to stoop to get through the door. I remember him being absurdly large, brushing the ceiling with his head.
It must have been your dad.
That reminds me of a YA book by Diana Wynne Jones, “Fire and Hemlock”, which features a “giant” and a lot of magical stuff, but for a long time it is unclear whether this is just a little kid’s imagination interpreting more ordinary events or really something magical.
There was a giant in Fire and Hemlock?
“There was a giant in Fire and Hemlock?”
It didn’t play a big role. It made a racket in a supermarket or something? It’s been 20 years …
Oh right. I actually recently reread F&H (after a gap of 20+ years), but I forgot that part.
I personally have a lot of problems remembering things before 7 or 8. Even stuff around that age is unusually shrouded in mist… while I can remember with a precision that unnerves friends conversations or stories they shared 10-15 years ago (I’m mid 40s).
It could be Piaget just worked out some baseline for “most commonly in people”…
I also have scattered clear memories from age 3, some of which have been independently confirmed.
However, while I also used to think about abstract ideas before age 10, I seem to recall my thoughts had a different nature back then – I was to a great extent parroting what I had most recently read, and I don’t believe I ever got close to synthesizing different positions until I turned 12 or so. Yes, that could have other explanations, but given Piaget et al I’m strongly inclined to attribute that to my age.
When I was a teenager, I was of the opinion that I had basically gained consciousness at the age of 12. Before that I was more like a little animal, never thinking any deep thoughts, just running around playing all the time.
Most likely. I’m not up to date on developmental psych, but as a general rule assume everything Piaget gives age ranges for happens earlier, in many cases much earlier. Combined with the possibility that any particular person may be an outlier on the normal distribution (i.e. if the average child develops long-term memory at 2.5 years, some will then start to do so at 2), there is no reason to doubt your recollection.
My 2-year-old certainly has long-term memory, if that means she can remember things that happened yesterday or last month.
I have two memories of things that happened before I was 2, but one of them may be a false memory, because there is a picture of the event I remember in a photo collage in my parents’ house. Or maybe it’s in-between – seeing the photo reinforced the memory and helped me not forget? My older sister was going to visit our grandparents. I wanted to go too, and I thought that if I got in the car and refused to get out, they would have to take me along. The photo shows me sitting in the car.
The other memory is of the time I was in the hospital because of epiglottitis. (Obviously I didn’t know the name at the time, but I remember having a tube in my throat.)
It’s definitely possible that a lot of the commenters here were precocious children.
Anecdata: A good ten years ago I had a chance to visit the apartment we used to live in when I was a toddler (and have never visited afterwards). I could navigate the place just fine, and have a conversation with my mother about the red wallpaper we used to have in one of the rooms.
Possible confounders: the mechanism for remembering locations is probably different from the mechanism for remembering events. Piaget could apply to the latter but not the former.
“Possible” super-confounders: my priors on psychological models of sensory perception or memory research being outright wrong are unusually high, so I might just invent memories of visiting our old apartment to confirm this. I could ask my mother to check if we really did visit the place, but under BlindKungFuMaster’s model she would probably just confabulate something to confirm my experiences. So I guess we’ll never know.
I would point out that what Plomin actually said was
‘reasonable prediction’ != ‘explain all genetic variance’, and of course, if most relevant genes are less than 1% of variance, then at 80% variance (excluding the catalogue of genetic disorders which have effects on intelligence like retardation, which must have been into the hundreds by 1995), the implication is that there are easily hundreds. (I recently scanned a book from around the same time where in the discussion transcript Plomin also implies he expects hundreds of variants but since the link makes the same point, I don’t need to dig it up.) His appreciation of this point, incidentally, is probably one of the reasons he and associated researchers pushed for some very early genetic sequencing of SMPY participants (using high-IQ samples delivers much higher power so their tiny sample wasn’t as useless as you would expect) and successfully debunked some ’90s candidate-gene hits for IQ ( https://www.gwern.net/SMPY#chorney-et-al-1998 ).
And Fisher’s infinitesimal model was both a landmark achievement in genetics and widely used and accepted in many fields, and that model literally entails all genes having an effect of some sort. So the real question shouldn’t be ‘why was polygenicity a surprise’ but why were human medical geneticists, specifically, so surprised and allowed to get away with the claims they did…
It’s a good question. Let’s go back to heritability. What is it? It’s a variance component: the total amount of differences explained by an entire category of effects. One of the things that makes talking about heritability/shared-environment/error unusual is that, well, you hardly ever see anyone talking about the variance components directly in any field other than quantitative genetics and behavioral genetics specifically. It’s part of the mathematical machinery, but the focus is always on direct effects: does this specific value of variable X increase or decrease Y, and how does it compare to Z? You pretty much never see anyone ask ‘how much of variance could variables X-Z explain even in the limit of infinite data?’
One of the only counterexamples I know of is “Morphometricity as a measure of the neuroanatomical signature of a trait” http://www.pnas.org/content/early/2016/09/07/1604378113.long , Sabuncu et al 2016; instead of using relatedness coefficients (we know siblings are ~50% genetically similar etc), it defines similarity by brain measurements and looks at how similar similar brains are to gauge the limits of perfect prediction from those measurements. Possibly relevant is “Phenomic selection: a low-cost and high-throughput alternative to genomic selection”, Rincent et al 2018 https://www.biorxiv.org/content/early/2018/04/16/302117
It would be nice to see more use of this approach to try to quantify bounds on what sets of variables could deliver.
Thanks for saying what I wanted to say but much better and with more evidence.
Given that the assumption of a very large number of additive genetic effects worked well for explaining variation and evolution in a quantitative manner, it’s a mystery to me why any scientist would have thought we’d get many hits for genes explaining most of the variance in some particular trait that all humans have. I’m not really sure how many did and how much of it was the public misunderstanding. For certain crippling inherited diseases, it does make sense to look for mutations after all. Especially if the disease shows signs of being on a single recessive allele from genealogy and basically working out Punnett squares with your data.
Yeah, polygenicity pretty much drops out of the bell curve, so people must have known that’s the way to bet for intelligence.
Given that there’s also a non-heritable component to intelligence, would we really have been able to tell the difference between a bell curve formed by the interaction of 10 genes (+ non-heritable component) vs. 10,000?
If these ten genes all have the same effect size, occur in 50% of the population and are statistically independent? Then probably not.
But these are strong assumptions. It’s not that you can’t get a bell curve with few causal variants, it’s just that the bell curve is derived from a model with many small effect variants, so whenever you see that distribution, that’s probably the underlying model.
Trying to think of a good way to ballpark it but my instinct is you’d want a little more than 10 as a bottom bound. Maybe that’s enough though. Somewhere between 10-100 might have been a reasonable bottom bound on expectations.
But why would anyone expect to find the bottom bound? The upper bound was something like all non-junk DNA contributes.
Genetic diseases carried on a single gene or small set of genes do not show patterns of inheritance like what we see for most traits. Those diseases are often binary traits (on or off) and recessive to boot. That’s totally unlike a continuous trait. Why would anyone have thought rare genetic diseases were a good model for most normal traits?
Selective breeding wouldn’t work as well if we typically only had ~10 genes that could be selected on. We’d find that the variance in the trait we were selecting on shrank really fast.
With the assumptions I made in the above post, 10 variants would lead to 1024 little IQ bins. Add to that some non-hereditary variance that smoothes out the edges and that would be very much a bell curve in smaller samples.
The biggest problem with this would be that the genetically most intelligent person is just 1:1000, something like IQ 145? So in huge sample sizes we’d see a tail that is solely due to non-hereditary factors and that should be a very detectable deviation from the bell curve.
The selective breeding example is a good one. We increased milk production and chicken size and stuff by so many standard deviations, those just have to be massively polygenic.
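The ten-variant arithmetic in this comment can be checked with a short Python sketch. It uses the commenter's assumptions (ten independent variants, each carried by 50% of people, all with the same effect size) plus two of my own for illustration: the usual IQ scaling of mean 100 and SD 15, and a hypothetical `heritability` parameter for the share of variance that is genetic. One caveat: the 1024 carrier combinations collapse into only 11 distinct totals when every variant has the same effect.

```python
from math import comb, sqrt

N_VARIANTS = 10   # commenter's assumption: ten causal variants
P_CARRIER = 0.5   # commenter's assumption: each carried by half the population


def count_distribution(n, p):
    """Probability of carrying exactly k of n independent variants (binomial)."""
    return [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]


def top_genotype_z(n, p):
    """How many genetic SDs above average someone carrying all n variants is."""
    mean_count = n * p
    sd_count = sqrt(n * p * (1 - p))
    return (n - mean_count) / sd_count


def iq_of_top_genotype(n, p, heritability):
    """Expected IQ of the best possible genotype, on a mean-100, SD-15 scale.

    `heritability` is an illustrative parameter: the fraction of total
    variance assumed to come from these variants.
    """
    genetic_sd = 15 * sqrt(heritability)
    return 100 + genetic_sd * top_genotype_z(n, p)


combos = 2**N_VARIANTS
dist = count_distribution(N_VARIANTS, P_CARRIER)

print(f"carrier combinations: {combos}")
print(f"distinct totals with equal effects: {len(dist)}")
print(f"chance of carrying all {N_VARIANTS}: 1 in {round(1 / dist[-1])}")
print(f"top genotype z-score: {top_genotype_z(N_VARIANTS, P_CARRIER):.2f}")
for h2 in (1.0, 0.8):
    iq = iq_of_top_genotype(N_VARIANTS, P_CARRIER, h2)
    print(f"heritability {h2}: top genotype IQ of about {iq:.0f}")
```

The top genotype sits a little over three genetic standard deviations above the mean, which matches the "something like IQ 145" ballpark. Anything further out in the tail would have to come from non-hereditary variance, and that is the deviation from the bell curve the comment expects to show up in very large samples.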
It almost seems like academia as a whole would have to change to match this alien culture. Currently, we have independent papers thrown over the fence that measure one or a few causes. This alien society would somehow have an interface between all the academics, so that all their research would be synthesized and shared in real-time. Even though we as humans are quick to adopt signals, we are slow to adopt standards, thus making large-scale integration of research impossible. We might have to wait for an early-AGI that can understand science papers and integrate that knowledge for us.
I get that there’s been a failure of models of the form, genes -> observable traits, but maybe it’s just that we’re looking at this from the wrong abstraction level? Like DNA, we take machine code as the theoretical complete specification of programs (sans bit errors and modulo machine memory models), but that doesn’t mean it’s the best language to work with.
Sure, with more statistics and Big Data™ we’re going to arrive at arbitrarily accurate genetic predictions but this is still essentially brute force, and I’m not sure I’m ready to accept having scientific models of larger and larger computational complexity.
PS Is it just me, or does this read more like an essay I’d read in a magazine and less like usual Scott?
Thing is: a lot of the time
genes -> observable traits
Works really really well. A lot of times, when people have dug into various disorders they’ve been able to narrow the cause down to specific mutations and can then figure out how the mutation is affecting various pathways.
On the other hand polygenic risk scores have a tendency to replicate poorly across different populations and can be prone to P-hacking.
One reason science is resistant to the idea of polycausality is: in highly polycausal domains, science isn’t very useful.
Science is great at figuring out how simple things operate — complex things not so much. Saying “phenomenon X is highly polycausal” also implies “studying phenomenon X with scientific rigor is almost certainly not going to lead to breakthroughs; likely not very useful at all; very possibly harmful due to ensuing misapplication of knowledge gained.” It’s a tough pill for scientists to swallow.
By the way, the inaccurate belief that science is highly effective in polycausal domains, is one of the great sins of the New York Times-Harvard-Democratic Party triumvirate. Hopefully this belief will soften as polycausality becomes an explicit concept among the elite.
(Having written all this, not sure if I really endorse my point 100%, but publishing anyway because… it’s provocative. And it lets me mention TALEB.)
This is what I would have said aside from the example of genetics, but the recent success of genetics makes me hopeful that occasionally it’s possible to just steamroll through this by finding all of the complexity and dealing with it.
Obviously there are cases where it’s much harder, but I still wonder if genetics can provide a blueprint that can be approximated in other areas.
Tractable is on the word calendar today, I guess.
I think the biggest difference between polycausal models and single-cause models is that polycausal models are only polycausal because we don’t understand the precise underlying mechanisms. A polycausal model helps us predict things better, but it still hasn’t reduced what we’re studying to its components the way a sufficiently detailed normal model would. In the case of depression and the genetic components of intelligence, the reduction might not happen for a long while, but if it ever will, I don’t think it will be because we’ve mapped the inputs and outputs of the black box – it will be because we made it transparent and tracked the path of each of these inputs and outputs, which is orders of magnitude more complicated. This probably means that experts in the relevant fields will need to be either hyper-specialized or half-computer, as well.
I’ll throw out a claim, and if anyone wants to operationalize it and offer a bet, let me know.
Intelligence isn’t massively polycausal. Neither is height. Massive polycausality basically never happens except when there’s symmetry involved (e.g. air pressure resulting from a huge number of identical particles). 80/20 rule is a thing, and if a phenomenon looks massively polycausal, it’s because we haven’t figured out the 80 yet.
In the case of both intelligence and height, I see two likely ways for this to play out (though there may of course be others). First, the hereditary component of intelligence/height isn’t primarily genetic – e.g. maybe it turns out to be all about microbiota or something like that. Second, it could be that we’re missing a level of abstraction – e.g. it turns out there’s a bunch of proteins which implement some simple predictive model of the environment and then make height/intelligence investment decisions, but we haven’t managed to decode the model yet.
I expect, fifty years from now, we’ll look back and say “obviously intelligence/height isn’t massively multicausal, we just hadn’t figured out the actual causal process yet.”
For background on why I expect this, see The Epsilon Fallacy.
At a large scale, the effect of genes on traits often is a lot like the effect of the molecules of a gas on pressure and temperature. Independent additive variance type models work pretty well for a lot of cases.
You’re made of trillions of not identical but very similar complex units. Yes, those units can be specialized. But they still have a really deep underlying similarity that the wheels on your car don’t have with its electronics for example.
I’m trying to figure out how to operationalize a bet with you or if it’s possible. I think the number of genes contributing to height and intelligence is large. We already have enough measurements. But I think polycausal is the wrong way to think about it, so I agree with you there. I wouldn’t say the temperature of my room is polycausal because there are a lot of molecules in the room. Strictly speaking it’s kind of true, but…
“But I think polycausal is the wrong way to think about it, so I agree with you there. I wouldn’t say the temperature of my room is polycausal because there are a lot of molecules in the room. Strictly speaking it’s kind of true, but…”
The causal SNPs are much more independent than the molecules in the room. If you speed up a single molecule it’ll collide and dissipate the additional energy among the other molecules. It doesn’t matter which one you chose. If you change a single SNP, it’ll have an effect on the phenotype that is specific to only that SNP. Looking at SNPs as causes makes sense, because that’s where you can intervene.
Sure, but independence of contributions normally makes things simpler to handle not less simple. My statement would be true for a noninteracting ideal gas too.
If I put a pile of sand grains on a scale and each grain has a slightly different mass, the weight of the pile is polycausal from the point of view that each grain of sand makes an independent contribution to weight (note to others: obviously human traits are mechanistically more complicated than this). And each grain of sand has a unique weight. But the weight of the pile isn’t polycausal in an interesting way.
Genetics is obviously much more interesting than that, but I don’t think the idea of polycausality is really gaining us anything here. At a really low level of description, biology is a very large physical system with a particularly interesting set of initial conditions. At a high level of description, we find that genes often add to traits as if they were unique grains of sand adding weight to a scale. At some middle levels, it’s more polycausal in an interesting way. But those middle levels are deeply structured by the physical level below and the emergent rules of natural selection above. What is the idea of polycausality really gaining us here?
A causal model allows us to predict and to intervene. That is exactly what finding causal SNPs is about. That is what we gain. Saying that a causal relationship that can be used to predict and intervene isn’t interesting enough to be thought of as causal is just absurd.
We can predict height with an accuracy of something like 2 inches directly from genetic data. That’s basically all the additive heritability encoded in a polygenic score. Where does that leave your claim?
‘If a problem seems hard the problem formulation is probably wrong’ -Chapman
‘one must understand information processing systems at three distinct, complementary levels of analysis. This idea is known in cognitive science as Marr’s Tri-Level Hypothesis:
computational level: what does the system do (e.g.: what problems does it solve or overcome) and similarly, why does it do these things
algorithmic/representational level: how does the system do what it does, specifically, what representations does it use and what processes does it employ to build and manipulate the representations
implementational/physical level: how is the system physically realised (in the case of biological vision, what neural structures and neuronal activities implement the visual system)’ -Wiki summary of David Marr’s work.
So, we encounter a problem ontologized at the algorithmic level and our solution is to go down to the implementation level. What about going up to the computational level? My inside view on depression is that when I was depressed it was because I didn’t understand what it was I was trying to do. Lacking such knowledge I did things of the same sort as trying to train a dog to be vegan. That these all went horribly wrong constantly in many varied ways (why is the dog acting so crazy?) eventually paralyzed me into inaction. Playing whack-a-mole with all the specific ways the dog is acting crazy is something I don’t think would have ever worked.
I’m using genetics because it’s the best example we have right now, and so far it’s a counterexample to the Chapman quote. It turned out genetics was just hard. Once you brute-forced the hard thing, you could do genetics just fine.
Biological determinism, also known as genetic determinism[1] or genetic reductionism,[2] is the belief that human behaviour is controlled by an individual’s genes or some component of their physiology, generally at the expense of the role of the environment, whether in embryonic development or in learning.[3] It has been associated with movements in science and society including eugenics, scientific racism, the supposed heritability of IQ, the supposed biological basis for gender roles, and the sociobiology debate.
https://en.wikipedia.org/wiki/Biological_determinism
Yes, and research has shown it to be largely accurate. Obviously genes and the environment both play a role, but the recent trend in research has been to find genes are much more important than people thought possible even a few decades ago, with traditional “nurture” assumptions much less relevant than people would have considered plausible.
You can find a very high-level overview of the IQ case here, but also look into twin studies, GCTAs, adoption studies, etc. In terms of books, The Nurture Assumption will be especially helpful, but any other book on genetics or intelligence written in the past 10-20 years should at least give you the basics. See also this post here.
I find this phrasing odd. If you go farther back in time, the trend is to believe that genes are more important, not less important. The period a few decades ago is the wacky outlier. How did people feel about breeding vs. raising 45 decades ago?
Conjecture: Looking through the overview of studies on the Wikipedia page, it seems to me that the heritable component of intelligence could be viewed as “intelligence potential”, while environmental factors as “intelligence filters”.
What do I mean by that? Presumably, the capacity for “intelligent operations” (however defined; say: solving puzzles on IQ tests) is dependent on the actual structure of the person’s nervous system – a biological feature and hence subject to genetic heritability. I believe this isn’t a controversial assumption.
In practice, this would imply that “maximum achievable intelligence” is in some part determined by actual inherited genes. In ideal conditions (that is: with no adverse environmental factors affecting gene expression) it might be wholly dependent on genetics. I say “might”, because we don’t really know whether “maximum achievable intelligence” is a coherent concept – that is: whether the capacity for intelligence has an intrinsic limit. Nevertheless, I believe there’s no reason to believe it doesn’t.
Having the potential to reach some level of intelligence (a particular IQ score, for example) does not, however, imply that any particular individual will reach that level. It is my understanding that development of intelligence is driven, to an extent, by environmental needs. An individual that has the capacity for high intelligence may never achieve the fullness of their potential, unless they are challenged with appropriately difficult problems. If that seems dubious, it should at least be obvious that a person with exceptional aptitude for the kind of problems one finds in computer science will never realize their potential in a society that hasn’t gotten around to inventing computers, yet.
The ability of an individual’s environment to stimulate the development of their intelligence to its maximum, biologically-dependent level – or rather, lack thereof – is what I call an “intelligence filter”.
I find this hypothesis attractive, because it explains a number of features observed in the data:
1. Environmental effects tend to be more pronounced in lower-income families (Turkheimer 2003; Harden 2007), which is consistent with the environment restricting the development of intelligence below its biological potential. The opposite is true of higher-income families which is to be expected if there are few environmental constraints towards developing intelligence to its biological maximum.
2. Environmental effects tend to be more pronounced in childhood than in adulthood (Tucker-Drob 2011), which is consistent with the idea that stimulation coming from the individual’s environment drives development of intelligence. For bonus points, environmental effects are much more prominent in lower SES families than higher SES ones, which again is consistent with the idea of a “filter”.
3. The hypothesis is also consistent with the findings of Capron and Duyme (1999), with difficult early circumstances having a markedly adverse effect on orphans’ intelligence development and post-adoption gains being proportional to the wealth of the adopting family, which is what we’d expect if we assume that:
a. intelligence has a positive correlation to wealth (which I believe to statistically be the case),
b. more affluent foster parents will have more resources to devote to improving the lot of the adopted child (this is kinda obvious).
I realize that the idea is likely not in any way novel, but I’m surprised nobody else has brought it up.
What would be a competing hypothesis? This just seems to be the default way of looking at it.
One competing hypothesis, which started this whole conversation, is that there is little or no biological determinism to intelligence and hence no “maximum achievable intelligence” on an individual level.
There’s two ways I can read this: either it is the case that there is some theoretical maximum that is essentially the same for all humans and differences can be explained by what environmental filters are in place, or it is the case that there exist environmental “intelligence amplifiers” available to some people, but not others, that explain observed correlations between SES and IQ.
I don’t think I need to elaborate on why this is an attractive position, politically.
That’s the “nurture” perspective. A vulgar reading of the “nature” perspective is that most or all eventually observed variance (in adults) is purely down to genetics and therefore environmental considerations aren’t going to affect outcomes in the long run.
This is one way to read Rushton and Jensen (2010) and is also attractive politically – to a different political mindset.
I wouldn’t accuse Scott of holding such a position, but it’s an impression that an inattentive reader might get from reading his writings on the subject.
Um… so what? Lots of people say lots of stupid things. When did Piaget get any more credible than the Tongue Map?
I was about to say something here, and then I noticed this:
This essay seems to be agonizing over the fact that people have used the same word, “gene”, to mean different things. The original meaning is the abstract unit of inheritance — genetics was discovered long before DNA was. And in that sense, it’s perfectly reasonable to talk about “the gene for X” to whatever degree X is narrow-sense heritable.
DNA was discovered as the culmination of the effort to find the physical basis for inheritance. It is, in fact, the physical basis for inheritance, and so to the extent that “genes” had physical form they were necessarily embodied in DNA. Investigation revealed that the more immediate function of DNA is to produce proteins, so theorists working in an entirely different paradigm adopted the term “gene” to refer to the stretch of DNA which codes any particular protein. But that’s mostly unrelated to the inheritance sense. There’s plenty of information contained in stretches of DNA which code for no proteins at all.
I don’t see how using the term “gene” to refer to the raw code for a protein means that using the older term “gene” to refer to the abstract concept of inheritance is a “diseased paradigm”.
Any paradigm that allowed people to say “the gene for intelligence is on chromosome six” or “there could be as many as ten genes affecting autism” is just object-level bad. I appreciate your attempt at charity here, but I lived through this period and I think people meant what it sounded like they meant.
There never was a paradigm that allowed people to say “the gene for intelligence is on chromosome six”. Nothing stopped them from saying it anyway, but why do you believe there was a theoretic paradigm backing them up?
People say “race does not exist” and “there hasn’t been enough time for Indians (in India) and Eskimos to evolve separate adaptations to their local climates” all the time too. They describe things as “light years ahead of their time”. They hallucinate “scientific” beliefs out of nothing:
(from here)
It makes perfect sense to me that there could be three or four different genes that affect intelligence, each contributing 25% of the hereditary portion of the variance.
This is how most of the traits we understood in 1990 (mostly simple genetic diseases) worked. Given that this was plausible, matched our existing evidence, and lots of people said they believed it, why do you think it wasn’t a real paradigm?
Or on a more philosophical level – yes, false paradigms will eventually be found to have errors in them, and some of those errors will even be found to be contradictions and nonsense. But saying they didn’t really exist as paradigms seems to be dismissing that “false paradigm” is a useful concept.
That’s a fair criticism.
I tried to choose some examples that I felt were related to what was going on with “genes”. Someone describing something as “light years ahead of its time” has confused “light years”, which measure distance, with “years”, which measure time, because of the unfair similarity of the words. I think this type of confusion over “gene” was the norm in science journalism and lay understanding and explains most of what you talk about in the post. I don’t think this sort of situation involves a false paradigm, just people confusing two good ones.
Someone saying “race does not exist” definitely is working from a false paradigm. If there were no paradigm, the least we could expect to see is an occasional rephrase like “there’s no such thing as race”. However, I don’t see that paradigm as being part of the field of genetics, and as genetics progresses I wouldn’t describe it as having overcome its own diseased roots. The “race does not exist” paradigm exists outside of genetics and is simply irrelevant to it except to the extent that believers attempt to run political interference.
People independently concluding that machines cannot think because, in their opinion, that would be bad, aren’t working from a paradigm either. But where a desire exists, people will try to deliver on it, and many of those people will be totally incoherent, and I think that also explains a lot of “we will find the gene for X” publicity.
I say there was no paradigm that allowed saying “the gene for intelligence is on chromosome 6” because regardless of how you want to use the word “gene”, it has always been clear that this couldn’t possibly be true. For example, Down’s Syndrome is unrelated to chromosome 6, but has a dramatic effect on intelligence. Many, many abnormalities have a dramatic effect on intelligence.
Food for thought: Turner’s Syndrome is genetically straightforward. According to the paradigm of your choice, where is the gene, or genes, that code for it?
“The original meaning is the abstract unit of inheritance — genetics was discovered long before DNA was.”
But it wasn’t just the abstract unit, it was known to be a discrete unit. At least that drops out of Mendel’s laws. And that rules out just retroactively defining a polygenic cause as “what we meant when we said gene for X in 1990”.
This is what happened to computer vision in the last 20 years, going from a few explainable causes to a model with many tiny undescribable ones. A neural network is what you end up with when you say “everything in the image needs to be able to affect everything else, and those effects need to be able to affect each other, many layers deep.” A convolutional neural network is what happens when you simplify the complexity by only allowing local interactions to happen at each level.
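A rough one-dimensional sketch of that contrast, in plain Python. The layer functions, the six-element signal and the kernel values are illustrative assumptions, not taken from any particular vision model; real convolutional layers also stack many kernels and add nonlinearities.

```python
def dense_layer(inputs, weights):
    """Fully connected: every output is a weighted sum of every input."""
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]


def conv1d_layer(inputs, kernel):
    """Convolutional: each output sees only len(kernel) neighbouring inputs,
    and the same small kernel is reused at every position."""
    k = len(kernel)
    return [
        sum(kernel[j] * inputs[i + j] for j in range(k))
        for i in range(len(inputs) - k + 1)
    ]


signal = [0, 0, 1, 0, 0, 0]          # a single "pixel" switched on
kernel = [1, 2, 1]                   # 3 shared weights
dense_weights = [[1] * 6 for _ in range(4)]  # 4 outputs x 6 inputs = 24 weights

local = conv1d_layer(signal, kernel)
everywhere = dense_layer(signal, dense_weights)
print(local)       # only outputs near the lit pixel respond
print(everywhere)  # every output responds to the lit pixel
```

The dense layer needs a separate weight for every input-output pair, while the convolutional one gets by with three weights because interactions are restricted to neighbours.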
@Scott Alexander : Out of curiosity, are you saying that lead isn’t a big component of crime? The models and stuff I read around it (arguably, scientific vulgarisation articles rather than actual scientific papers) make it sound pretty solid.
It’s not like poverty, gun laws, behavioural expectations etc. don’t play a role. It’s that lead is the element responsible for the 80s and early 90s crime waves we saw around the world. But Japan crime stats are still better than the US’s or Mexico’s.
In order for a person to have normal or above-normal intelligence, a lot of things have to be working correctly. Thus, there are a lot of potential ways for a genetic mutation to reduce someone’s intelligence.
I expect that there are also a lot of ways that something could go wrong and a person could be depressed.
This is more or less what Jessica Wolpaw Reyes did in her original lead-crime paper: http://www.nber.org/papers/w13097.pdf
See especially the introduction, and table 6 on p. 59.
In general, genetics is a bad metaphor for empirical microeconomics, because in genetics there’s a fixed universe of possible causes given by the genome. Finding a gene that explains 1% of the variation in some phenotype isn’t that exciting in part because you expected some gene or other to be important. But finding out that e.g. HBCUs have to pay an extra 25 basis points to sell their bonds is a startling and compelling result, even though adding racial bias to an asset pricing model doesn’t help it fit the data much better.
Can we say information in genes is zipped?
Suppose I have 30 Christmas lights in a long chain, and each light bulb can be switched between 3 colors. A computer is controlling it and to set up a certain combination of lights we need to enter data into a .txt file (Notepad). The first position represents the first light bulb, the second the second one etc. Into each position I can write a number 1 or 2 or 3, all other inputs are illegal and the bulb will be dark. 1 is red, 2 is white and 3 is blue. So if I want the Christmas lights to look like a French flag, I will enter 111111111122222222223333333333 into the file. Each digit, each position directly corresponds to one light and directly controls it.
Then I zip the file. I am not exactly sure how zipping works but the logical thing is to write something like 10×1,10×2,10×3 which is indeed shorter than the input file. I modify the light bulb controller program to be able to unzip the file and work with that.
In my zipped file, positions do not correspond directly to light bulbs anymore. It has to be unzipped to do that. I cannot tell which byte controls light 4 anymore and modifying the zip file directly to make light 4 white is non-obvious. The zipped file would have to become 3×1,1×2,6×1,10×2,10×3.
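The scheme described here is run-length encoding rather than what real zip tools do (they use DEFLATE, which is more involved), but it is enough to show the point. A minimal sketch, with function names made up for this example:

```python
from itertools import groupby


def rle_encode(lights):
    """Collapse each run of identical digits into a (run_length, digit) pair."""
    return [(len(list(run)), digit) for digit, run in groupby(lights)]


def rle_decode(pairs):
    """Expand (run_length, digit) pairs back into one digit per light."""
    return "".join(digit * count for count, digit in pairs)


def format_pairs(pairs):
    """Render pairs in the 10×1,10×2,10×3 notation used above."""
    return ",".join(f"{count}×{digit}" for count, digit in pairs)


flag = "1" * 10 + "2" * 10 + "3" * 10      # French flag: red, white, blue
encoded = rle_encode(flag)
print(format_pairs(encoded))

# To make light 4 white we cannot point at a position in the encoded form.
# The simplest route is to decode, edit the plain string, and encode again.
lights = list(rle_decode(encoded))
lights[3] = "2"                            # light 4 is index 3
edited = rle_encode("".join(lights))
print(format_pairs(edited))
```

Changing one light splits one run into three, so a one-character edit to the plain file changes the structure of the encoded file, while the run lengths still add up to 30.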
Is this how genes work?
Compression is probably a bad analogy. Genes actually encode (in an “uncompressed” format) the proteins they give rise to, together with signals that control how much and under what conditions the proteins are to be produced. These proteins then go on to lead to consequences (cell division, cell differentiation, tissue formation, up to the actual behaviour of specific cell types) that determine the ultimate shape and behaviour (“phenotype”) of the living organism.
Organisms of the same species agree on almost all DNA; two humans have >99% of DNA in common. However, small variability between individuals (often just single-point differences in DNA) leads to e.g. proteins that have small differences in their chemical characteristics, or small differences in how much they are expressed. This small variability is what we actually look at when we look for “genes” for something.
It’s actually really remarkable how simple the genetic code is. In situations where the length of the total genetic code is constrained, such as viruses, more information is sometimes packed in through frameshifting. Essentially, if you have a sequence of ABCABCABC you can read it as a repeating sequence of either ABC’s, BCA’s, CAB’s, CBA’s, BAC’s, or ACB’s by changing the starting point and direction that you read. I seem to remember some virus used 5 of the 6 possible senses to encode different proteins.
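A toy sketch of the six readings of ABCABCABC. The function name is invented, and it follows the simplification above by just reversing the string; with real DNA the opposite-direction frames are read off the complementary strand, so each base would also be swapped for its complement.

```python
def reading_frames(seq, codon_len=3):
    """Split seq into codons for every start offset, in both directions.

    Returns a dict keyed by (direction, offset). Trailing letters that do
    not fill a whole codon are dropped.
    """
    frames = {}
    for direction, strand in (("forward", seq), ("reverse", seq[::-1])):
        for offset in range(codon_len):
            usable = strand[offset:]
            whole = len(usable) - len(usable) % codon_len
            frames[(direction, offset)] = [
                usable[i:i + codon_len] for i in range(0, whole, codon_len)
            ]
    return frames


frames = reading_frames("ABCABCABC")
for (direction, offset), codons in frames.items():
    print(direction, offset, codons)
```

The same nine letters yield six different codon sequences, which is how a short genome can carry overlapping messages.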
What if we apply that intuition backwards? In the pre-DNA era there were a number of candidates for what controlled heritability. It turns out that it was (more or less) just one molecule. Even in the early DNA period it wasn’t known how DNA controlled protein expression. It would have been plausible that proteins could be the result of a complex combination of properties of DNA, including the global sequence, interaction with DNA proteins, and topology. Should the fact that there exist things called genes which are spatially localized on DNA and have a universal three base pair alphabet be surprising?
Even if the mechanism was not known a priori, the payoff of discovering the single intelligence gene/hereditary molecule/genetic code is so much higher than discovering even a pretty good polycausal function that it makes sense to look for the easy answer first.