Things I Don’t Understand About Genetics (A Non-Exhaustive List)

Posted on August 23, 2013 by Scott Alexander

A couple months ago the Genetic Association Consortium’s study on SNPs for intelligence raised an important question: should all of our genetics studies be performed by organizations whose acronyms are also amino acid codons? And aspartic acid? Really? Kind of a boring choice.

But while we’re figuring that out, can someone explain to me how polygenic inheritance works?

We have really strong evidence that intelligence is highly heritable – maybe 50% to 80%. But genome-wide association studies show very low contributions from any particular SNP:

The study found that the genetic markers with the strongest effects on educational attainment could each only explain two one-hundredths of a percentage point (0.02 percent). To put that figure into perspective, it is known from earlier research that the SNP with the largest effect on human height accounts for about 0.40 percent of the variation.

Combining the two million examined SNPs, the SSGAC researchers were able to explain about 2 percent of the variation in educational attainment across individuals, and anticipate that this figure will rise as larger samples become available.

Here’s my question. Things determined by a larger number of independent randomly varying processes should tend to vary less. Suppose at Casino A, you flip a coin and if it comes up heads they give you $5000 and if it comes up tails they give you nothing. But at Casino B, you flip 5000 coins, and get $1 for each coin that comes up heads and nothing for each coin that comes up tails.

Both games have the same minimum and maximum winnings ($0 and $5000), and both have the same average ($2500). But the variance will be very different. The winnings at Casino A will vary a lot; half the people will walk out with $5000 and the other half will walk out broke. The winnings at Casino B will vary surprisingly little: if I understand binomial distributions correctly, well under 1% of gamblers will walk out with less than $2400 or greater than $2600. Pretty much everyone goes home with something like $2500.
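
For anyone who wants to check the binomial claim rather than trust it, here is a minimal sketch. It assumes fair coins and independent flips, and the function names are made up for this example.

```python
import math

def sd_of_winnings(n_coins, payout_per_head):
    """Standard deviation of total winnings from n_coins fair, independent flips."""
    return payout_per_head * math.sqrt(n_coins * 0.25)

def prob_outside(n_coins, low, high):
    """Exact probability that the number of heads is below low or above high."""
    outside = sum(math.comb(n_coins, k)
                  for k in range(n_coins + 1) if k < low or k > high)
    return outside / 2 ** n_coins

casino_a_sd = sd_of_winnings(1, 5000)    # one coin worth $5000
casino_b_sd = sd_of_winnings(5000, 1)    # 5000 coins worth $1 each
tail = prob_outside(5000, 2400, 2600)    # leaving Casino B outside $2400-$2600

print(f"Casino A SD: ${casino_a_sd:.0f}")
print(f"Casino B SD: ${casino_b_sd:.2f}")
print(f"P(outside $2400-$2600 at Casino B): {tail:.4f}")
```

Casino A has a standard deviation of $2500 and Casino B about $35, so $100 either way is nearly three standard deviations. The tail probability should land somewhere around half a percent, consistent with "well under 1%".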

The more independent loci determine human intelligence, the less variation in human intelligence we expect to see relative to the total amount it is possible to vary. If the most important SNP explains 0.02%, then there are at least 5000 genes involved, which means, as in the example above, that less than 1% of people should differ more than 2% of total possible variability from the average.

But actually, people differ in intelligence a lot, and a lot of that difference seems to have a genetic component. This implies either that the total possible genetic variability in intelligence is huge – that the right genes could give you an IQ of 2500 or so, or that variability in intelligence isn’t simple and additive and random the way I’m modeling it here – or that I’m doing the math wrong, always a distinct possibility.

The easiest way to get out of this, other than accepting I am terrible at math and should be kept away from it, is to assume that lots of different genes for good intelligence are correlated. Maybe one population had reason to evolve high intelligence with lots of smart SNPs, and a second population didn’t. Then it would make complete sense that all the genes involved would co-vary. Unfortunately, there seems to be significant IQ variation within the same family, let alone within the same population group, so that fails pretty hard as an explanation.

Another possibility is to accept the whole mutational load idea – which allows for high correlations in goodness or badness of the entire genome. Unfortunately, this is the other idea in genetics which has been confusing me terribly over the past few months.

This has no trouble explaining correlations, but it does have trouble explaining why things aren’t more correlated. We should find the same people being very smart and very tall and very athletic and very healthy. I don’t doubt there are some correlations between these traits, but they don’t seem nearly as high as one might expect.

And the paternal age effect keeps being brought in to explain this, but I don’t get that one either. Suppose I have a kid at 60. My sperm and my DNA have had 60 years to accumulate deleterious mutations, so there’s more chance my kid will have low IQ or psychiatric disease or whatever. Fine. But suppose while I’m having a kid at 60, my twin brother has had a kid at age 20, and his kid had a kid at age 20, and his kid had a kid at age 20, so that his first great-grandchild is being born exactly the same time my first child is. Both my kid and his great-grandkid have had 60 years worth of cell dividings to accumulate mutations. Why should their risks of autism be any different just because his kid had those 60 years divided among three different people?

(does less frequent division of spermatogonia before puberty cause them to accumulate fewer mutations during that time? If so, shouldn’t three generations of people who have kids at 33 still accumulate 80% as many mutations as two generations of people who have kids at 50?)

I’ve heard evolution’s ability to eliminate people with bad genes used as an explanation here, but I don’t quite get it. For one thing, it seems unlikely that evolution can produce beneficial mutations at the same rate people accumulate deleterious mutations (10 per generation or so). And if we imagine two lineages of gradually deteriorating intelligence going on for 1000 years, the Lions That Selectively Eat Low-IQ People will have just as much opportunity to cull the members of the one that reproduces quickly as the one that reproduces slowly. And, if that were the explanation we should fail to see a paternal age effect in the absence of such lions (or be dysgenic as hell), but the effect has been demonstrated in our own society, which is relatively free of lions and of almost everything else that kills people before they can reproduce.

I know other people have blogs where they explain things and readers bask in their wisdom, but having a blog where I say how confused I am about stuff and readers explain it to me has always worked for me before and I have faith it will continue to do so.

This entry was posted in Uncategorized and tagged biology, genetics, iq.

38 Responses to Things I Don’t Understand About Genetics (A Non-Exhaustive List)

David R. MacIver says:

August 23, 2013 at 2:51 am

Here’s a hypothesis that might explain some of it (and it’s based on navel-gazing introspection on the subject of intelligence rather than any real knowledge of genetics, so salt it thoroughly): The genes that matter most don’t control intelligence, they instead control ability to learn and attitude to learning. You get various positive and negative feedback loops – some people learn a certain amount and then start to feel the rest is a struggle, some people learn a certain amount and then go “Oh. This is cool. And this fits with that and…” and find future learning if anything easier.

The result is that rather than intelligence being a linear quantity determined by genes it’s instead a highly non-linear function of them because the genes basically determine which feedback loops you get and depending on the balance of these may result in your own little runaway intelligence explosion or your finding that your knowledge is essentially capped by the amount of effort you’re willing to expend on it.

There may then be genes which control things like raw capabilities – working memory, etc. which also contribute to intelligence, but these essentially just dictate the constant factors: Someone with high “intelligence” and poor learning feedback may do very well up until a point but then still cap. Someone with low “intelligence” may nevertheless do very well because they keep on learning indefinitely even if it takes them slightly longer.

This also helps explain why so much of the difference is social: Social pressures can either very effectively motivate you or demotivate you to learn, exposure to good mental habits can improve your learning skills and exposure to bad ones can hinder them.
- David R. MacIver says:
  
  August 23, 2013 at 3:00 am
  
  Sorry, that was a little incoherent. My intelligence is definitely capped before the coffee kicks in and my brain has properly woken up. Hope it made some sort of sense.
Anonymous says:

August 23, 2013 at 5:02 am

“This implies either that the total possible genetic variability in intelligence is huge – that the right genes could give you an IQ of 2500 or so, or that variability in intelligence isn’t simple and additive and random the way I’m modeling it here”

That’s it yes.

Additivity is a gross simplification but it’s the only way we have to get things done for the most part. Genes do not act on intelligence by contributing to a giant bucket of coins. Instead the effect of each gene will vary dramatically (sometimes in opposite directions) depending on the environment and the other genes. A more realistic (but still grossly simplified) model would look like:

Genetic Loci–>Protein–>Biochemical Pathways –> Cellular Phenotypes –> Gross Phenotype (e.g. IQ).

For a given cellular phenotype, there might be, say, 3 pathways, each one with 10 different proteins affected by 3 different alleles in the population.

Say you have to break two of the biochemical pathways to affect your cellular phenotype, and each of the pathways needs to have two proteins broken before it shuts down. Your first mutation now does nothing. Your second one in the same pathway breaks a biochemical pathway, so it now has a large effect. But any further mutations within this pathway now have no effect – it can’t get any more broken. So if you add up the effect of all 90 possible loci that could be mutated affecting this pathway, you vastly overestimate the additive variability in the genome.
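
The threshold behaviour described above can be sketched as a toy model. The function names and the sequence of mutation counts are illustrative assumptions. Only the thresholds (two broken proteins per pathway, two broken pathways per phenotype) come from the example.

```python
def pathways_down(broken_per_pathway, proteins_to_break=2):
    """Count pathways with enough broken proteins to shut down."""
    return sum(1 for broken in broken_per_pathway if broken >= proteins_to_break)

def phenotype_affected(broken_per_pathway, pathways_to_break=2, proteins_to_break=2):
    """The cellular phenotype changes only once enough pathways are down."""
    return pathways_down(broken_per_pathway, proteins_to_break) >= pathways_to_break

# Each tuple gives the number of broken proteins in pathways 1, 2 and 3.
steps = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (5, 0, 0), (5, 1, 0), (5, 2, 0), (5, 9, 0)]
for state in steps:
    print(state, "pathways down:", pathways_down(state),
          "phenotype affected:", phenotype_affected(state))
```

Going from one to two broken proteins takes a pathway down, but going from two to five changes nothing. The same mutation therefore has a large effect, or none, depending on what else is already broken.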

This is the problem with genetics. We know that particular combinations of mutations are what matter, and we know effects are not additive, but without a way to put structure on it all we’re left with a combinatorial explosion of possible mutation combinations that cannot be unravelled, even if we sequence everyone on the planet.
Zslastman says:

August 23, 2013 at 5:06 am

Oh, and re: paternal effect – you are correct in that there are fewer actual cell divisions if you go through 3 thirty year olds as opposed to a single 60 year old, but the major difference is in sperm selection. Sperm, eggs and embryos all undergo a very efficient filtering process that roots out a lot of the worst mutations between generations. This process exerts selective pressure many times stronger than selection on individuals.
- Douglas Knight says:
  
  August 24, 2013 at 9:19 pm
  
  It’s true that sperm selection is tremendous, but the measurement that the typical Icelander has 15 new mutations from the mother and 2A-15 mutations from the father of age A is based on people not sperm. This is how many mutations remain in living children, not sperm or fetuses. So this mechanism does not answer the question.
- Scott Alexander says:
  
  August 25, 2013 at 8:27 pm
  
  Can you explain sperm selection? If all mutations are uncorrelated, sperm that have mutations in their essential reaching-the-egg genes should have exactly the same number of mutations in their person-they-will-be-when-they-grow-up genes as baseline.
steve hsu says:

August 23, 2013 at 5:47 am

Apologies for not contributing a longer comment, but I suggest this video lecture: http://www.youtube.com/watch?v=FgCSkGeBUNg (audio is a bit weak so you may want to use headphones).

Your intuition about polygenic variance is sort of correct: if you flip N coins the standard deviation (SD), measured as a fraction of the total possible range, scales like 1 / sqrt(N). We observe this SD in measurements of g. But if the additive polygenic model holds (and we have every reason to think it does), then there are genotypes with many more (+) variants than are typical in the population, and which yield truly exponentially rare phenotypes (i.e., super smart people like von Neumann or better). In experiments on corn, cows, chickens, drosophila, etc., breeders have been able to shift phenotypes by many SDs (e.g., > 30)!

http://infoproc.blogspot.com/2010/10/maxwells-demon-and-genetic-engineering.html
http://infoproc.blogspot.com/2011/08/epistasis-vs-additivity.html
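
The scaling argument can be made concrete with a rough sketch. It assumes N independent loci with equal effect sizes and a (+) variant frequency of 0.5. Those are simplifications chosen for the example, not numbers from the lecture or the linked posts.

```python
import math

def additive_summary(n_loci, freq=0.5):
    """Summary of the count of (+) variants under a simple additive model."""
    mean = n_loci * freq
    sd = math.sqrt(n_loci * freq * (1 - freq))
    return {
        "mean": mean,
        "sd": sd,
        "relative_sd": sd / n_loci,           # SD as a fraction of the full range
        "max_in_sds": (n_loci - mean) / sd,   # all-(+) genotype, in SDs above the mean
    }

for n in (900, 10_000):
    s = additive_summary(n)
    print(n, "loci: relative SD", s["relative_sd"], "max genotype at", s["max_in_sds"], "SDs")
```

Under these assumptions the all-(+) genotype sits sqrt(N) standard deviations above the mean: 30 SDs for 900 loci and 100 SDs for 10,000. So a narrow observed spread and an enormous theoretical ceiling are two sides of the same model.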
Madeleine Ball says:

August 23, 2013 at 7:26 am

Maybe your intuition on how things will combine is off: you might want these to be simple additive things. Tempting as it is to imagine ourselves as bags filled with certain numbers of blue or red balls, we’re complicated machines. One gear might touch another gear which connects to an axle. Changing the first gear by 2% and the other by 3% could just cancel out, or max out at the largest change, or even combine in a non-linear, super-additive effect to make 10%. God knows, right?

So I don’t think it’s inconsistent to say we have lots of factors we think might matter, and no idea how they combine.
Deiseach says:

August 23, 2013 at 8:27 am

I know even less about genetics than yourself, but I am not at all surprised that there isn’t one simple answer to this. I did think that by now we had moved away from the model of “Gene A controls all about your liking for peanut butter so if we mess around with deleting/adding Gene B we can make you like Brussels sprouts instead!”.

My own personal opinion, based on little more than grumpiness and bias, is that there is much less variance in intelligence than we like to think. People Back Then (whether we’re talking about two hundred or five thousand years ago) were so much stupider whereas we’re So Much Smarter because we can play games on our mobile phones. Mmmm – if any of us were dropped in the middle of the forest with the clothes on our backs as we are now, how much would we know about making tools, lighting a fire, finding edible vegetation, constructing shelter and the rest of it?

I also don’t think we have a ‘population A are so much smarter than population B’ scenario as some people like to throw around (if we take it that Asians are smarter than Whites are smarter than Blacks, which seems to be the usual grading used). You may be the next (insert name of pet genius here) but if you’re living on a small island in the middle of the Pacific and spending your days fishing and herding goats, not many people outside your village are going to know that you’re the smartest goat herd on the island.

What we do have, I think, is a mass of population numbers and the kinds of intensive, large-scale testing that previously were not available and were not carried out. So maybe we will eventually find out more about human intelligence and the factors that influence it than “smart, rich, healthy, well-educated people who don’t have to spend fourteen hours a day on the simple act of survival tend to have smart kids who, with the advantage of Daddy and Mummy’s money and connections, go on to be healthy, well-educated and rich themselves”.
Douglas Knight says:

August 23, 2013 at 11:55 am

An IQ of 2500 is meaningless. But you could ask all the same questions about height.

Normal variation in height has a high heritability, which only measures additive contributions, so it really is true that your height is a count of how many tall genes you have (to be more precise, a weighted sum is a pretty good prediction of your height). But if you created unnatural people with all genes set to “tall,” would they be a mile high? No. Would they be too tall to live? Possibly, but probably not. Height is linear for normal numbers of tall genes, but not for abnormal numbers of them. How far do you have to go outside the normal range for linearity to break down? I don’t know, but I expect it breaks down already with actually existing variation.
- Douglas Knight says:
  
  August 23, 2013 at 12:17 pm
  
  By my last sentence, I mean that I imagine that the tails of the distribution of heights measurably deviate from the normal distribution.
- Scott Alexander says:
  
  August 23, 2013 at 9:35 pm
  
  Why do you expect it’s linear until a certain point, rather than uniformly subadditive?
  - Douglas Knight says:
    
    August 23, 2013 at 11:09 pm
    
    Heritability is a direct measure of how close it is to being linear.
    
    “Uniformly subadditive” is not precise. If you propose a concrete model, we can work through the heritability and whether the model allows people a mile high, let alone a mile low.
Peter McCluskey says:

August 23, 2013 at 12:32 pm

> less than 1% of people should differ more than 2% of total possible variability

What evidence do you have that variation in human intelligence doesn’t fit this pattern?

Note that IQ is calculated in a way that results in humans having a pre-defined mean and standard deviation regardless of the magnitude of the actual variation.

Show me a test that meaningfully distinguishes humans, bonobos, and cats, and then I’ll believe we have a measure which tells us something about how much of the possible variation humans differ on.
- Scott Alexander says:
  
  August 23, 2013 at 9:38 pm
  
  Since there should be many more pro-intelligence variants than are utilized in any human being, it implies it should be possible to create humans with an IQ of 2500 (whatever THAT means) just by utilizing already existing genetic variation, without any clever intelligence enhancement stuff going on at all. Not impossible, but if true it would probably be the most exciting thing ever discovered by science, which would make it odd that no one ever mentioned it.
- Kaleberg says:
  
  August 24, 2013 at 9:33 pm
  
  That’s right. There is actually very little variation in human intelligence. Almost everyone can pick up some language or another, figure out how to pick things up, learn to navigate after a fashion, recognize emotions and intentions and so on. This is the big reason Wallace, who, with Darwin, developed the theory of evolution by natural selection, never accepted that humans evolved from non-human animals. Wallace had spent years in the field in Indonesia and New Guinea, and recognized that the people there were not all that different than those he knew at home in England. The gap between the smartest human and the dumbest human as compared with the gap between the dumbest human and the smartest animal was just too small to be considered as bridgeable.
Rolf Andreassen says:

August 23, 2013 at 1:49 pm

So the first question I’d ask is, are you really sure that our observations exclude the prediction of huge variability in intelligence? Consider that IQ is measured with an assumption of a Gaussian distribution (which may miss long tails), and also that the scale tends to break down at the upper range. Maybe there are in fact some IQ-2500 people walking around, but the IQ test can’t really capture the difference between them and the ordinary geniuses with a mere 189 or whatever. (Of course, in that case you have to ask what an IQ of 2500 really means.) Maybe it’s the measurement rather than your estimate of the variability that’s bad. On the low end, of course, it seems likely that an IQ of “minus ten” actually manifests as brain damage sufficiently bad to kill the child in the womb.
Sarah says:

August 23, 2013 at 6:18 pm

So, a lot of things are highly heritable, but in GWAS studies we’re unable to find SNPs that explain more than a small fraction of the variation. Some of these traits are a lot more mundane than intelligence — height, for instance.

Part of this may be an artifact of the GWAS methodology. Most of these studies use logistic regression. That’s basically assuming that genes have additive effects.

Suppose you have two SNPs that lower intelligence, and you have a 50% chance of having each. If effects are additive, then we have something like
No mutation: IQ 150
Mutation A: IQ 100
Mutation B: IQ 100
Both: IQ 50.

If intelligence works like an “OR” gate, like a complicated machine where damage to any part will break it but there’s no such thing as double-broken, then the effects are *sub*additive.
No mutation: IQ 150
Mutation A: IQ 50
Mutation B: IQ 50
Both: IQ 50

If intelligence works like an “AND” gate, like a machine with redundant parts that won’t break unless all the parts are broken, the effects are *super*additive.
No mutation: IQ 150
Mutation A: IQ 150
Mutation B: IQ 150
Both: IQ 50

Obviously, these are corner cases. But in general, when effects are subadditive, the graph of intelligence vs. number of mutations is convex to the origin, while if effects are superadditive, the graph of intelligence vs. number of mutations is concave to the origin.

That means, if things are subadditive but you try to do linear or logistic regression as though they’re additive, you’re going to get the result that each mutation decreases IQ by 50 points. So, imagining that things are additive, you’d expect to see way more people with IQ 100 than IQ 50, because it’s more likely to have only one mutation than two. However, one mutation breaks the whole damn thing! In reality we have lots of people walking around with IQ 50 and nobody with IQ 100. There is more variance in the real population than would be explained by the linear regression model!
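
Here is a minimal sketch of that fit in plain Python. It assumes the four genotypes are equally common (the 50% chance of each mutation, independently) and uses a small hand-rolled least-squares solver. The helper names are made up for this example.

```python
from itertools import product

GENOTYPES = list(product((0, 1), repeat=2))  # (A, B) with 1 = mutation present

def solve(a, b):
    """Solve the small linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit_additive(iq_of):
    """Least-squares fit of IQ ~ intercept + a*A + b*B via the normal equations."""
    rows = [(1, a, b) for a, b in GENOTYPES]
    ys = [iq_of(a, b) for a, b in GENOTYPES]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(3)]
    return solve(xtx, xty)

def r_squared(iq_of):
    """Share of IQ variance that the fitted additive model explains."""
    c = fit_additive(iq_of)
    ys = [iq_of(a, b) for a, b in GENOTYPES]
    preds = [c[0] + c[1] * a + c[2] * b for a, b in GENOTYPES]
    mean = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    return 1 - ss_res / sum((y - mean) ** 2 for y in ys)

def additive(a, b): return 150 - 50 * (a + b)
def or_gate(a, b): return 50 if (a or b) else 150     # any mutation breaks it
def and_gate(a, b): return 50 if (a and b) else 150   # both needed to break it

for model in (additive, or_gate, and_gate):
    print(model.__name__, fit_additive(model), round(r_squared(model), 3))
```

With these frequencies the fitted slope is -50 per mutation in all three worlds. The difference shows up in the fit: the additive world is explained perfectly, while in both gate worlds the linear model explains only two thirds of the variance.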
- Douglas Knight says:
  
  August 23, 2013 at 6:47 pm
  
  No, GWAS does not fail because of an unjustified assumption of additivity. Some basic measures of heritability only measure the additive part. We know exactly how much of variation in height is due to additive genes before we do the GWAS and fail to find those genes.
- Sarah says:
  
  August 23, 2013 at 7:24 pm
  
  Now suppose we have a (slightly) more realistic model with lots of genes. Let’s assume that most mutations are a lot rarer than 50%. And let’s say that the combined damage to IQ of several mutations is usually significantly worse than the sum of the damage to IQ of each mutation alone. (This is a super-additive world. There’s redundancy built in; things go to hell much faster when all the parts are knocked out)
  
  Linear regression is going to say that most of these mutations have small but nonzero coefficients. Why? Because (if we still assume mutations are distributed independently) we have a Gaussian curve of number of mutations. Most of the time you have a few mutations but not very few or very many. So the situations where Mutation A is your only (or almost your only) mutation, and harmless or almost harmless, are rare; as is the situation where Mutation A is combined with lots and lots of other mutations, and their combined effect is extremely terrible. Most of the time A has a small/modest effect. So linear regression will make it appear that most of the mutations do something, but not much, to intelligence.
  
  Now imagine that Mutation A has no effect at all on intelligence, either singly or in combination with other mutations. It really is harmless. Let’s have a 3-gene model.
  No mutation: 150
  A: 150
  B: 100
  C: 100
  AB: 100
  AC: 100
  BC: 0
  ABC: 0
  
  Then applying linear regression makes it appear that A has a coefficient of -5, and B and C have coefficients of 67.5. Rather than being completely meaningless, as it is by construction, A appears to have an intelligence-increasing effect.
  
  What this means is that in a super-additive world where we’re applying a linear model, you’ll get spurious coefficients that you wouldn’t get in a linear world.
  
  What we’d expect to get in a world where effects are super-additive but we use linear models is:
  a.) more genes with “significant effect” than there really are; spurious correlations that can’t be replicated
  b.) no huge correlations, lots of modest ones
  c.) less variance explained by the model than we observe in the real world.
  
  And this is in fact what we see.
  
  You don’t need to posit any biological mechanisms at all. Just simple statistics.
  
  In a sub-additive world, where the damage of multiple combined mutations is *less* than the sum of the damage dealt by each mutation singly (that is, lots of dependencies, knocking out one part is as good as disabling a bunch of parts), you *also* get less apparent variance out of a linear model than you would expect to see in the real world, and in the analogous case where A has no effect and B and C are harmful, we get an apparent coefficient of 3 for A, and 40 for B and C; A appears to be harmful to intelligence when in reality it has no effect.
  
  I’ll have to make this rigorous, but the claim I’m inclined to make is that sub-additive effects make meaningless mutations look harmful, and super-additive effects make meaningless mutations look helpful.
  
  The latter can probably be dismissed as noise if you tend to believe that it’s the rare/mutant variants which are harmful. The former is hard to detect after the fact.
  
  For heritable diseases, like schizophrenia, which are polygenic, GWAS studies find lots and lots of genes correlated with the disease, and the total explains a very small amount of the variance, compared to the high heritability. A lot of these correlations turn out to be spurious and fail to be replicated. (I seem to recall that schizophrenia is like 80% heritable, but only 10% of variance is explained by SNPs in these correlational studies, and only 3 genes are actually consistently replicated across many studies.) This is what you’d expect in a world where most mutations are harmful and have subadditive effects.
  - Douglas Knight says:
    
    August 23, 2013 at 9:23 pm
    
    If interactions between B and C make A appear significant, that sounds replicable.
    - Sarah says:
      
      August 24, 2013 at 8:28 am
      
      Yeah, I’ll have to think about this some more and maybe actually state the lemma. I was just messing around with examples and intuition.
- Scott Alexander says:
  
  August 23, 2013 at 9:42 pm
  
  Thank you. This answer was very helpful, and you have a talent for explaining complicated mathematical concepts.
  
  In the spirit of an Ayn Rand villain, let me now act as if your talent gives me the right to demand you utilize it on my behalf, and ask you to look over the links Steve Hsu posted above on why non-additive genomes should behave in an approximately additive way and explain what they mean and whether you think they are correct.
  - Sarah says:
    
    August 24, 2013 at 9:09 am
    
    So, tentatively (and I do mean that — I have not done a whole lot of reading in computational genetics. And I have several times before shouted “They’re all morons! I’m surrounded by morons!” only to find out later that there do exist more sophisticated scientists not doing the moronic thing.)
    
    But tentatively I don’t agree.
    
    Firstly, Steve quotes a paper that says “Continuously distributed quantitative traits typically depend on a large number of factors, each making a small contribution to the quantitative measurement. In general, the smaller the effects, the more nearly additive they are.” This is true. It’s a Law of Large Numbers argument.
    
    It also seems perhaps circular. After all, we determine that traits depend on a large number of genes with small effects precisely by doing statistical studies that look at correlations. What if things *aren’t* quite as polygenic as we think they are?
    
    As Douglas Knight pointed out, some GWAS studies actually do separate out additive components. What you do here is, instead of just building a linear model, you have linear terms for each of the genes (the additive part), dominance terms, and epistatic terms (for interactions).
    
    If there are two genes, we have something like
    P_{ij} = \mu + \alpha_i + \alpha_j + d_{ij} + e_{ij}
    
    where \mu is the population mean, \alpha_i and \alpha_j are the additive effects of the two genes, d_{ij} is the dominance term, and e_{ij} is the interaction term.
    
    Here’s a simple slideshow explaining this. http://www.ihh.kvl.dk/htm/kc/popgen/genetics/6/6/sld001.htm
    
    “Dominance” just refers to Mendelian dominance — the effects of two alleles of the same gene don’t add linearly if there’s a dominant/recessive relationship. *That,* you can get empirically.
    
    I’m still not sure how they come up with epistatic terms, but after some quick reading and some paper examples, at least some of them model these effects as normally distributed.
    
    And that seems like a place to hide under the rug. Of course if I imagine I have normally distributed, mean zero, interactions which are as likely to be positive as negative, they’ll tend to cancel out. I’m still uncertain of how we know we don’t live in a world that’s overwhelmingly full of negative epistasis (sublinear, the whole effect is less than the sum of individual effects, one part breaking wrecks the whole thing) or a world that’s overwhelmingly full of positive epistasis (superlinear, the whole effect is more than the sum of individual effects, parts are redundant and they all need to break before you get in trouble.)
    
    Are we *really* estimating additive vs. non-additive variation? Or are we assuming the conclusion?
    
    http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0045293
    http://www.plosgenetics.org/article/info%3Adoi%2F10.1371%2Fjournal.pgen.1003502
    - Douglas Knight says:
      
      August 24, 2013 at 12:30 pm
      
      I said no such thing. I have no idea what non-linear things people do. On the contrary, I say they are right to ignore non-linear effects on a first pass.
      
      No, as I said before, we are not assuming the conclusion. We have measured narrow sense heritability h^2 and know that a lot of the variance in height is additive. Even more may be due to non-linear effects; the broad-sense H^2 may be almost everything, but that doesn’t take away from the fact that huge amounts of variance are predicted by a weighted sum of genes. The only question is what are the coefficients, not whether this is a good model.
      
      Yes, by using linear models, we are throwing out some of the genetic variance, perhaps as much as half, but we know how much is left and we know that it is a lot, much more than is predicted by particular genes. When you use non-linear models to explain why h^2=0, you are trying to explain something that isn’t happening.
      
      (What is h^2 for height? I’m not sure. I think it has been measured to be in the range 0.3-0.9. That’s a big range, but it excludes the much smaller number that is explained by specific genes in GWAS. Thus they are missing many other genes.)
    - Douglas Knight says:
      
      August 24, 2013 at 3:15 pm
      
      Your comment (“They’re all morons!”) sounds like you are trying to explain why H^2 could be large while at the same time h^2 could be small. My point, both times, is that h^2 is not small. I thought that you were failing to know about the existence of h^2. But I missed the links at the bottom of this comment, which distinguish between h^2 and H^2. A further point that I should have made is that when I claim we have measured h^2 of height, I don’t mean by DNA means, but by comparing relatives, the pedigree method of the second paper. The point is that you can measure h^2, the sum across all genes, without knowing about any genes.
    - Sarah says:
      
      August 24, 2013 at 4:47 pm
      
      ok then I am confused. How do you measure h^2?
    - Douglas Knight says:
      
      August 25, 2013 at 11:00 am
      
      If you are a farmer, providing a uniform environment and breeding at random, narrow root-heritability h is simply the correlation between parental trait and child trait. It’s trickier in humans.
Sarah says:

August 23, 2013 at 7:40 pm

I am not a bioinformaticist and for all I know this has been done many times before, but the sensible thing to do seems like:

1.) pick a very small set of candidate loci with the highest coefficients in linear regression, and declare everything else irrelevant
2.) learn a nonlinear model to fit the data using only those loci
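
A rough sketch of those two steps in Python, on simulated data. Everything specific here is an assumption made for illustration, not something from the comment: the function names, the binary genotypes, the choice of single-locus regression slopes as the "coefficients", and the use of a plain lookup table (mean trait per genotype combination) as the nonlinear model. One caveat it makes visible: a locus that matters only through an interaction, with no marginal effect of its own, would be discarded in step 1.

```python
import random
from collections import defaultdict


def slope(xs, ys):
    """Least-squares slope of ys on xs; 0.0 if xs never varies."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        return 0.0
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def select_top_loci(genotypes, trait, k):
    """Step 1: keep the k loci with the largest absolute single-locus slope."""
    n_loci = len(genotypes[0])
    slopes = [slope([g[j] for g in genotypes], trait) for j in range(n_loci)]
    ranked = sorted(range(n_loci), key=lambda j: abs(slopes[j]), reverse=True)
    return sorted(ranked[:k])


def fit_lookup_model(genotypes, trait, loci):
    """Step 2: nonlinear fit on the kept loci only.

    The "model" is just the mean trait for each observed genotype
    combination, which captures any interaction among those loci.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for g, y in zip(genotypes, trait):
        key = tuple(g[j] for j in loci)
        totals[key] += y
        counts[key] += 1
    table = {key: totals[key] / counts[key] for key in totals}
    fallback = sum(trait) / len(trait)  # used for unseen combinations

    def predict(g):
        return table.get(tuple(g[j] for j in loci), fallback)

    return predict


# Simulated data (hypothetical): 40 binary loci, 2000 people; only loci 3
# and 7 matter, and they interact.
rng = random.Random(0)
genotypes = [[rng.randint(0, 1) for _ in range(40)] for _ in range(2000)]
trait = [g[3] + g[7] + 2 * g[3] * g[7] + rng.gauss(0, 0.5) for g in genotypes]

selected = select_top_loci(genotypes, trait, k=2)
predict = fit_lookup_model(genotypes, trait, selected)
print("selected loci:", selected)
print("prediction with both variants:", predict([1] * 40))
print("prediction with neither:", predict([0] * 40))
```

The lookup table only works because k is tiny; with more than a handful of loci the number of genotype combinations explodes and some other nonlinear model would be needed.
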
fiddlemath says:

August 24, 2013 at 12:32 am

Sorry for being content-free, but I love this, and may have to frame it and hang it:

I know other people have blogs where they explain things and readers bask in their wisdom, but having a blog where I say how confused I am about stuff and readers explain it to me has always worked for me before and I have faith it will continue to do so.
Andrew Ducker says:

August 24, 2013 at 12:35 am

For an awful lot of people the genetic potential is squandered. So they may well be potentially incredibly smart, but they aren’t getting anywhere near those limits. (Hence things like the Flynn effect.)

Bit like height, really: I’d expect genetic correlations to height to be really obvious now, but almost unnoticeable 200 years ago, because the diet was so bad that most people would be stunted from that.
- Randy M says:
  
  August 26, 2013 at 7:49 am
  
  Although that makes one wonder how such things were selected upon in the first place!
M. Wittig says:

August 24, 2013 at 8:25 pm

My pet theory is that all the failures to find genes associated with major diseases are because most of the heritable traits people care about are passed down via some epigenetic mechanism.
Kaleberg says:

August 24, 2013 at 9:42 pm

There are also all sorts of tradeoffs. Look at the recent study showing how rams trade off longer horns, which offer better breeding opportunities, against longer life, which offers more opportunities for breeding.

Human brains grow rapidly in youth, so why haven’t we evolved to maintain this growth into adulthood? Well, this high-speed growth introduces risks of epilepsy and other disorders, many of which vanish with maturity. Having added neural flexibility is not all it is cracked up to be. Many of these genes may be self-limiting and best expressed in moderation.

More seriously, these genes don’t encode for intelligence or 0.1% intelligence or whatever. They encode for structural, regulatory and developmental things. If your Lego set has 1,000 different shaped blocks, you can usually build more or less anything with 985 of them. Of course, some bastards are just out of luck.
Douglas Knight says:

August 26, 2013 at 2:01 pm

Two comments about mutational load. First about correlation, then about a single trait.

We should find the same people being very smart and very tall and very athletic and very healthy.

As I said before, you seem to reify mutational load. If the genome has separate genes for height and IQ, there should be no correlation between them, even if there are a lot of genes for each and both traits are well-described in terms of mutational load. Then a person would have a height load and an IQ load, which are unrelated.

The more genes that contribute to the two traits, the more likely that one gene contributes to both traits. Indeed, that is probably why the two traits are correlated. In that sense, mutational load is about correlation. For example, if IQ is affected by 1% of genes, and height by 1% of genes, independently, so that .01% of genes affect both, the correlation between the two traits would be 0.1.

As for mutational load resolving your confusion about quantitative traits, it appears to me that you have some intermediate idea (maybe involving correlations) that you think must explain quantitative traits, and so when you want to explain quantitative traits via mutational load, you claim that mutational load leads to this intermediate idea and that the intermediate idea solves your confusion. I think both steps are wrong. Moreover, my guess as to why you think mutational load is relevant is that the same people talk about both ideas. But that is because understanding quantitative traits is a prerequisite to understanding mutational load, not vice versa.
Bob Stuart says:

August 26, 2013 at 3:09 pm

My guess is that notable intelligence arises primarily from an efficient pattern in the synapses for processing a specialized type of problem. There are many different kinds of smarts, but a genius tends to focus on one. So, in genetics, one might look for patterns of genes that will work symbiotically.
Anders says:

July 18, 2014 at 12:46 am

A high number of very rare alleles, each with a relatively large negative effect. This is exactly what you would expect under purifying selection: most differences should be caused by uncommon ways of messing up. Take the formula for binomial variance, V[x] = np(1-p): if we set the expected value (np) to a constant (since we know median IQ is 100), then we find that the smaller p is, the higher the variance is. The reason these alleles wouldn’t be identified as having a strong effect is that they are individually too rare to be statistically significant.
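
To make the binomial point concrete, here is a small Python check. It is only a sketch: the function name and the fixed expected count of 100 are arbitrary choices for illustration, and it counts variant alleles rather than IQ points. With np held constant, the variance np(1-p) equals the mean times (1-p), so it rises toward the mean as p shrinks and never exceeds it.

```python
def variance_at_fixed_mean(mean, p):
    """Binomial variance n*p*(1-p), with n chosen so that n*p equals mean."""
    n = round(mean / p)  # number of loci needed to keep the expected count fixed
    return n * p * (1 - p)


expected_count = 100  # arbitrary illustrative value for np
for p in (0.5, 0.1, 0.01, 0.001):
    n = round(expected_count / p)
    var = variance_at_fixed_mean(expected_count, p)
    print(f"p={p:<6} n={n:<7} variance={var:.1f}")
```
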

It is perfectly possible that there are also many common genes with low effect, but by your own argument they wouldn’t contribute much to variance.
- Douglas Knight says:
  
  July 18, 2014 at 1:02 am
  
  How big a “relatively large” effect do you expect? Why?
  
  Steve Hsu estimates half a point per negative mutation.
