Slate Star Codex

Three Great Articles On Poverty, And Why I Disagree With All Of Them

QZ: The universal basic income is an idea whose time will never come. Okay, maybe this one isn’t so great. It argues that work is ennobling (or whatever), that robots probably aren’t stealing our jobs, that even if we’re going through a period of economic disruption we’ll probably adapt, and that “if the goal is eliminating poverty, it is better to direct public funds to [failing schools and substandard public services]” than to try a guaranteed income scheme. It ends by saying that “I can’t understand why we’d consider creating and then calcifying a perpetually under-employed underclass by promoting the stagnation of their skills and severing their links to broader communities.”

(imagine a world where we had created and calcified a perpetually under-employed stagnant underclass. It sounds awful.)

More Crows Than Eagles: Unnecessariat. This one is great. A blogger from the Rust Belt reports on the increasing economic despair and frustration all around her, in the context of the recent spikes in heroin overdoses and suicides. There’s an important caveat here, in that at least national-level economic data paint a rosy picture: the unemployment rate is very low, consumer confidence is high, and the studies of technological unemployment suggest it’s not happening yet. Still, a lot of people on the ground – the anonymous blogger, the pathologists she worked with, and me from my position as a psychiatrist in the Midwest – feel like there’s a lot more misery and despair than the statistics suggest. MCTE replaces the old idea of the “precariat” – people who just barely have jobs and are worried about losing them – with her own coinage “unnecessariat” – people who don’t have jobs, are useless to the economy, and nobody cares what happens to them. It reminds me of the old argument of sweatshop-supporting economists – sure, we’re exploiting you, but you’d miss us if we left. She hates Silicon Valley for building its glittering megaplexes while ignoring everyone else, but she hates even more the people saying “Learn to code! Become part of the bright new exciting knowledge economy!” because realistically there’s no way an opioid-dependent 55-year-old ex-trucker from Kentucky is going to learn to code. The only thing such people have left is a howl of impotent rage, and it has a silly hairstyle and is named Donald J. Trump.

Freddie deBoer: Our Nightmare. Also pretty great. The same things deBoer has been warning about for years, but expressed unusually clearly. By taking on the superficial mantle of center-leftism, elites sublimate the revolutionary impulse into a competition for social virtue points which ends up reinforcing and legitimizing existing power structures. Constant tally-keeping over what percent of obscenely rich exploitative Wall Street executives are people of color replaces the question of whether there should be obscenely rich exploitative Wall Street executives at all. As such tendencies completely capture the Democratic Party and the country’s mainstream left, genuine economic anger becomes more likely to be funneled into the right wing, where the elites can dismiss it as probably-racist (often with justification) and ignore it. “I cannot stress enough to you how vulnerable the case for economic justice is in this country right now. Elites agitate against it constantly…this is a movement, coordinated from above, and its intent is to solidify the already-vast control of economic elites over our political system…[Liberalism] is an attempt to ameliorate the inequality and immiseration of capitalism, when inequality and immiseration are the very purpose of capitalism.”

These articles all look at poverty in different ways, and I think that I look at poverty in a different way still. In the spirit of all the crazy political compasses out there, maybe we can learn something by categorizing them:

Including only people who think society should be in the business of collectively helping the poor at all (ie no extreme libertarians or social Darwinists) and people who are interested in something beyond deBoer’s nightmare scenario (ie not just making sure every identity group has an equal shot at the Wall Street positions).

People seem to split into a competitive versus a cooperative view of poverty. To massively oversimplify: competitives agree with deBoer that “inequality and immiseration are the very purpose of capitalism” and conceive of ending poverty in terms of stopping exploitation and giving the poor their “just due” that the rich have taken away from them. The cooperatives argue that everyone is working together to create a nice economy that enriches everybody who participates in it, but some people haven’t figured out exactly how to plug into the magic wealth-generating machine, and we should give them a helping hand (“here’s government-subsidized tuition to a school where you can learn to code!”). Probably nobody’s 100% competitive or 100% cooperative, but I think a lot of people have a tendency to view the problem more one way than the other.

So the northwest corner of the grid is people who think the problem is primarily one of exploitation, but it’s at least somewhat tractable to reform. No surprises here – these are the types who think that the big corporations are exploiting people, but if average citizens try hard enough they can make the Man pay a $15 minimum wage and give them free college tuition, and then with enough small victories like these they can level the balance enough to give everybody a chance.

(These are all going to be straw men, but hopefully useful straw men)

The southwest corner is people who think the problem is primarily one of exploitation, but nothing within the system will possibly help. I put “full communism” in the little box, but I guess this could also be anarcho-syndicalism, or anarcho-capitalism, or theocracy, or Trumpism, or [insert your preferred poorly-planned form of government which inevitably fails here].

The northeast corner is people who think we’re all in this together and there are lots of opportunities to help. This is the QZ writer who said we should be focusing on “education and public services”. The economy is a benevolent force that wants to help everybody, but some people through bad luck – poor educational opportunities, not enough childcare, racial prejudice – haven’t gotten the opportunity they need yet, so we should lend them a helping hand so they can get back on their feet and one day learn to code. I named this quadrant “Free School Lunches” after all those studies that show that giving poor kids free school lunches improves their grades by X percent, which changes their chances of getting into a good college by Y percent, which increases their future income by Z percent, so all we have to do is have lots of social programs like free school lunches and then poverty is solved. But aside from the lunch people, this category must also include libertarians who think that all we need to do is remove regulations that prevent the poor from succeeding, Reaganites who think that a rising tide will lift all boats, and conservatives who think the poor just need to be taught Traditional Hard-Working Values. Actually, probably 90% of the Overton Window is in this corner.

The southeast corner is people who think that we’re all in this together, but that helping the poor is really hard. They agree with the free school lunch crowd that capitalism is more the solution than the problem, and that we should think of this in terms of complicated impersonal social and educational factors preventing poor people from fitting into the economy. But the southeasterners worry school lunches won’t be enough. Maybe even hiring great teachers, giving everybody free health care, ending racism, and giving generous vocational training to people in need wouldn’t be enough. If we held a communist revolution, it wouldn’t do a thing: you can’t hold a revolution against skill mismatch. This is a very gloomy quadrant, and I don’t blame people for not wanting to be in it. But it’s where I spend most of my time.

The exploitation narrative seems fundamentally wrong to me – I’m not saying exploitation doesn’t happen, nor even that it isn’t common, just that it isn’t the major factor causing poverty and social decay. The unnecessariat article, for all its rage against Silicon Valley hogging the wealth, half-admits this – the people profiled have become unnecessary to the functioning of the economy, no longer having a function even as exploited proletarians. Silicon Valley isn’t exploiting these people, just ignoring them. Fears of technological unemployment are also relevant here: they’re just the doomsday scenario where all of us are relegated to the unnecessariat, the economy having passed us by.

But I also can’t be optimistic about programs to end poverty. Whether it’s finding out that schools and teachers have relatively little effect on student achievement, that good parenting has even less, or that differences in income are up to fifty-eight percent heritable and a lot of what isn’t outright genetic is weird biology or noise, most of the research I read is very doubtful of easy (or even hard) solutions. Even the most extensive early interventions have underwhelming effects. We can spend the collective energy of our society beating our head against a problem for decades and make no headway. While there may still be low-hanging fruit – maybe a scaled-up Perry Preschool Project, lots of prenatal vitamins, or some scientist discovering a new version of the unleaded-gasoline movement – we don’t seem very good at finding it, and I worry it would be at most a drop in the bucket. Right now I think that a lot of variation in class and income is due to genetics and really deep cultural factors that nobody knows how to change en masse.

I can’t even really believe that a rising tide will lift all boats anymore. Not only has GDP uncoupled from median wages over the past forty years, but there seems to be a Red Queen’s Race where every time the GDP goes up the cost of living goes up the same amount. US real GDP has dectupled since 1900, yet a lot of people have no savings and are one paycheck away from the street. In theory, a 1900s poor person who suddenly got 10x his normal salary should be able to save 90% of it, build up a fund for rainy days, and end up in a much better position. In practice, even if the minimum wage in 2100 is $200 an hour in 2016 dollars, I expect the average 2100 poor person will be one paycheck away from the street. I can’t explain this, I just accept it at this point. And I think that aside from our superior technology, I would rather be a poor farmer in 1900 than a poor kid in the projects today. More southeast corner gloom.

The only public figure I can think of in the southeast quadrant with me is Charles Murray. Neither he nor I would dare reduce all class differences to heredity, and he in particular has some very sophisticated theories about class and culture. But he shares my skepticism that the 55 year old Kentucky trucker can be taught to code, and I don’t think he’s too sanguine about the trucker’s kids either. His solution is a basic income guarantee, and I guess that’s mine too. Not because I have great answers to all of the QZ article’s problems. But just because I don’t have any better ideas [1, 2].

The QZ article warns that it might create a calcified “perpetually under-employed stagnant underclass”. But of course we already have such an underclass, and it’s terrible. I can neither imagine them all learning to code, nor a sudden revival of the non-coding jobs they used to enjoy. Throwing money at them is a pretty subpar solution, but it’s better than leaving everything the way it is and not throwing money at them.

This is why I can’t entirely sympathize with any of the essays I read on poverty, eloquent though they are.


1. And then there’s the rest of the world. Given the success of export capitalism in Korea, Taiwan, China, Vietnam, et cetera, and the pattern where multinationals move to some undeveloped country with cheap labor, boost the local economy until the country is developed and labor there isn’t so cheap anymore, and then move on to the next beneficiary – solving international poverty seems a lot easier than solving local poverty. All we have to do is keep wanting shoes and plastic toys. And part of me wonders – if setting up a social safety net would slow domestic economic growth – or even divert money that would otherwise go to foreign aid – does that make it a net negative? Maybe we should be optimizing for maximum economic growth until we’ve maxed out the good we can do by industrializing Third World countries? My guess is that enough of the basic income debate is about how to use existing welfare payments that this wouldn’t be too big a factor. And I would hope (for complicated reasons) that basic income would be more likely to help than hurt the economy [3].

2. Obviously invent genetic engineering and create a post-scarcity society, but until then we have to deal with this stuff.

3. And then there’s the whole open borders idea, which probably isn’t very compatible with basic income at all. Right now I think – I’ll explain at more length later – fully open borders is a bad idea, because the risk of it destabilizing the country and ruining the economic motor that lifts Third World countries out of poverty is too high.

OT50: Opentecost

This is the bi-weekly open thread. Post about anything you want, ask random questions, whatever. Also:

1. Thanks to Douglas Knight, who proposed a new system of open threads I think I’ll be using from now on. There is an Open Thread tag above. When you click it, you will be taken to the newest Open Thread. Once every second Sunday, the Open Thread will be posted publicly on the main blog like this one. On Wednesdays and the other Sunday, it will be posted quietly and appear only if you’re looking for it in the Open Thread section. So you can find a new Open Thread there starting this Wednesday the 25th.

2. I don’t plan to introduce Reddit-style voting on comments here because most people have said they don’t want it. If you do want it, there is a Greasemonkey script available that will let you have it, without affecting the comments viewed by everyone else.

3. I finally gave in and banned; abuses were just getting too annoying. Sorry.


Teachers: Much More Than You Wanted To Know

[Epistemic status: This is really complicated, this is not my field, people who have spent their entire lives studying this subject have different opinions, and I don’t claim to have done more than a very superficial survey. I welcome corrections on the many inevitable errors.]


Newspapers report that having a better teacher for even a single grade (for example, a better fourth-grade teacher) can improve a child’s lifetime earning prospects by $80,000. Meanwhile, behavioral genetics studies suggest that a child’s parents have minimal (non-genetic) impact on their future earnings. So one year with your fourth-grade teacher making you learn fractions has vast effects on your prospects, but twenty-odd years with your parents shaping you at every moment doesn’t? Huh? I decided to try to figure this out by looking into the research on teacher effectiveness more closely.

First, how much do teachers matter compared to other things? To find out, researchers take a district full of kids with varying standardized test scores and try to figure out how much of the variance can be predicted by what school the kids are in, what teacher’s class the kids are in, and other demographic factors about the kids. So for example if the test scores of two kids in the same teacher’s class were on average no more similar than the test scores of two kids in two different teachers’ classes, then teachers can’t matter very much. But if we were consistently seeing things like everybody in Teacher A’s class getting A+s and everyone in Teacher B’s class getting Ds, that would suggest that good teachers are very important.
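The comparison described above can be sketched as a one-way variance decomposition: what fraction of the total spread in scores is spread between class averages? This is only a minimal illustration, not what the cited teams did (they use much richer models with school and demographic terms). The function name and all scores are made up for the example.

```python
from statistics import mean


def share_of_variance_between_classes(classes):
    """Fraction of total score variance accounted for by class membership.

    `classes` maps a teacher's name to the list of that class's test scores.
    Returns between-class sum of squares divided by total sum of squares.
    """
    all_scores = [s for scores in classes.values() for s in scores]
    grand_mean = mean(all_scores)
    total_ss = sum((s - grand_mean) ** 2 for s in all_scores)
    between_ss = sum(
        len(scores) * (mean(scores) - grand_mean) ** 2
        for scores in classes.values()
    )
    return between_ss / total_ss


# Hypothetical scores: everyone in A's class aces the test, B's class does not.
teachers_matter = {"Teacher A": [95, 97, 99], "Teacher B": [62, 64, 66]}

# Hypothetical scores: two classmates are no more alike than two strangers.
teachers_do_not_matter = {"Teacher A": [60, 80, 100], "Teacher B": [60, 80, 100]}

print(share_of_variance_between_classes(teachers_matter))
print(share_of_variance_between_classes(teachers_do_not_matter))
```

In the first made-up district nearly all the variance sits between classes; in the second, none of it does. Real data lands somewhere in between, and the interesting question is where.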

Here are the results from three teams that tried this (source, source, source):

These differ a little in that the first one assumes away all noise (“unexplained variance”) and the latter two keep it in. But they all agree pretty well that individual factors are most important, followed by school and teacher factors of roughly equal size. Teacher factors explain somewhere between 5% and 20% of the variance. Other studies seem to agree, usually a little to the lower end. For example, Goldhaber, Brewer, and Anderson (1999) find teachers explain 9% of variance; Nye, Konstantopoulos, and Hedges (2004) find they explain 13% of variance for math and 7% for reading. The American Statistical Association summarizes the research as “teachers account for about 1% to 14% of the variability in test scores”, which seems about right.

So put more simply – on average, individual students’ level of ~~ability~~ grit is what makes the difference. Good schools and teachers may push that a little higher, and bad ones bring it a little lower, but they don’t work miracles.

(remember that right now we’re talking about same-year standardized test scores. That is, we’re talking about how much your fourth-grade history teacher affects your performance on a fourth-grade history test. If teacher effects show up anywhere, this is where it’s going to be.)

Just as it’s much easier to say “this is 40% genetic” than to identify particular genes, so it’s much easier to say “this is 10% dependent on school-level factors and 10% based on teacher-level factors” than to identify what those school-level and teacher-level factors are. The Goldhaber study above tries its best, but the only school-level variable they can pin down is that having lots of white kids in your school improves test scores. And as far as I can tell, they don’t look at socioeconomic status of the school or its neighborhood, which is probably what the white kids are serving as a proxy for. Even though these “school level effects” are supposed to be things like “the school is well-funded” or “the school has a great principal”, I worry that they’re capturing student effects by accident. That is, if you go to a school where everyone else is a rich white kid, chances are that means you’re a rich white kid yourself. Although they try to control for this, having a couple of quantifiable variables like race and income probably doesn’t entirely capture the complexities of neighborhood sorting by social class.

In terms of observable teacher-level effects, the only one they can find that makes a difference is gender (female teachers are better). Teacher certification, years of experience, degrees, et cetera have no effect. This is consistent with most other research, including Miller, McKenna, and McKenna (1998) and Goldhaber and Brewer (1998). A few studies that we’ll get to later do suggest teacher experience matters; almost nobody wants to claim certifications or degrees do much.

One measurable variable not mentioned here does seem to have a strong ability to predict successful teachers. I’m not able to access these studies directly, but according to the site of the US Assistant Secretary of Education:

The most robust finding in the research literature is the effect of teacher verbal and cognitive ability on student achievement. Every study that has included a valid measure of teacher verbal or cognitive ability has found that it accounts for more variance in student achievement than any other measured characteristic of teachers (e.g., Greenwald, Hedges, & Lane, 1996; Ferguson & Ladd, 1996; Kain & Singleton, 1996; Ehrenberg & Brewer, 1994).

So far most of this is straightforward and uncontroversial. Teachers account for about 10% of variance in student test scores, it’s hard to predict which teachers do better by their characteristics alone, and schools account for a little more but that might be confounded. In order to say more than this we have to have a more precise way of identifying exactly which teachers are good, which is going to be more complicated.


Suppose you want to figure out which teachers in a certain district are the best. You know that the only thing truly important in life is standardized test scores [citation needed], so you calculate the average test score for each teacher’s class, then crown whoever has the highest average as Teacher Of The Year. What could go wrong?

But you’ll probably just give the award to whoever teaches the gifted class. Teachers have classes with very different ability, and we already determined that ~~innate ability~~ grit explains more variance than teacher skill, so teachers who teach disadvantaged children will be at a big, uh, disadvantage.

So okay, back up. Instead of judging teachers by average test score, we can judge them by the average change in test score. If they start with a bunch of kids who have always scored around twentieth percentile, and they teach them so much that now the kids score at the fortieth percentile, then even though their kids are still below average they’ve clearly done some good work. Rank how many percentile points on average a teacher’s students go up or down during the year, and you should be able to identify the best teachers for real this time.
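Stripped of the fifty layers of statistics, the ranking rule in that paragraph looks something like the sketch below. The record format, class names, and percentiles are all invented for illustration; actual value-added models add many controls on top of this.

```python
from collections import defaultdict
from statistics import mean


def mean_gain_by_teacher(records):
    """Average change in percentile for each teacher's students.

    `records` is an iterable of (teacher, percentile_last_year,
    percentile_this_year) tuples, one per student.
    """
    gains = defaultdict(list)
    for teacher, before, after in records:
        gains[teacher].append(after - before)
    return {teacher: mean(student_gains) for teacher, student_gains in gains.items()}


# Hypothetical students: the gifted class scores high but barely moves,
# the struggling class scores low but climbs a lot.
records = [
    ("Gifted class", 95, 95),
    ("Gifted class", 90, 92),
    ("Struggling class", 20, 40),
    ("Struggling class", 22, 38),
]

# Rank teachers from largest to smallest average gain.
ranking = sorted(
    mean_gain_by_teacher(records).items(), key=lambda item: item[1], reverse=True
)
print(ranking)
```

Ranked by raw average score, the gifted-class teacher wins easily; ranked by average gain, the teacher of the struggling class comes out on top.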

Add like fifty layers of incomprehensible statistics and this is the basic idea behind VAM (value-added modeling), the latest Exciting Educational Trend and the lynchpin of President Obama’s educational reforms. If you use VAM to find out which teachers are better than others, you can pay the good ones more to encourage them to stick around. As for the bad ones, VAM opponents are only being slightly unfair when they describe the plan as “firing your way to educational excellence”.

A claim like “VAM accurately predicts test scores” is kind of circular, since test scores are what we used to determine VAM. But I think the people in this field try to use the VAM of class c to predict the student performance of class c + 1, or other more complicated techniques, and Chetty; Rothstein; and Rivkin, Hanushek, and Kane all find that a one standard deviation increase in teacher VAM corresponds to about a 0.1 standard deviation increase in student test scores.

Let’s try putting this in English. Consider an average student with an average teacher. We expect her to score at exactly the 50th percentile on her tests. Now imagine she switched to the best teacher in the whole school. My elementary school had about forty teachers, so this is 97.5th percentile, ie two standard deviations above the mean. A teacher whose VAM is two standard deviations above the mean should have students who score on average 0.2 standard deviations above the mean. Instead of scoring at the 50th percentile, now she’ll score at the 58th percentile.

Or consider the SAT, which is not the sort of standardized test involved in VAM but which at least everybody knows about. Each of its subtests is normed to a mean of 500 and an SD of 110. Our hypothetical well-taught student would go from an SAT of 500 to an SAT of 522. Meanwhile, the average SAT subtest score needed to get into Harvard is still somewhere around 740. So this effect is nonzero but not very impressive.
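For anyone who wants to check the arithmetic in the last two paragraphs, here is a small sketch. It assumes test scores and teacher VAM are both normally distributed, and takes the 0.1 SD per teacher SD figure and the 500/110 SAT norming from the text; the function names are just for illustration.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, SD 1
EFFECT_PER_TEACHER_SD = 0.1     # student SDs gained per teacher SD (from the text)


def student_percentile(teacher_vam_sd):
    """Expected percentile of an otherwise average student (assumes normality)."""
    student_shift_sd = EFFECT_PER_TEACHER_SD * teacher_vam_sd
    return 100 * STANDARD_NORMAL.cdf(student_shift_sd)


def sat_subtest_score(teacher_vam_sd, mean=500, sd=110):
    """Expected SAT subtest score of an otherwise average student."""
    return mean + sd * EFFECT_PER_TEACHER_SD * teacher_vam_sd


# Best of forty teachers sits at the 97.5th percentile, about 1.96 SD above the mean.
print(STANDARD_NORMAL.inv_cdf(0.975))

print(round(student_percentile(2)))  # 58
print(round(sat_subtest_score(2)))   # 522
```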

But what happens if we compound this and give this student the best teachers many years in a row? Sanders and Rivers (also Jordan, Mendro, and Weerasinghe) argue the effects are impressive and cumulative. They compare students in Tennessee who got good teachers three years in a row to similar students who got bad teachers three years in a row (good = top quintile; bad = bottom quintile, so only 1/125 students was lucky or unlucky enough to qualify). The average bad-bad-bad student got scores in the 29th percentile; the average good-good-good student got scores in the 83rd percentile – which based on the single-teacher results looks super-additive. This is starting to sound a lot more impressive, and maybe Harvard-worthy after all. In fact, occasionally it is quoted as “four consecutive good teachers would close the black-white achievement gap” (I’m not sure whether this formulation requires also assigning whites to four consecutive bad teachers).
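To see why this looks super-additive, one can compare the reported percentile gap with what the single-teacher results would predict. The sketch below leans on several assumptions that are mine, not the studies’: teacher VAM and test scores are normally distributed, the 0.1 SD per teacher SD rate is linear, and gains add up over three years with no decay at all (which is generous).

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()


def percentile_gap_in_sd(low_percentile, high_percentile):
    """Distance between two percentiles, in standard deviations."""
    return (STANDARD_NORMAL.inv_cdf(high_percentile / 100)
            - STANDARD_NORMAL.inv_cdf(low_percentile / 100))


def mean_of_top_fraction(fraction):
    """Mean of a standard normal variable, given it falls in the top `fraction`."""
    cutoff = STANDARD_NORMAL.inv_cdf(1 - fraction)
    return STANDARD_NORMAL.pdf(cutoff) / fraction


# Chance of three top-quintile (or three bottom-quintile) teachers in a row.
print((1 / 5) ** 3)  # about 1/125

# Reported gap: 29th percentile versus 83rd percentile.
observed_gap = percentile_gap_in_sd(29, 83)

# Average top-quintile teacher versus average bottom-quintile teacher, in teacher SDs.
teacher_gap = 2 * mean_of_top_fraction(0.2)

# Naive prediction: 0.1 student SD per teacher SD, three years, no decay.
additive_gap = 3 * 0.1 * teacher_gap

print(observed_gap, additive_gap)
```

Under these assumptions the reported gap comes out around 1.5 SD, against roughly 0.84 SD from simple addition, so the Tennessee numbers are close to double what the single-teacher results would suggest even before any decay.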

A RAND education report criticizes these studies as “using ad hoc methods” and argues that they’re vulnerable to double-counting student achievement. That is, we know that this teacher is the best because her students get great test scores; then later on we return and get excited over the discovery that the best teachers’ students get great test scores. Sanders and Rivers did some complicated things that ought to adjust for that; RAND runs simulations and finds that depending on the true size of teacher effects vs. student effects, those complicated things may or may not work. They conclude that “[Sanders and Rivers] provide evidence of the existence and persistence of teacher or classroom effects, but the size of the effects is likely to be somewhat overstated”.

Gary Rubinstein thinks he’s debunked Sanders and Rivers style studies. I strongly disagree with his methods – he seems to be saying that the correlation between good teaching and good test scores isn’t exactly one and therefore doesn’t matter – but he offers some useful data. Just by eyeballing and playing around with it, it looks like most of the gain from these “three consecutive great teachers” actually comes from the last great teacher. So the superadditivity might not be quite right, and Sanders and Rivers might just be genuinely finding bigger teacher effects than anybody else.

At what rate do these gains from good teachers decay?

They decay pretty fast. Jacob, Lefgren and Sims find that only 25% of gains carry on to the next year, and only 15% to the year after that. That is, if you had a great fourth grade teacher who raised your test scores by x points, in fifth grade your test scores will be 0.25x higher than they would otherwise have been. Kane and Rothstein find much the same. A RAND report suggests 20% persistence after one year and 10% persistence after two. Jacob, Lefgren, and Sims find that only 25% of gains remain after one year, and about 13% after two years, after which it drops off much more slowly. All of this contradicts Sanders and Rivers pretty badly.
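To make the persistence figures concrete, here is a sketch applying two of the reported schedules to the 0.1 SD same-year gain from a teacher one standard deviation above average. The dictionary layout and function name are illustrative only; the fractions are the ones quoted above.

```python
# Fraction of the original gain still present one and two years later.
PERSISTENCE = {
    "Jacob, Lefgren and Sims": (0.25, 0.15),
    "RAND": (0.20, 0.10),
}


def remaining_gain(initial_gain_sd, persistence_fraction):
    """Gain (in student SDs) left after applying a persistence fraction."""
    return initial_gain_sd * persistence_fraction


initial_gain_sd = 0.1  # same-year effect of a +1 SD teacher, from the text

for study, (after_one_year, after_two_years) in PERSISTENCE.items():
    print(
        study,
        remaining_gain(initial_gain_sd, after_one_year),
        remaining_gain(initial_gain_sd, after_two_years),
    )
```

So a gain that starts at a tenth of a standard deviation is down to a fortieth or a fiftieth of one a year later.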

None of these studies can tell us whether the gains go all the way to zero after a long enough time. Chetty does these calculations and finds that they stabilize at 25% of their original value. But this number is higher than the two-year number for most of the other studies, plus Chetty is famous for getting results that are much more spectacular and convenient than anybody else’s. I am really skeptical here. I remember a lot more things about last year than I do about twenty years ago, and even though I am pretty sure that my sixth grade teacher (for some weird reason) taught our class line dancing, I can’t remember a single dance step. And remember Louis Benezet’s early 20th century experiments with not teaching kids any math at all until middle school – after a year or two they were just as good as anyone else, suggesting a dim view of how useful elementary school math teachers must be. And even Chetty doesn’t really seem to want to argue the point, saying that his results “[align] with existing evidence that improvements in education raise contemporaneous scores, then fade out in later scores”.

In summary, I think there’s pretty strong evidence that a +1 SD increase in teacher VAM can increase same-year test scores by +0.1 SD, but that 50% – 75% of this effect decays in the first two years. I’m less certain how much these numbers change when one gets multiple good or bad teachers in a row, or how fully they decay after the first two years.


When I started looking for evidence about how teachers affected children, I expected teachers’ groups and education specialists to be pushing all the positive results. After all, what could be better for them than solid statistical proof that good teachers are super valuable?

In fact, these groups are the strongest opponents of the above studies – not because they doubt good teachers have an effect, but because in order to prove that effect you have to concede that good teaching is easy to measure, which tends to turn into proposals to use VAM to measure teacher performance and then fire underperformers. They argue that VAM is biased and likely to unfairly pull down teachers who get assigned ~~less intelligent~~ lower-grit kids.

It’s always fun to watch rancorous academic dramas from the outside, and the drama around VAM is really a level above anything else I’ve seen. A typical example is the blog VAMboozled! with its oddly hypnotic logo and a steady stream of posts like Kane Is At It Again: “Statistically Significant” Claims Exaggerated To Influence Policy. Historian/researcher Diane Ravitch doesn’t have quite as cute an aesthetic, but she writes things like:

VAM is Junk Science. Looking at children as machine-made widgets and looking at learning solely as standardized test scores may thrill some econometricians, but it has nothing to do with the real world of children, learning, and teaching. It is a grand theory that might net its authors a Nobel Prize for its grandiosity, but it is both meaningless in relation to any genuine concept of education and harmful in its mechanistic and reductive view of humanity.

But tell us how you really feel.

I was originally skeptical of this, but after reading enough of these sites I think they have some good points about how VAM isn’t always a good measure.

First, it seems to depend a lot on student characteristics; for example, it’s harder to get a high VAM in a class full of English as a Second Language students. It makes perfect sense that ESL students would get low test scores, but since VAM controls for prior achievement you might expect them to get the same VAM anyway. They don’t. Also, a lot of VAM models control for student race, gender, socioeconomic status, et cetera. I guess this is better than not doing this, but it seems to show a lack of confidence – if controlling for prior achievement was enough, you wouldn’t need to control for these other things. But apparently people do feel the need to control for this stuff, and at that point I bring up my usual objection that you can never control for confounders enough, and also that to some degree all these things are probably just lossy proxies for genetics, which you definitely can’t control for enough.

Maybe because of this, there’s a lot of noise in VAM estimates. Goldhaber & Hansen (2013) find that a teacher’s VAM in year t is correlated at about 0.3 with their VAM in year t + 1. A Gates Foundation study also found reliabilities from 0.19 to 0.4, averaging about 0.3. Newton et al get slightly higher numbers from 0.4 to 0.6; Bessolo a wider range from 0.2 to 0.6. But these are all in the same ballpark, and Goldhaber and Hansen snarkily note that standardized tests aimed at assessing students usually need correlations of 0.8 to 0.9 to be considered valid (the SAT, for example, is around 0.87). Although this suggests there’s some component of VAM which is stable, it can’t be considered to be “assessing” teachers in the same way normal tests assess students.

Even if VAM is a very noisy estimate, can’t the noise be toned down by averaging it out over many years? I think the answer is yes, and I think the most careful advocates of VAM want to do this, but President Obama wants to improve education now and a lot of teachers don’t have ten years’ worth of VAM estimates.
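One rough way to put numbers on “averaging it out” is the Spearman-Brown formula from classical test theory, which estimates the reliability of an average of several equally noisy measurements. Using it here is my assumption, not something the VAM studies above are confirmed to do, and it only holds if each year’s noise is independent of every other year’s.

```python
def reliability_of_average(single_year_reliability, years):
    """Spearman-Brown estimate for the reliability of a multi-year average.

    Assumes every year is an equally reliable measurement and that the
    noise in one year is independent of the noise in any other year.
    """
    r = single_year_reliability
    return years * r / (1 + (years - 1) * r)


# Starting from the roughly 0.3 year-to-year correlation quoted in the text.
for years in (1, 3, 5, 10):
    print(years, round(reliability_of_average(0.3, years), 2))
```

On those assumptions it takes about ten years of data for a 0.3 single-year reliability to reach the 0.8 range expected of student tests, which is the practical problem with averaging.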

Also, some teachers complain that even averaging it out wouldn’t work if there are consistent differences in student assignment. For example, if Ms. Andrews always got the best students, and Mr. Brown always got the worst students, then averaging ten years is just going to average ten years of biased data. Proponents argue that aside from a few obvious cases (the teacher of the gifted class, the teacher of the ESL class) this shouldn’t happen. They can add school-fixed effects into their models (eg control for average performance of students at a particular school), leaving behind only teacher effects. And, they argue, which student in a school gets assigned which teacher ought to be random. Opponents argue that it might not be, and cite Paufler and Amrein-Beardsley’s survey of principals, in which the principals all admit they don’t assign students to classes randomly. But if you look at the study, the principals say that they’re trying to be super-random – ie deliberately make sure that all classes are as balanced as possible. Even if they don’t 100% achieve this goal, shouldn’t the remaining differences be pretty minimal?

Maybe not. Rothstein (2009) tries to “predict” students’ fourth-grade test scores using their fifth-grade teacher’s VAM and finds that this totally works. Either schools are defying the laws of time and space, or for some reason the kids who do well in fourth grade are getting the best fifth-grade teachers. Briggs and Domingue not only replicate these effects, but find that a fifth-grade teacher’s “effect” on her students in fourth grade is just as big as her effect on her students when she is actually teaching them, which would suggest that 100% of VAM is bias. Goldhaber has an argument for why there are statistical reasons this might not be so damning, which I unfortunately don’t have enough understanding (or grit) to evaluate.

Genetics might also play a role in explaining these results (h/t Spotted Toad’s excellent post on the subject). A twin study by Robert Plomin does the classical behavioral genetics thing to VAM and finds that individual students’ nth grade VAM is about 40% to 50% heritable. That is, the change in your test scores between third and fourth grade will probably be more like the change in your identical twin’s test scores than like the change in your fraternal twin’s test scores.

At first glance, this doesn’t make sense – since VAM controls for past performance, shouldn’t it be a pretty pure measure of your teacher’s effectiveness? Toad argues otherwise. One of those Ten Replicated Findings From Behavioral Genetics is that IQ is more shared environmental in younger kids and more genetic in older kids. In other words, when you’re really young, how smart you are depends on how enriched your environment is; as you grow older, it becomes more genetically determined.

So suppose that your environment is predisposing you to an IQ of 100, but your genes are predisposing you to an IQ of 120. And suppose (pardon the oversimplification) that at age 5 your IQ is 100, at age 15 it’s 120, and change between those ages is linear. Then every year you could expect to gain 2 IQ points. Now suppose there’s another kid whose environment is predisposing her to an IQ of 130, but whose genes are predisposing her to an IQ of 90. At age 5 her IQ is 130, at age 15 it’s 90, and so every year she is losing 4 IQ points. And finally, suppose that your score on standardized tests is exactly 100% predicted by your IQ. Since you gain two points every year, in fifth grade you’ll gain two points on your test, and your teacher will look pretty good. She’ll get a good VAM, a raise, and a promotion. Since your friend loses four points every year, in fifth grade she’ll lose four points on her test, and her teacher will look incompetent and be assigned remedial training.

This critique meshes nicely with the Rothstein test. Since you’re gaining 2 points every year, Prof. Rothstein can use your 5th grade gains of +2 points to accurately predict your fourth grade gain of +2 points. Then he can use your friend’s 5th grade loss of -4 points to accurately predict her fourth grade loss of -4 points.
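The toy model above is simple enough to run. This is only a sketch of the oversimplified example in the text: the linear trajectory, the "test score equals IQ" rule, and the ages I attach to fourth and fifth grade (10 and 11) are all assumptions for illustration.

```python
def iq_at_age(env_iq: float, gene_iq: float, age: int) -> float:
    """Toy trajectory: IQ equals env_iq at age 5, gene_iq at age 15, linear in between."""
    return env_iq + (gene_iq - env_iq) * (age - 5) / 10


def yearly_gain(env_iq: float, gene_iq: float, age: int) -> float:
    """Test-score gain over the year ending at `age`, assuming score == IQ."""
    return iq_at_age(env_iq, gene_iq, age) - iq_at_age(env_iq, gene_iq, age - 1)


# (environment-driven IQ, gene-driven IQ) for the two kids in the example.
students = {"you": (100, 120), "friend": (130, 90)}

for name, (env, gene) in students.items():
    fourth_grade_gain = yearly_gain(env, gene, 10)  # assumed age at end of 4th grade
    fifth_grade_gain = yearly_gain(env, gene, 11)   # assumed age at end of 5th grade
    print(name, fourth_grade_gain, fifth_grade_gain)
# you 2.0 2.0
# friend -4.0 -4.0
```

There is no teacher anywhere in this model, yet the fifth-grade gain (which VAM would credit to the fifth-grade teacher) matches the fourth-grade gain exactly, which is the pattern the Rothstein test picks up.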

This is a very neat explanation. My only concern is that it doesn’t explain decay effects very well. If a fifth grade teacher’s time-bending effect on students in fourth grade is exactly the same as her non-time-bending effect on students in fifth grade, how come her effect on her students once they graduate to sixth grade will only be 25% as large as her fifth grade effects? How come her seventh-grade effects will be smaller still? Somebody here has to be really wrong.

It would be nice to be able to draw all of this together by saying that teachers have almost no persistent effects, and the genetic component identified by Plomin and pointed at by Rothstein represents the 15 – 25% “permanent” gain identified by Chetty and others which so contradicts my lack of line dancing memories. But that would be just throwing out Briggs and Domingue’s finding that the Rothstein effect explains 100% of identified VAM.

One thing I kept seeing in the best papers on this was an acknowledgement that instead of arguing “VAMs are biased!” versus “VAMs are great!”, people should probably just agree that VAMs are biased, just like everything else, and start figuring out ways to measure exactly how biased they are, then use that number to determine what purposes they are or aren’t appropriate for. But I haven’t seen anybody doing this in a way I can understand.

In summary, there are many reasons to be skeptical of VAM. But some of these reasons contradict each other, and it’s not clear that we should be infinitely skeptical. A big part of VAM is bias, but there might also be some signal within the noise, especially when it’s averaged out over many years.


So let’s go back to that study that says that a good fourth grade teacher can earn you $80,000. The study itself is Chetty, Friedman, and Rockoff (part 1, part 2). You may recognize Chetty as a name that keeps coming up, usually attached to findings about as unbelievable as these ones.

Bloomberg said that “a truly great” teacher could improve a child’s earnings by $80,000, but I think this is mostly extrapolation. The number I see in the paper is a claim that a 1 SD better fourth-grade teacher can improve lifetime earnings by $39,000, so let’s stick with that.

This sounds impressive, but imagine the average kid works 40 years. That means it’s improving yearly earnings by about $1,000. Of note, the study didn’t find this. They found that such teachers improved yearly earnings by about $300, but their study population was mostly in their late twenties and not making very much, and they extrapolated that if good teachers could increase the earnings of entry-level workers by $300, eventually they could increase the earnings of workers with a little more experience by $1000. The authors use a lot of statistics to justify this assumption which I’m not qualified to assess. But really, who cares? The fact that having a good fourth grade teacher can improve your adult earnings any measurable amount is the weird claim here. Once I accept that, I might as well accept $300, $1,000, or $500,000.
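For what it's worth, the back-of-envelope arithmetic looks like this. It is a sketch that ignores discounting and wage growth (which the paper handles with the statistics I said I can't assess); the 40-year career is the assumption from the paragraph above, and the function name is mine.

```python
def implied_yearly_gain(lifetime_gain: float, working_years: int) -> float:
    """Spread a lifetime earnings gain evenly over a career (no discounting)."""
    return lifetime_gain / working_years


per_year = implied_yearly_gain(39_000, 40)   # the paper's +1 SD teacher figure
print(per_year)                              # 975.0

observed_late_twenties = 300                 # dollars per year actually measured
print(per_year / observed_late_twenties)     # 3.25
```

So the headline number needs the measured effect to grow a bit more than threefold as workers gain experience.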

And here’s the other weird thing. Everyone else has found that teacher effects on test scores decay very quickly over time. Chetty has sort of found that up to 25% of them persist, but he doesn’t really seem interested in defending that claim and agrees that probably test scores just fade away. Yet as he himself admits, good teachers’ impact on earnings works as if there were zero fadeout of teacher effects. He and his co-authors write:

Our conclusion that teachers have long-lasting impacts may be surprising given evidence that teachers’ impacts on test scores “fade out” very rapidly in subsequent grades (Rothstein 2010, Carrell and West 2010, Jacob, Lefgren, and Sims 2010). We confirm this rapid fade-out in our data, but find that teachers’ impacts on earnings are similar to what one would predict based on the cross-sectional correlation between earnings and contemporaneous test score gains.

They later go on to call this a “pattern of fade-out and re-emergence”, but this is a little misleading. The VAM never re-emerges on test scores. It only shows up in the earnings numbers.

All of this is really dubious, and it seems like Section III gives us an easy way out. There’s probably a component of year-to-year stable bias in VAM, such that it captures something about student quality, maybe even innate ability, rather than just teacher quality. It sounds very easy to just say that this is the component producing Chetty’s finding of income gains at age 28; students who have higher innate ability in fourth grade will probably still have it in their twenties.

Chetty is aware of this argument and tries to close it off. He conducts a quasi-experiment which he thinks replicates and confirms his original point: what happens when new teachers enter the school?

The thing we’re most worried about is bias in student selection to teachers. If we take an entire grade of a school (for example, if a certain school has three fifth-grade teachers, we take all three of them as a unit) this should be immune to such effects. So Chetty looks at entire grades as old teachers retire and new teachers enter. In particular, he looks at such grades when a new teacher transfers from a different school. That new transfer teacher already has a VAM which we know from his work at the other school, which will be either higher or lower than the average VAM of his new school. If it’s higher and VAM is real, we should expect the average VAM of that grade of his new school to go up a proportionate amount. If it’s lower and VAM is real, we should expect the average VAM of that grade of his new school to go down a proportionate amount. Chetty investigates this with all of the transfer teachers in his data, finds this is in fact what happens, and finds that if he estimates VAM from these transfers he gets the same number (+ $1000 in earnings) that he got from the normal data. This is impressive. Maybe even too impressive. Really? The same number? So there’s no bias in the normal data? I thought there was a lot of evidence that most of it was bias?
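To make "a proportionate amount" concrete, here is a minimal sketch with made-up numbers: a hypothetical fifth grade with three teachers, one of whom is replaced by a transfer whose VAM is already known from another school. The VAM values and the function name are invented for illustration; none of them come from Chetty's data.

```python
def grade_mean_after_transfer(current_vams, leaving_index, incoming_vam):
    """Mean VAM of a grade's teachers after one of them is replaced by a transfer."""
    updated = list(current_vams)
    updated[leaving_index] = incoming_vam
    return sum(updated) / len(updated)


# Hypothetical teacher VAMs, in units of student test-score SDs.
before = [0.0, 0.1, -0.1]
mean_before = sum(before) / len(before)
mean_after = grade_mean_after_transfer(before, leaving_index=2, incoming_vam=0.2)

# If VAM measures something real about teachers, the grade's average gain
# should move by (incoming - outgoing) / number_of_teachers.
print(round(mean_after - mean_before, 3))   # 0.1
```

The quasi-experiment checks whether the whole grade's results actually shift by about that predicted amount, which sidesteps the question of which student got which teacher within the grade.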

Rothstein is able to replicate Chetty’s findings using data from a different district, but then he goes on to do the same thing on Chetty’s quasi-experiment as he did on the normal VAMs, with the same results. That is, you can use the amount a school improves when a great new fifth-grade teacher transfers in to predict that teacher’s students’ fourth-grade performance. Not perfectly. But a little. For some reason, teacher transfers are having the same freaky time-bending effects as other VAM. Rothstein mostly explains this by saying that Chetty incorrectly excluded certain classes and teachers from his sample, although I don’t fully understand this argument. He also gives one other example of when this might happen: suppose that a neighborhood is gentrifying. The new teachers who transfer in after the original teachers retire will probably be a better class of professional lured in by the improving neighborhood. And the school’s student body will also probably be more genetically and socioeconomically advantaged. So better transfer teachers will be correlated with higher-achieving kids, but they won’t have caused such high achievement.

After this came an increasingly complicated exchange between Rothstein and Chetty that I wasn’t able to follow. Chetty, Friedman, and Rockoff wrote a 52 page Response To Rothstein where they argued that Rothstein’s methodology would find retro-causal effects even in a fair experiment where none should exist. According to a 538 article on the debate, a couple of smart people (albeit smart people who already support VAMs and might be biased) think that Chetty’s response makes sense, and even Rothstein agrees it “could be” true. 538 definitely thought the advantage in this exchange went to Chetty. But Rothstein responded with a re-replication of his results that he says addresses Chetty’s criticisms but still finds the retro-causal effects indicating bias; as far as I know Chetty has not responded and nobody has weighed in to give me an expert opinion on whether or not it’s right.

My temptation would usually be to say – here are some really weird results that can’t possibly be true which we want to explain away, here’s a widely-respected Berkeley professor of economics who says he’s explained them away, great, let’s forget about the whole thing. But there’s one more experiment which I can’t dismiss so easily.


Project STAR (Student Teacher Achievement Ratio) was a big educational experiment in the 80s and 90s to see whether or not smaller class size improved student performance. That’s a whole different can of worms, but the point is that in order to do this experiment for a while they randomized children to kindergarten classes within schools across 79 different schools. Since one of the biggest possible sources of bias for these last few studies has been possible nonrandom assignment of students to teachers, these Tennessee schools were an opportunity to get much better data than were available anywhere else.

So Chetty, Friedman, Hilger, Saez, Schanzenbach, and Yagan analyzed the STAR data. They tried to do a lot of things with predicting earnings based on teacher experience, teacher credentials, and other characteristics, and it’s a bit controversial whether they succeeded or not – see Bryan Caplan’s analysis (1, 2) for more. Caplan is skeptical of a lot of the study, but one part he didn’t address – and which I find most convincing – is based on something a lot like VAM.

Because of the random assignment, Chetty et al don’t have to do full VAM here. It looks like their measure of kindergarten teacher quality is just the average of all their students’ test scores (wait, kindergarteners are taking standardized tests now? I guess so.) When they’re using teacher quality to predict the success of specific students, they use the average of all the test scores except that of the student being predicted, in order to keep it fair.
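The "everyone except the student being predicted" trick is a leave-one-out mean. A minimal sketch, using made-up percentile scores for a four-student class; the function name is mine, and the real analysis also compares each class against other classes in the same school, which this does not attempt.

```python
def leave_one_out_means(scores):
    """For each student, the mean score of everyone else in the class."""
    if len(scores) < 2:
        raise ValueError("need at least two students")
    total = sum(scores)
    n = len(scores)
    # Subtracting the student's own score keeps it out of their "class quality".
    return [(total - s) / (n - 1) for s in scores]


class_scores = [40, 55, 70, 85]   # hypothetical percentile scores
print(leave_one_out_means(class_scores))
# [70.0, 65.0, 60.0, 55.0]
```

Each student's measure of class quality is built only from their classmates, so a student's own score cannot mechanically predict itself.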

They find that the average test score of all the other students in your class, compared against the average score of all the students in other randomly assigned classes in your school, predicts your own test score. “A one percentile increase in entry-year class quality is estimated to raise own test scores by 0.68 percentiles, confirming that test scores are highly correlated across students within a classroom”. This fades to approximately zero by fourth grade, confirming that the test-score-related benefits of having a good teacher are transient and decay quickly. But, students assigned to a one-percentile-higher class have average earnings that are 0.4% higher at age 25-27! And they say that this relationship is linear! So for example, the best kindergarten teacher in their dataset caused her class to perform at the 70th percentile on average, and these students earned about $17000 on average (remember, these are young entry-level workers in Tennessee) compared to the $15500 or so of their more average-kindergarten-teacher-having peers. Just their kindergarten teacher, totally apart from any other teacher in their life history, increased their average income 10%. Really, Chetty et al? Really?
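A quick sanity check on those numbers, as a sketch only: I am assuming the "more average" comparison class sits at the 50th percentile, which the text does not state, and taking the 0.4% per percentile and the linearity claim at face value.

```python
PCT_EARNINGS_PER_PERCENTILE = 0.4   # percent, from the quoted result


def predicted_earnings_gain_pct(class_percentile, baseline_percentile=50):
    """Linear extrapolation of the earnings effect; the baseline of 50 is an assumption."""
    return (class_percentile - baseline_percentile) * PCT_EARNINGS_PER_PERCENTILE


print(predicted_earnings_gain_pct(70))           # 8.0
print(round((17_000 / 15_500 - 1) * 100, 1))     # 9.7
```

The linear rule gives about 8% and the raw dollar figures give about 9.7%, so both land in the same ballpark as the roughly 10% quoted above.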

But as crazy as it is, this study is hard to poke holes in. Even in arguing against it, Caplan notes that “it’s an extremely impressive piece”, that “the authors are very careful”, and that it’s “one of the most impressive empirical papers ever written”. The experimental randomization means we can’t apply most of the usual anti-VAM measures to it. I don’t know, man. I just don’t know.

Okay, fine. I have one really long-shot possibility. Chetty et al derive their measure for teacher quality from the performance of all of the students in a class, excluding each student in turn as they try to predict his or her results. But this is only exogenous if the student doesn’t affect his or her peers’ test scores. But it’s possible some students do affect their peers’ test scores. If a student is a behavioral problem, they can screw up the whole rest of their class. Carrell finds that “exposure to a disruptive peer in classes of 25 during elementary school reduces earnings at age 26 by 3 to 4 percent”. Now, this in itself is a crazy, hard-to-believe study. But if we accept this second crazy hard-to-believe study, it might provide us with a way of attacking the first crazy hard-to-believe study. Suppose we have a really screwed-up student who is always misbehaving in class and disrupting the lesson. This lowers all his peers’ test scores and makes the teacher look low-quality. Then that kid grows up and remains screwed-up and misbehaving and doesn’t get as good a job. If this is a big factor in the differences in performances between classes, then so-called “teacher quality” might be conflated with a measure of how many children in their classes are behavioral problems, and apparent effects of teacher quality on earnings might just represent that misbehaving kids tend to become low-earning adults. I’m not sure if the magnitude of this effect checks out, but it might be a possibility.

But if we can’t make that work, we’re stuck believing that good kindergarten teachers can increase your yearly earnings by thousands of dollars. What do we make of that?

Again, everybody finds that test score gains do not last nearly that long. So it can’t be that kindergarten teachers provide you with a useful fund of knowledge which you build upon later. It can’t even be that kindergarten teachers stimulate and enrich you which raises your IQ or makes you love learning or anything like that. It has to be something orthogonal to test scores and measurable intellectual ability.

Chetty et al’s explanation is that teachers also teach “non-cognitive skills”. I can’t understand the regressions they use, but they say that although a one percentile increase in kindergarten class quality has a statistically insignificant effect (+0.05 percentiles) on 8th grade test scores, it has a statistically significant effect (+0.15 percentiles) on 8th grade non-cognitive scores (“non-cognitive scores” in this case are a survey where 8th grade teachers answer questions like “does this student annoy others?”). They then proceed to demonstrate that the persistence of these non-cognitive effects does a better job of predicting the earning gains than the test scores do. They try to break these non-cognitive effects into four categories: “effort”, “initiative”, “engagement” and “whether the student values school”, but the results are pretty boring and about equally loaded on all of them.

This does go together really well with my “behavioral problem” theory of the kindergarten class-earnings effect. The “quality” of a student’s kindergarten class, which might have more to do with the number of students who were behavioral problems in it than anything else, doesn’t correlate with future test scores but does correlate with future behavioral problems. It also seems to match Plomin’s point about how very early test scores are determined by environment, but later test scores are determined by genetics. A poor learning environment might be a really big deal in kindergarten, but stop mattering as much later on.

But this also goes together with some other studies that have found the same. The test score gains from pre-K are notorious for vanishing after a couple of years, but a few really big preschool studies like the Perry Preschool Program found that such programs do not boost IQ but may have other effects (though to complicate matters, apparently Perry did boost later-life standardized test scores, just not IQ scores, and to further complicate matters, other studies find children who went to pre-K have worse behavior). This also sort of reminds me of some of the very preliminary research I’ve been linking to recently suggesting that excessively early school starting ages seem to produce an ADHD-like pattern of bad behavior and later-life bad effects, which I was vaguely willing to attribute to overchallenging kids’ brains too early while they’re still developing. If I wanted to be very mean (and I do!) I could even say that all kindergarten is a neurological insult that destroys later life prospects because of forcing students to overclock their young brains concentrating on boring things, but good teachers can make this less bad than it might otherwise be by making their classes a little more enjoyable.

But even if this is true, it loops back to the question I started with: there’s strong evidence that parents have relatively little non-genetic impact on their children’s life outcomes, but now we’re saying that even a kindergarten teacher they only see for a year does have such an impact? And what’s more, it’s not even in the kindergarten teacher’s unique area of comparative advantage (teaching academic subjects), but in the domain of behavioral problems, something that parents have like one zillion times more exposure to and power over?

I don’t know. I still find these studies unbelievable, but don’t have the sort of knock-down evidence to dismiss them that I’d like. I’m really impressed with everybody participating in this debate, with the quality of the data, and with the ability to avoid a lot of the usual failure modes. It’s just not enough to convince me of anything yet.


In summary: teacher quality probably explains 10% of the variation in same-year test scores. A +1 SD better teacher might cause a +0.1 SD year-on-year improvement in test scores. This decays quickly with time and probably disappears entirely after four or five years, though there may also be small lingering effects. It’s hard to rule out the possibility that other factors, like endogenous sorting of students, or students’ genetic potential, contribute to this as an artifact, and most people agree that these sorts of scores combine some signal with a lot of noise. For some reason, even though teachers’ effects on test scores decay very quickly, studies have shown that they have significant impact on earnings as much as 20 or 25 years later, so much so that kindergarten teacher quality can predict thousands of dollars of difference in adult income. This seemingly unbelievable finding has been replicated in quasi-experiments and even in real experiments and is difficult to banish. Since it does not happen through standardized test scores, the most likely explanation is that it involves non-cognitive factors like behavior. I really don’t know whether to believe this and right now I say 50-50 odds that this is a real effect or not – mostly based on low priors rather than on any weakness of the studies themselves. I don’t understand this field very well and place low confidence in anything I have to say about it.

Further reading: Institute of Education Sciences summary, Edward Haertel’s summary, TTI report, Adler’s critique of Chetty, American Statistical Association’s critique of Chetty/VAM, Chetty’s response, Ballou’s critique of Chetty

Open Thread 49.75

Not really sure where I’m going with this, but let’s try it.

Open Thread 49.5

This is an experiment with more open threads. Post about anything you want, ask random questions, whatever.

Skin In The Game


One of the most interesting responses I got to my post supporting the junior doctors strike was by Salem, who said that this situation was (ethically) little different than that around adjunct professors, who also become overworked and miserable trying to break into a high-status profession. Salem very kindly didn’t directly accuse me of hypocrisy, but maybe he should have.

While I sympathize with adjuncts’ terrible conditions, my natural instinct is to say feedback mechanisms should keep doing their work. You can probably trace the argument: imagine a simplified toy model where the only two jobs are professor and salesperson, and being a professor is fun and high-status but being a salesperson is boring and low-status. Everyone will try to become a professor, and this will decrease the demand for professors and increase the demand for salespeople until the employers involved change their policies accordingly. Eventually it will stabilize where the nonmonetary advantages of being a professor are perfectly compensated by the monetary advantages of being a salesperson. If professors are getting paid shockingly little, it means the system is sending a signal that the nonmonetary advantages of being a professor are shockingly high, or else why would people keep trying? If we demand that professors get paid more, then we’re letting them keep all their nonmonetary advantages over salespeople but demanding they have monetary advantages as well. It destroys the system’s incentives to have people go into less fun but nevertheless necessary fields.

All of this makes perfect sense in the adjunct case. So why do I feel so differently in the doctor case?

Maybe for personal reasons. When I was in college, my two dueling career plans were doctor and philosophy professor. I brought this up with my professors, who universally told me not to go into academia. They told me that it was grueling, thankless, and for the vast majority of people involved doomed to failure, and that they couldn’t in good conscience advise me to try it. I listened to their advice and became a doctor instead. It might not have quite the I-can’t-believe-they’re-paying-me-to-do-this amazingness of debating metaethics all day, but I still love it and it has much better career prospects.

So I guess you could argue that one reason I have less sympathy for adjuncts is that letting them achieve their goal would be a kick in the face to Past-Scott, who made what he considered the sober choice and went into a better-paying profession. If it turns out all I had to do was hang on a few years, and then the government would decree that people who got paid to argue about metaethics had to have career prospects as good as doctors’, then I was a huge chump to try to do things the hard way. Maybe it’s that fear of chumpness that makes it harder to sympathize.

And maybe the reason I feel such solidarity with doctors is that it’s not supposed to be a profession you go into knowing you have no hope. Healing the sick is a lot more practical and socially-subsidizable an activity than pondering Truth; it seems like the sort of thing it should be easy to get paid for. Here in America, this is the conventional wisdom: make it into medicine, and you’re promised a pretty good career. So maybe my solidarity with British doctors is a big cultural misunderstanding. Maybe, coming from America, I’ve absorbed a social promise that doctors will be treated well (which is true in America), but in Britain those doctors go into medical school knowing 100% that their lives will be unbearable and their compensation minuscule. Maybe they do it anyway because for them medicine is as much of an I-can’t-believe-they’re-paying-me-to-do-this as philosophy is to me.

Suppose somebody tells me that before going to medical school, every doctor in the British Isles has to sign a Waiver Of Appreciation Of Consequences, which spells out in excruciating detail all of the horrible things about a medical career. It says exactly how many 100-hour weeks they’ll be expected to work, exactly how many 36-hour shifts they’ll take, and does its best to give them an idea how much senior doctors will abuse them. Maybe there’s even a video portion that has interviews with disgruntled current doctors describing in explicit detail all the worst things that have happened to them. Maybe they even have to shadow current junior doctors, work 36-hour shifts themselves even though they can’t do any procedures yet, just so that they can never claim that they didn’t understand on an experiential level what a 36-hour shift means. Suppose the government (unlike in the real world) does this 100% fairly and accurately, and then they stick to it – ie as bad as things are, they never get any worse than the video promised. And suppose that after all of this, representatives from Private Industry come in to discuss their options for non-medical jobs, and explain how conditions in these jobs will probably be much better. If a medical student signs this waiver, completes their training, graduates, hates their job, and demands better pay and conditions, can we have any sympathy for them?

(I’m talking about a hypothetical world here. In the real world, we have to consider that lots of people didn’t know what they were going into, and the thing they were going into keeps getting worse than they were led to expect. This is purely a Least Convenient Possible World argument here.)

The first argument we might use is that instead of focusing on the virtue of the employee, we should focus on the problems with society. That is, who cares whether the doctor made a good choice or not? Society still owes people decent jobs and working conditions. I think I reject this argument. There are some decent jobs attainable by the sorts of people who could become doctors (this is not true for everyone, but it’s pretty true of the people who could become doctors), and not every alternative has to be great for everybody. Suppose there are many entry-level gardening jobs available, but I become a skyscraper-window-washer. Then I complain because I am afraid of heights, and I want special accommodations for this. When you say “maybe you should try a different line of work”, I say “stop making this about my virtue and start focusing on the problems with society such that it can’t give everyone decent conditions.” It’s not a very sympathetic argument. Society is totally allowed to have jobs that not everybody would enjoy as long as it gives everybody a wide range of options.

(one might argue that nobody could possibly enjoy being a junior doctor, but weirdly enough this is false. There are some people who love it. I usually assume these people are on cocaine, but maybe some of them aren’t.)

The second argument we might use is that instead of focusing on the virtue of the employee, we should focus on distortions in the economy. Who decided to have a medical career consist of ten to twenty years of misery followed by an elusive opportunity to get a really nice job with great pay and hours? Isn’t part of the problem that hospitals aren’t competing for junior doctors? Can we oppose dualization of the labor force as just a generally bad idea? I think this argument is right, but I also think there’s still a lingering question of “Okay, but since we do have this terrible industry with bad incentives, should we feel sympathy for people who voluntarily place themselves right in the middle of it?”

The third argument we might use is that we sometimes need to save people from their terrible decisions. Just as we ban people from permanently selling themselves into slavery, no matter how aware they are of the consequences when they make the deal, so maybe we should assume that signing the Waiver Of Appreciation of Consequences is an irrational decision motivated by time discounting and that people’s future selves ought to be freed from the tyranny of their past selves’ poor judgment. There’s definitely some truth to this, but it’s also a little too close to the policies of the BETA-MEALR Party for comfort. And doesn’t this ignore that doctors aren’t slaves, and can leave the medical profession any time they want? Even if there are some reasons they can’t (difficulty finding other jobs, sunk cost effects) wouldn’t it be less distortionary to smooth their path to non-medical careers than to try to reform medicine?

All of this makes sense. And yet as a psychiatrist, I constantly have people come into my office saying their jobs are making them suicidal. You would think somebody would leave a job before it makes them suicidal, but this doesn’t always happen (the same is true of relationships, by the way). My suggestions that maybe they find a different job or boyfriend tend to fall on deaf ears.

I think one big reason I am so much more sympathetic to doctors than adjuncts is that I know them and have worked with them and I can see a reality – both a real reality and an emotional reality – that makes the model look kind of weak and threadbare.


A few years ago I rented a house. In the rental contract, it said “tenant must have their own rental insurance to cover damages to the property”. I reminded my landlord that rental insurance mostly just covers damages to the renter’s stuff, and that the landlord usually has a separate property insurance to cover damages to the property, and that the way he was doing it might not even be legal. He just said yes it was and I had better get the rental insurance. I made a mental note to get the rental insurance and then got distracted by everything else in life and never got around to it.

A few years later, a pipe burst and the house flooded. The damage was assessed at way more than I had the ability to pay. I told the landlord he had better fix it, and the landlord told me I had better get my rental insurance to fix it for me – ie, the rental insurance I had forgotten to get.

There followed a gigantic disaster. I asked some lawyers whether the landlord was legally required to insure his own property, and they gave me vague and conflicting answers, then all agreed that even if he was it would cost more money than I had to fight the case in court. I ended up on the verge of breakdown. The landlord was clearly deliberately stonewalling me, trying to make it as hard as possible for me to figure out what was going on in the hopes that I gave in and gave him the money I didn’t have. Eventually, after several months of living out of a friend’s spare room while I tried to get the fate of the house sorted out, we settled out of court for a big fraction of my life’s savings.

(I later got some evidence that the landlord did have the property insured and was running some kind of insurance scam trying to get the money out of both me and his insurance, but that’s a different story and I’m still not sure either way)

The reason I bring this up is that for the duration of the crisis, and to a lesser degree even to the present, I was utterly convinced that the government had an obligation to make sure landlords insure their properties. Offering a contract where the tenant was responsible for insuring the property? Totally unacceptable. Maybe even a human rights violation. What’s the new phrase people are using these days? “A denial of your right to exist”? Even so.

It’s easy to craft the argument where I’m in the wrong. Different landlords should be allowed to experiment with different arrangements as long as their terms are clearly listed in their contracts and the other contracting party agrees to them. I signed the contract then failed to do what the contract said, and clearly I needed to pay the penalty. We can imagine some nanny-state laws that might ameliorate that – for example, if a contract says a tenant must purchase insurance, it’s the landlord’s responsibility to make sure the insurance was actually purchased – but the arguments for implementing these laws are on really shaky ground.

And all I’m saying is that in the middle of this crisis, I had no sympathy for any of this. In the middle of this crisis, my thought was that I had worked really hard for years to get a little bit of money saved up, I was going to lose all of it for some burst pipe that wasn’t my fault, this was cosmically unfair, somebody needed to do something about it, the landlord was a big company that probably had millions of dollars, and somebody needed to do something about this right now.


An article by Freddie deBoer in this month’s Current Affairs proposes “Journalistic Self-Outsourcing”. DeBoer notes that lots of journalists and intellectuals suggest that protectionism and other anti-globalization policies are immoral. For example, Zack Beauchamp of Vox calls Bernie Sanders’ skepticism of free trade “screwing the global poor”; Brad DeLong calls the same “a call to keep China a society of poor subsistence rice farmers as long as possible – keep them poor, barefoot, uneducated, and by no means allow them to work at any of the high-value manufacturing occupations we want to keep in the United States.”

DeBoer has a few things to say about how we should take money away from the rich in a way that can help both poor Americans and poor Chinese, but he quickly transitions into the barb he clearly relishes: if Beauchamp and DeLong are so in favor of poor Chinese people, how come they haven’t outsourced their jobs? They both earn quite a bit of money, and China must have some decent journalists and economists who would love to telecommute to well-paid American positions. Are DeLong and Beauchamp hypocrites for not arranging to expedite the transition of their jobs to Chinese people?

One argument in their favor: they’re currently living in a relatively globalized world, their employers must have the option of replacing them with Chinese people, but both of them still have jobs. This suggests that maybe journalism and economics aren’t as replaceable as deBoer thinks. Beauchamp might be able to retort “As soon as Vox finds a Chinese journalist who’s as good as I am they’re welcome to hire him/her, but until then I can stay on, secure in the consistency of my principles.” Presumably he can say this without fear, since the subtleties of being in touch with the pulse of the American people and writing English-language articles would elude most lower-class Chinese.

But this lets them off too easily for purely contingent reasons. Even granting that they can’t be replaced with Chinese people right away, are they forced to will their replacement with Chinese people, deep in their hearts? Should Beauchamp go into work every morning asking his boss whether he’s found a suitable Chinese person to replace him with yet, and be disappointed every time his boss says no?

Again, there’s a contingent argument otherwise. If you’re really pro-globalization, you might believe that it’s impossible for the Chinese to take all our jobs, in the same way that the Luddite Fallacy says it’s impossible for robots to take all our jobs. The more Chinese take manufacturing jobs, the more Americans will have lots of money which will encourage new service jobs that the Chinese can’t easily take. If this is Beauchamp’s argument, he could say that we should globalize all the jobs that can be globalized, including his if possible, and then he will just move to an unglobalizable job. Since his job hasn’t been globalized yet, maybe he’s already in an unglobalizable job, so people should just stop bothering him.

But this is still too contingent. Let’s least convenient possible world again, and suppose that economists determine that there are permanent negative effects on American jobs from Chinese globalization, those effects disproportionately go to the poorest Americans, and no new unglobalizable jobs arise to restore cosmic balance. Now does Beauchamp have to will his own replacement?

Okay, argument from the other side: suppose a 1980s version of Beauchamp is writing an article against apartheid in South Africa. The South African whites argue that if apartheid ends, they’ll be competing for jobs against much poorer blacks and so their quality of life will go way down. They say that if Beauchamp really wants to end apartheid, he himself should give up his journalist position to a South African black who will take it for a few dollars a day.

I’m both very convinced that the right thing to do in that situation would be to fight apartheid, and also convinced that the South Africans would be right about the personal jab – the median American journalist who pushed the fight wouldn’t want his job taken over by Zulus willing to work for lower wages. When I put myself in that situation, and imagine myself being undercut by foreigners willing to practice medicine for almost nothing, I’m pretty pissed off about the idea too.

So maybe we should just let the journalist keep being a hypocrite. Journalists are pretty privileged people, and if we call journalists hypocrites every time they stand up for the less privileged without giving up all their own privilege, we’ll probably just end up with journalists who stop standing up for the less privileged.


Consider two contradictory arguments.

The first says that people who experience a problem have unique insight into it, and people who don’t experience it debate it from a position of ignorance or callousness. Thus a rich person can say “The poor don’t need help because they can pull themselves up by their own bootstraps”, but a poor person knows this is harder than it sounds. There’s a really boring sense of this in which the poor person may know specific facts the rich person is missing – for example, the rich person might falsely think welfare is much more generous than it is. But then a rich person who’s read a lot of books on poverty might claim to have a better perspective than the poor person, since she would likely know more about the welfare system. The more interesting claim is that there’s a sort of near-mode-vs-far-mode thing going on here, where things that are easy to dismiss in the abstract become a lot more relevant in the concrete. A really good example of this is Hitchens’ waterboarding – he said he thought it was an acceptable interrogation technique, some people offered to waterboard him to see if he changed his mind, and he quickly did. I’m fascinated by this incident because it’s hard for me to understand, in a Mary’s Room style sense, what he learned from the experience. He must have already known it was very unpleasant – otherwise why would you even support it as useful for interrogation? But somehow there’s a difference between having someone explain to you that waterboarding is horrible, and undergoing it yourself. Just so with everything else. Under this view, the only people we should trust to tell us whether junior doctors get treated fairly are junior doctors; the only people we should trust to tell us whether globalization is acceptable are people whose own jobs are on the line.

The second says that people with skin in the game are the last people we should be trusting. Who do you trust to tell you how many subsidies the government should give the oil companies? Some economist in the budget watchdog organization who calculates exactly what the costs and benefits are? Or an oil company CEO who says “Trust me, we need lots and lots of money”? Under this argument, everyone has access to logic and reason, people detached from the situation are able to use it, and people within the situation are (perhaps excusably) motivated by self-interest. We can perhaps understand the fears of the white South African who thinks he’ll lose his job and end up bankrupt, but it’s our job as dispassionate external observers to notice that his concerns are outweighed by the concerns of other people whom his self-interest makes him unable to understand. And the last person you want giving you a sober cost-benefit analysis of torture is the person who is being waterboarded – everyone knows a waterboarding victim will say anything at all to make it stop!

The second argument obviously has its uses, but I’m fascinated by the first. In the few cases where I have direct experience with it, it seems to bring knowledge beyond just “this is really bad”. In fact, the anti-junior-doctor argument seems resistant to just learning medicine is worse than you thought – if anything, medicine being very bad makes the argument stronger, since it means the doctor should be extra quick to listen to feedback and go into a different career. But somehow experience with doctors has made me much more reluctant to believe that argument. In the same way, one could imagine Hitchens saying “Yes, this waterboarding is really unpleasant – good thing that means it’ll be really easy to make the terrorists want to talk” – but that wasn’t the conclusion he drew from it.

I guess the thing I’m not sure about is – does personal experience/“skin in the game” reduce fully to factual propositions? Does a factory worker have an advantage over a journalist in understanding globalization just because he knows that being laid off is really bad, and that it’s harder to get a new job than a journalist thinks – two things we would expect any journalist worth their salt to already know about? Or is there some hard-to-communicate knowledge that’s neither factual nor just a cover for “the secret hard-to-communicate knowledge that I am selfish and want a system that benefits me rather than other people”?

Awkwardly, my far mode says that there isn’t and my near mode says that there is.


[Epistemic status: I am not British, it’s been years since I’ve been in the HSE, and the HSE is not the NHS. All of this may be misunderstood or outdated.]

I don’t usually blog on labor disputes here, but I want to talk about one on which I have a tiny bit of inside knowledge.

Last month junior doctors in Britain went on strike for two days, protesting imposition of a new contract. There’s a lot of anger about this, and admittedly when you’re being rushed by ambulance into the emergency department for sudden onset chest pain, “doctors are on strike today” is not something you want to hear. My normal instincts would be to question whether this is really necessary. My experience tells me it is.

“Oh, you’re a junior doctor. Of course you would support a doctor’s strike.” Okay, but I’m not a British junior doctor. I work in America, where I would describe conditions as “tough, but fair”. Sure, Dr. Cox yells at you a lot, but only because he secretly thinks you’re one of the best doctors ever to pass through the doors of this hospital. My own specialty of psychiatry is a lot better than most and overall I have little to complain about in my own life.

But that’s not to say that I don’t have any special knowledge here. I went to medical school in Ireland, where I worked alongside junior doctors in a system very much based off of the British one. And it was pretty shocking.

Technically European law caps junior doctor work weeks at 48 hours a week. Then again, technically American law caps junior doctor work weeks at 80 hours a week. My first week on a non-psychiatry service as an American junior doctor, I worked a bit over 100 hours – and so did everybody else I encountered. When I asked about the law, everyone just gave me that “oh you sweet summer child” look.

Such caps seem to be honored more in the breach than in the observance, and this is the British custom too. Physicians Weekly describes it as “the 48 hour trainee work week sham”, and the Telegraph and The Daily Mail both seem to agree that many British doctors are working 100 hour weeks. They seem to circumvent the law either by giving them a few weeks off afterwards and saying it “averages” to 48 hours/week, or else by doing what my hospital did – carefully schedule a 48 hour week in big bold letters, assign 100 hours worth of work, and then get angry if anyone goes home before their work is done.

Many of the junior doctors I worked with in Ireland were working a hundred hours a week. It’s hard to describe what working 100 hours a week is like. Saying “it means you work from 7 AM to 9 PM every day including weekends” doesn’t really cut it. Imagine the hobbies you enjoy and the people you love. Now imagine you can’t spend time on any of them, because you are being yelled at as people die all around you for fourteen hours a day, and when you get home you have just enough time to eat dinner, brush your teeth, possibly pay a bill or two, and curl up in a ball before you have to go do it all again, and your next day off is in two weeks.

And this is the best case scenario, where everything is spaced out nice and even. The junior doctors I knew frequently worked thirty-six hour shifts at a time (the European Court of Human Rights has since declined to fine Ireland for this illegal practice). Dr. Brid McGrath (my lab partner in medical school) has been collecting some stories for the Irish media:

My stories are like my colleagues’ stories: working through illness, personal turmoil, and deprivation of sleep, food and toilet breaks. The worst stint was working 73 hours within an 82 hour period. I have been bullied, and to my shame, bullied others. I realised I was falling into the trap of treating others the way I had been treated. My self esteem faltered and I began to believe I truly was a nasty person. I had the insight to get help, but not everyone is so lucky.

I came to talk to you about the imminent arrival of your very premature baby, at just 24 weeks. I held your hand and passed you tissues as we talked about his name, how tiny he was, how hard his life could be, but how we would try to give him the best possible chance… and how we might also have to accept the reality that he might not make it. That same day, I’d worked a 28-hour shift while I was 24 weeks pregnant myself. I fought back tears before I saw you. I worried about how I would cope with your pain and distress, barely able to think about the baby growing inside me. I had dinner at 1am, and worked on. An incredible nurse sat me down for a glass of water. She had to force me to. I was so busy.

The other night, after a particularly busy 16-hour shift in the Emergency Department and Theatre, I went up to the wards to take blood samples for a patient. I’d had no dinner. A patient in the same room was handing around coconut buns, and gave me one. I inhaled it, it smelt so good. She then pushed the box into my hands and said “You look like you need them more than me!”

Imagine having to decide between going to the bathroom or getting a bag of crackers from the vending machine because you don’t have enough time between cases to do both. Imagine having to remember the difference between nephritic syndrome and nephrotic syndrome (two totally different things) after ten hours of work, after getting three hours of sleep the night before. Imagine that you’ve just admitted a neurotic old woman to the hospital and you know in your heart that you should take her hand and explain to her in a soothing voice that everything is going to be okay, except that you already feel like every nerve of yours is beaten raw and you have three patients left to go before you can so much as sit down for a few minutes. Imagine your attending yelling at you because you got something wrong and saying you need to spend more time studying, and you trying to keep your mouth shut instead of telling him that you literally have only a half-hour in the day that could be considered free time by even the broadest stretch of the imagination and you are damned if you are going to spend that studying endocrinology.

The psychological consequences are predictable: after one year, 55% of junior doctors describe themselves as burned out, 30% meet criteria for moderate depression, and 12% report considering suicide.

A lot of American junior doctors are able to bear this by reminding themselves that it’s only temporary. The worst part, internship, is only one year; junior doctorness as a whole only lasts three or four. After that you become a full doctor and a free agent – probably still pretty stressed, but at least making a lot of money and enjoying a modicum of control over your life.

In Britain, this consolation is denied most junior doctors. Everyone works for the government, and the government has a strict hierarchy of ranks, only the top of which – “consultant” – has anything like the freedom and salary that most American doctors enjoy. It can take ten to twenty years for junior doctors in Britain to become consultants, and some never do. In Ireland (I don’t know about the UK) there was a very scary distinction between “training” and “service” positions, the former of which were always in short supply. Imagine that you’re a freshman in college, and your university announces that due to budget cutbacks there are only about half as many sophomore positions available this year, so the top fifty percent of freshmen can go on to become sophomores, and the rest will have to stay freshmen until more money comes in. Also, there are no other colleges in the entire country so you have no choice but to follow along and hope for the best. This is what being a junior doctor is like.

Faced with all this, many doctors in Britain and Ireland have made the very reasonable decision to get the heck out of Britain and Ireland. The modal career plan among members of my medical school class was to graduate, work the one year in Irish hospitals necessary to get a certain certification that Australian hospitals demanded, then move to Australia. In Ireland, 47.5% of Irish doctors had moved to some other country. The situation in Britain is not quite so bad but rapidly approaching this point. Something like a third of British emergency room doctors have left the country in the past five years, mostly to Australia, citing “toxic environment” and “being asked to endure high stress levels without a break”. Every year, about 2% of British doctors apply for the “certificates of good standing” that allow them to work in a foreign medical system, with junior doctors the most likely to leave. Doctors report back that Australia offers “more cash, fewer hours, and less pressure”. I enjoy a pretty constant stream of Facebook photos of kangaroos and the Sydney Opera House from medical school buddies who are now in Australia and trying to convince their colleagues to follow in their footsteps.

Upon realizing their doctors are moving abroad, British and Irish health systems have leapt into action by…ignoring all systemic problems and importing foreigners from poorer countries who are used to inhumane work environments. I worked in some rural Irish towns where 99% of the population was white yet 80% of the doctors weren’t; if you have a heart attack in Ireland and can’t remember what their local version of 911 is, your best bet is to run into the nearest mosque, where you’ll find all the town’s off-duty medical personnel conveniently gathered together. This seems to be true of Britain as well, with the stats showing that almost 40% of British doctors trained in a foreign country (about half again as high as the US numbers, even though the US is accused of “stealing the world’s doctors” – my subjective impression is that foreign doctors try to come to the US despite barriers because they’re attracted to the prospect of a better life here, but that they are actively recruited to Britain out of desperation). Many of the doctors who did train in Britain are new immigrants who moved to Britain for medical school – for example, the Express finds that only 37% of British doctors are white British (the corresponding number for America is something like 50-65%, even though America is more diverse than Britain). While many new immigrants are great doctors, the overall situation is unfortunate since a lot of them end up underemployed compared to their qualifications in their home country, or trapped in the lower portions of the medical hierarchy by a combination of racism, language difficulties, and just the fact that everyone is trapped in the lower portions of the medical hierarchy these days.

If Britain continues along its current course, it’ll probably be able to find more desperate people willing to staff its medical services after even more homegrown doctors move somewhere else (70% say they’re considering it, although we are warned not to take that claim at face value). I work with several British and Irish doctors in my hospital here in the US Midwest, they’re very talented people, and we could always use more of them. But this still seems like just a crappy way to run a medical system.

I don’t know anything about the latest dispute that has led to this particular strike in Britain. Both sides’ positions sound reasonable when I read about them in the papers. I would be tempted to just split the difference, if not for the fact several years of medical work in the British Isles have taught me that everything that a government health system says is vile horrible lies, and everybody with a title sounding like “Minister of Health” or “Health Secretary” is an Icke-style lizard person whose terminal value is causing as many humans to die of disease as possible. I can’t overstate the importance of this. You read the press releases and they sound sort of reasonable, and then you talk to the doctors involved and they tell you all of the reasons why these policies have destroyed the medical system and these people are ruining their lives and the lives of their patients and how they once shook the Health Secretary’s hand and it was ice-cold and covered in scales. I don’t know how much of this is true. I just think of it as something in the background when the health service comes up to doctors and says “Hey, we have this great new deal we want to offer you!”

(I remember reporting into the hospital one day and seeing almost a carnival atmosphere, and one surgeon who had never been known to do anything but yell at his subordinates gave me a friendly nod and smile as he passed me in the corridor, and I started to worry I had walked into some Stepford Wives bizarro-world. Finally I learned that, the evening before, the Irish health minister had resigned in disgrace. This is the only time anyone ever saw that surgeon happy.)

[EDIT: a strong argument that the junior doctors have the right of it and the NHS’ position is based on a misunderstanding of patient care statistics here]

Whatever caused this latest dispute is probably relevant mostly as a straw that breaks the camel’s back. If British junior doctors today are anything like the Irish junior doctors of a few years ago, all of their complaints are legitimate and they’re also hiding several dozen other legitimate complaints you have yet to hear about. I sort of sympathize with the government’s complaints that they don’t have enough money to make a system where doctors don’t have to work a bunch of 36 hour shifts, but I feel like if you don’t have enough money to run a health system that treats its employees like human beings, maybe you shouldn’t be running your country’s health system.

Labor disputes suck, and I have no good theory of them. Part of me is outraged at people being mistreated, and another part of me worries about a world where anybody who can convince the media that they’re being oppressed can force other people into paying them whatever amount of money they think they deserve. I long for some kind of principled system that will solve these problems more elegantly than letting everybody shout their grievances at each other and seeing which ones stick. I long for something that will take care of the deeper problems underlying unfair labor practices like dualization of entire industries. This is why I find libertarian ideas like letting competition among firms determine people’s pay and conditions so attractive.

But these may or may not work, insofar as they do work they only work in certain situations, and insofar as they do work under certain situations a 100% socialized industry run as a government monopoly probably isn’t one of them. So we’ve got to do the thing where people get mistreated and have to cry out for redress of their grievances. And my experience tells me the grievances of British junior doctors are copious, horrifying, and entirely valid.

Links 5/16: Linko de Mayo

The Theory Of Deadly Initials proposed that people whose initials spelled out negative words, like D.I.E. or B.A.D., died earlier because of the associated stress. People believed this for years before someone figured out it was all based on bad statistics.

Jamie Brew has become Internet-famous for his predictive text generator that makes hilarious mishmash out of sources like the political debates (“I am in this campaign for the sake of the four largest people in the history of the world, people who should have a lot of healthcare”). But how come he is able to do this so much better than anybody else armed with a Markov chain and a source text? Some kind of shiny new machine learning algorithm? Rationalist Tumblr user @nostalgebraist investigates and bursts all our dreams by finding that nope, it’s mostly done by good old human judgment.

This seems unbelievable to me, so I challenge readers to tell me how to reconcile my perceptions with the data: of all candidates (including Trump), Hillary Clinton has received the most negative media coverage.

You know those Neuro drinks that are on sale everywhere and promise to lift your mood or help you relax or whatever? They’re now paying $500,000 for misleading advertising. Sounds like a pretty fair decision to this psychiatrist.

BMJ: a large study from 1973 found that replacing saturated fat with vegetable oil did not decrease death from coronary disease, but the results sat in a file drawer for forty years. And the New York Times’ popular presentation of same.

Although shared environment has kind of gotten the short end of the stick in recent behavioral genetics studies, it still shows up sometimes in early childhood and in studies done on the most deprived populations. But what percent of that is prenatal versus postnatal environment? Abstract, table of results. Most interesting finding: adopted adults’ IQ is so unrelated to the IQ of their adoptive mother that in some studies the correlation shows up as nonsignificantly negative.

There’s been some past discussion here about Success Academy, a chain of charter schools that has achieved impressive results. Freddie deBoer argues this will never scale because their business model is hiring a tiny number of elite teachers who have just graduated from top colleges for really cheap, luring them with promises of social impact and getting to live in desirable areas. This might work – have the best teachers teach poorer students and those poor students will do well – but it doesn’t scale beyond the tiny number of elite teachers willing to work in those conditions. I find this idea plausible but far from proven – first of all because the schools themselves say it’s their (easily scalable) discipline policies that lead to their success, and because the research on the importance of teacher quality seems mixed.

A while back I posited a utopian online future of automated machine learning filters that prevent you from ever having to see trolls. Now Hugh Hancock makes the case for pessimism by positing a dystopian online future of automated machine learning trolls.

I can’t improve on this title: Reflections On Reasons for Reduced Rates of Replicability.

A while ago I got a bit paranoid about some kind of deliberate conspiracy to prevent working class people from getting jobs painlessly, and how the government used bureaucracy to smite any opportunity that arose outside this system. This probably isn’t going to help my paranoia: San Francisco to require Uber and Lyft drivers to obtain business licenses.

Related: Google, Ford, Uber, Lyft, Volvo, etc, form lobbying group for self-driving cars. I’d forgotten that people could also lobby in favor of things I want!

Classic Programmer Paintings dot tumblr dot com.

Scientific American: Scott Aaronson Answers Every Ridiculously Big Question I Throw At Him. I disagree with John Horgan about a lot, sometimes vehemently, but man can he do a good science interview.

Andrew Gelman dissects a study on airplane inequality. And Asheley Landrum dissects a study on Ted Cruz and bullshit.

Scientist suggests that quantizing inertia would explain flyby anomaly and make the EmDrive not contradict physics. Anyone want to tell me if this is crazy or not? (EDIT: probably crazy)

Marginal Revolution: Regulatory Arbitrage, Rent-Seeking, and the Deal Of The Year. Why did the Real Estate Board of New York give its Ingenious Deal Of The Year Award to somebody who literally destroyed value with a wrecking ball for no economic reason? And what does it say about our society that they were right to do so? An interesting companion piece to some of what I talked about in my review of Art of the Deal.

Correlation of -0.68 between “rule of law” in a country as defined by the World Justice Project, versus road accident deaths per capita in that country. Is this something boring, like better governments making better road systems, or everything about countries always being correlated by development anyway? Or some more fundamental connection between people following the rules while driving and following the rules while governing? I’d say “paging Garett Jones” except that I think I got this link from his Twitter.

Vox: Inequality As Waste. Discusses increasingly costly signaling in terms of houses, weddings, and parties as a multipolar trap in which everybody has to keep up with a small group of increasingly super-rich Joneses.

Study: “About 40% of studies fail to fully report all experimental conditions and about 70% of studies do not report all outcome variables included in the questionnaire. Reported effect sizes are about twice as large as unreported effect sizes and three times more likely to be statistically significant.”

Vox’s profile of Mencius Moldbug is a thing that exists. Nick Land praises it as “almost saintly in its attempt to get the phenomenon right”. Ross Douthat responds in the NYT calling reaction potentially “something genuinely new…a vision as strange and motley as reality itself.”

Also in the NYT, this time by Amanda Hess: “Those who try to signal their wokeness by saying ‘woke’ have revealed themselves to be very unwoke indeed.” I am deeply grateful to have a bubble that mostly insulates me from the sort of people for whom this is a problem.

I had a fun time presenting Plomin’s paper Top Ten Replicated Findings From Behavioral Genetics to a room full of psychoanalysts last month, then fielding their increasingly angry and horrified questions. But this group might be more in need of the (partial) antidote, Turkheimer’s Weak Genetic Explanations 20 Years Later, which I endorse as the most pessimistic about genetic explanations it is possible to be while still being 100% intellectually honest.

In the context of recent papers finding the global warming “hiatus” is real after all, David Friedman notes that he has been predicting this for years, and further predicts (if I understand correctly) that the warming trend should return with a vengeance around 2030.

The percent of Americans who identify as environmentalist has gone down from 78% in 1991 to 42% today! I find this really surprising, and indeed, Gallup notes that how Americans actually feel about environmentalist issues has changed much less or not at all. So what’s going on here? One possibility: global warming has so eclipsed all other environmental concerns that the mainstream environmentalist movement has entirely folded into the anti-global-warming movement, which doesn’t have a catchy name or identitarian label. But I wonder if there’s something deeper going on here – something like environmentalism so permeating the culture that normal people stop identifying with it and the term becomes more relegated to an extremist fringe. How might that relate to other political movements?

Speaking of how people self-identify: did you know the average self-identified vegetarian eats one serving of meat per day? Or that 60% of self-identified vegetarians say they’ve eaten meat in the past 24 hours? Related: Rational Conspiracy on cost-effectiveness of vegetarianism.

Rational Conspiracy: whatever you do, don’t subscribe to the Boston Globe.

New n = 9,000 blinded resume study finds no preference for white over black or Hispanic applicants, contradicting previous research. Before you get too excited, I think there’s a lot of previous research this contradicts, so more studies are needed. Also, they signaled black race by using the last names “Washington” or “Jefferson”, instead of previous studies that had used first names like “Jamal” or “DeShawn”. While people convincingly argued that Jamal and DeShawn might be less popular among employers than the average black person, I worry that “Washington” and “Jefferson”, while indeed disproportionately black names, may not be black enough to effectively signal blackness. On the other hand, the Hispanic names were “Hernandez” and “Garcia”; you’d think that would have worked.

Related: “implicit racist attitudes” as measured by Implicit Association Tests do not actually predict whether someone will racially discriminate or not, are of questionable meaningfulness.

r/SubRedditSimulator is a subreddit made entirely of bots; each bot generates posts and comments based off of predictive text from a different subreddit. 8th post is by the r/CrazyIdeas bot: “Open a pizzeria that only serves food made by two different parasites fighting for control in our solar system by detonating calculated explosions near the soda fountain…”

Popehat attorney Marc Randazza files a legal brief about Klingon, partly in Klingon, supporting a very Klingon conception of copyright law.

President Obama makes a Red Wedding joke at the White House Press Correspondents Dinner, threatens to have security bar the doors and take out all the Republicans in the room. Funny in context, but I appreciated Pax Dickinson’s commentary – our history of drone strikes on Pakistan is pretty grim, and jokes about killing everybody at a wedding are less funny when the person making them has actually done that before.

Weasel shuts down Large Hadron Collider in the most blatant act of animal aggression against the particle physics community since a bird dropped a baguette into CERN machinery and a conspiracy of raccoons took down Fermilab.

Aptly-named Impossible Foods says it will have a high-tech vegetarian burger as good as the real thing available at select restaurants this July. No word on when it’ll be available direct to consumers.

Did you know: light bulb manufacturers maintained an honest-to-goodness conspiracy to prevent the introduction of longer-lasting bulbs. I would say this should increase our concern about this sort of thing happening today, except the conspiracy lasted barely ten years before other companies managed to undercut them, so maybe it should decrease our concern.

The price of solar power has decreased 50% in 16 months. Maybe. There’s a lot of complicated stuff about subsidized versus unsubsidized power and I’m not sure it’s an apples-to-apples comparison. But there’s some very impressive claim about solar power that’s true. Sometimes it seems like technologies only have two possible modes – stagnant for decades, or doubling every eighteen months.

David Chapman always has posts that are structurally brilliant and revelatory until I sit back and think about them later and realize I don’t know what half the terms in them mean and I am just assuming they are brilliant and revelatory because they are put together in a way which is a superstimulus for formally correct thought. His latest, A Bridge To Meta-Rationality Vs. Civilizational Collapse, is a typically engaging and impressive example of the genre. I really wish I knew more about post-modernism, or that somebody who does would write an engaging and meaningful introduction.

A scuba diver petting a moray eel, with relevant commentary here.

In 1737, William Penn’s children made a (shady, possibly forged or forced) treaty with the Lenape Indians that granted white settlers all territory within thirty-six hours’ walk from the Lehigh River. Then they hired the fastest power-walkers and best surveyors in the colony to cover as much ground as humanly possible within thirty-six hours. The history of the Walking Purchase.

Scott Aaronson and a student find that the 7918th Busy Beaver number is unknowable. This is a fun read even for someone like me who only understands the tiniest fraction of what’s going on. I think it is about a function which proceeds from being finite, knowable, and known to being Godelian and unknowable in an orderly fashion in a finite number of steps (apparently, less than 7918). If I’m understanding this right, my brain hurts.

The French company behind the TGV supertrain has invested 80 million euros in the Hyperloop.

Tow truck owner refuses to tow Bernie Sanders supporter. This is the world you people have built for us.

Oddly prescient Onion from 2012: Shrieking White Hot Sphere Of Pure Rage Early GOP Front-Runner For 2016. Between this and the Long National Nightmare Of Peace And Prosperity article I’m starting to think the Onion employs Nostradamus.

A roundup of everybody who said Trump could never win the nomination so we can laugh at them for being wrong. There’s actually an important rationality lesson here, which is that a person who said Trump had only a 20% chance of winning the nomination (like Nate Silver) may in fact be perfectly virtuous – things with only a twenty percent chance of happening do happen one in every five times. By extension, even a person who said there was only a 0.000001% chance of Trump winning the nomination may be virtuous, although it’s pretty unlikely. I am less contemptuous of anybody who provided a number, and more contemptuous of the sort of people who said “Anyone who thinks Trump might win the nomination is an idiot and shouldn’t be taken seriously”. SSC’s own (rather late) prediction was 60% chance he would be the nominee – an earlier pseudo-prediction was non-numerical and very carefully hedged.
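One way to make the “provided a number” point concrete is to score each forecast by how surprised it was when the event happened. The sketch below uses a logarithmic score, which is my choice of yardstick and not something the roundup itself uses; the probabilities are the ones quoted above, and the function name is hypothetical.

```python
import math


def surprisal_bits(p_assigned: float) -> float:
    """Bits of surprise when an event given probability p_assigned actually happens."""
    if not 0.0 < p_assigned <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    return -math.log2(p_assigned)


# Probabilities quoted in the paragraph above, converted from percentages.
forecasts = {
    "20% forecast": 0.20,
    "60% forecast": 0.60,
    "0.000001% forecast": 0.000001 / 100,
}

for name, p in forecasts.items():
    print(f"{name}: {surprisal_bits(p):.1f} bits of surprise")
```

A single 20% miss costs only a couple of bits, which is why one outcome cannot convict a forecaster; the 0.000001% forecast costs more than ten times as much, which is why it is “pretty unlikely” to have been well calibrated.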

French study shows diversity causes social anomie, but I get kind of suspicious when “social anomie” is treated as a quantified study endpoint. Related: contra usual conventional wisdom, study suggests that ethnic diversity does not decrease support for redistribution, except maybe in special cases involving recent immigrants.

What do actual epigenetics professors and researchers think of the pop epigenetics that always gets cited in the media as the hot new explanation for social phenomena? Jerry Coyne collects some biting responses.

Egypt Independent – “Salah Abdel Sadeq, head of the State Information Service, has blamed the spread of violence and extremism in the Arab world on Tom & Jerry cartoons and video games.” The fun thing about this is that every time another culture blames their problems on the way things are portrayed in the media, it sounds hilarious, but whenever our culture does it people find it totally plausible. Related: Mexican Congresswoman Declares War On Memes

The Open Philanthropy Project has declared that AI risk will be one of their major priorities this year, an important development given both their levels of funding/talent/connections, and their reputation as a gold standard for analysis of what charitable opportunities are important. Especially interesting given that the OPP leader who wrote the report, Holden, was previously one of MIRI’s strongest critics – he notes that “my views on this cause have evolved considerably over time”, though it’s also important to note a lot of his criticisms were MIRI-specific rather than related to the entire field.

Has the more charismatic candidate really won every one of the last thirteen presidential elections?

The theologians say that Hell is the absence of God, marked not by divine abandonment of human souls but by humans who deliberately refuse the salvific power of the Divine. On the one hand, I feel like this is an uncharitable portrayal of nonbelievers, many of whom are not opposed to God but only intellectually unconvinced of His existence. On the other, Haifa Man Seeks Restraining Order Against God

Yet another study showing permanent increase in Openness (and the ominous-sounding “brain entropy”) after LSD use (h/t Emil Kirkegaard)

What does it look like to walk along the ridge of the Matterhorn? (warning: it looks like something that will trigger people who are scared of heights). A less dizzying perspective. Relevant Reddit commentary.

Brad DeLong vs. John Cochrane on the Ease of Doing Business Index.

You know that chart showing how US GDP keeps going up steadily, but after 1973, wages stop going up along with it? Somebody broke it down and figured out why. Some of it is The 1 Percent, but a lot isn’t.

New York bar told it is discriminatory to deny service to pregnant women.

Percent Neanderthal genes in Europeans has been declining over the past 40,000 years in a way consistent with natural selection acting against them.

Ten percent of federal judgeships are currently vacant – study finds that this leads to a thousand fewer incarcerations each year as prosecutors triage which cases they want to bring to trial. Suggested trollish but technically correct spin: Congressional Republicans have done more for the fight against mass incarceration than almost anyone else.

A counterpoint to a recent post on Chinese happiness: Pew asks a very subtly different question and sees vast improvement in all emerging markets including China.

America has 35% fewer police officers per capita than the world average, even though its prison system is much larger. Alex Tabarrok wonders if this suggests a strategy of shifting criminal justice resources from prisons to police, in the hopes that criminals use a rational P(caught)*punishment strategy to determine whether or not to commit a crime and so if we increase catch rate we can shorten sentences.
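The P(caught)*punishment logic is simple enough to write down. This is a minimal sketch assuming criminals respond only to the expected punishment; the catch rates and sentence lengths are made-up illustrative numbers, not figures from Tabarrok, and the function names are hypothetical.

```python
def expected_punishment(p_caught: float, sentence_years: float) -> float:
    """Expected years served per crime under the simple rational-criminal model."""
    return p_caught * sentence_years


def sentence_for_same_deterrence(old_p: float, old_sentence: float, new_p: float) -> float:
    """Sentence length that keeps expected punishment unchanged at a new catch rate."""
    if new_p <= 0:
        raise ValueError("new catch rate must be positive")
    return expected_punishment(old_p, old_sentence) / new_p


# Hypothetical example: doubling the catch rate from 20% to 40%.
old_p, old_sentence, new_p = 0.20, 10.0, 0.40
new_sentence = sentence_for_same_deterrence(old_p, old_sentence, new_p)
print(f"Sentence needed at the higher catch rate: {new_sentence:.1f} years")
```

Under these assumptions, doubling the catch rate lets the sentence be halved with the same deterrent effect. Whether real offenders weigh certainty and severity this evenly is exactly the open question.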

Artir with a very long and data-intensive argument that there is no technological stagnation. Strongest possible rebuttal I can imagine after this data overflow (unless you can prove the post is cherry-picking indicators, which it doesn’t look like) is that for some reason stagnation is uniquely limited to things that can’t be graphed – progress in how much energy can be stored in a single battery is going as fast as ever, but there are fewer completely new ideas like airplanes. But that might be too close to a god of the gaps argument – people can graph a lot of things.

An argument against denser zoning in San Francisco good enough to get featured on Marginal Revolution???

Why are there billboards across Utah advertising the 9th President of the US, William Henry Harrison?

Is there an evolutionary reason why humans continue to live after they stop being able to reproduce? We still don’t know, but of note, A Simple Offspring-To-Mother Size Ratio Predicts Post-Reproductive Lifespan, suggesting that long life might be a spandrel of the health needed to survive the stress of childbirth.

Is Social Darwinism A Myth? (1, 2). Despite the ubiquitous demands not to be like those nasty social Darwinists who must have dominated 19th century thought or something, there’s very little evidence that people of that era used the term ‘social Darwinism’ or used Darwinian theory to justify their social policies. The whole thing may have been mostly invented by one guy in the 1940s as an attempt to tarnish economists he didn’t like.

Venezuela has come up with a sure-fire solution to its hyperinflation problems which is 100% in keeping with socialist principles.

Can anybody explain whether this image (apparently derived from here?) contradicts or even reverses the narrative that Democrats have stayed pretty normal but Republicans have become much more extreme?

Did iTunes delete all the music on this guy’s hard drive? vs. Apple doesn’t delete all the music on your hard drive unless you do something wrong, which given Apple’s confusing policies and dictatorial business model you inevitably will.

The great thing about ketamine is that it relieves depression near-instantly and much more reliably than ordinary antidepressants. The bad thing about it is that it’s ketamine – a potentially dangerous hallucinogenic drug – and similar but safer compounds don’t seem to have the same effects. Now scientists have found (at least in mice) that it is not ketamine itself but a metabolite of ketamine that treats depression, and the metabolite is relatively safe. Also, the metabolite affects AMPA receptors, not NMDA receptors, which means previous research was looking in the wrong place and now we can look in the right one. Exciting progress!

Thing of Things: Contra Piaget, very young infants probably have object permanence.

The first few paragraphs of this article are standard intra-Christian exhortation boilerplate, but if you can make it through them, the rest is a fascinating and terrifying ethnography of a creepy new charismatic movement.


OT49: Open Secret

This is the bi-weekly open thread. Post about anything you want, ask random questions, whatever. Also:

1. Bakkot has made a new script that allows people to filter out SSC comments by specific users they don’t want to read (including Anonymous). You can get it here for Chrome and here for Firefox.

2. I’m still trying to figure out how to relieve pressure on the open threads. I’m moving away from the idea of a forum (which wasn’t too popular) to having more regular open threads on the blog. I just need to figure out how to make it not clutter and detract from regular threads. One possibility would be to have even-numbered open threads have pictures and announcements and so on, and odd-numbered open threads be just the words “this is the open thread” so that it doesn’t take up as much room in feeds and the front page. A more interesting possibility: have the open thread be a hidden post like this. There would be a tab on the top, by the Comments tag and the About tag and all the others, that says Open Thread. It would link to whatever the hidden open thread was. After 1000 comments, some bot would automatically post a new hidden open thread and the location to which the tab directed would change. That way there would always be an open thread with fewer than 1000 comments. Would people use this? Would anybody want to program this for me?

3. Best comments of the week are people trying to explain mutational load to me, including Simon here, Gwern here, Ilai Bar-Natan here with a really interesting point that sexual reproduction is necessary to control mutational load (is this widely agreed and appreciated? should it be?), and Rosalind Arden (author of some of the papers my post cited) here.

4. Note a new advertisement by Numerai, which describes itself as “participatory cybernetic finance” and “an attempt at a hedge fund crowd-sourcing stock market predictions”. It offers prizes for algorithms that can predict a dataset they provide which corresponds to some features of the stock market that they plan to make money off of. I kind of thought the sort of people who have AIs that can predict the stock market would probably be, uh, busy with other things, but apparently this is a well-investigated field with a lot of possible incremental progress.


Myers’ Race Car Versus The General Fitness Factor

[Epistemic status: I am not a geneticist, and even the geneticists I know aren’t sure about a lot of this. Take as speculation only.]


PZ Myers argues against Stephen Hsu’s genetic engineering proposal here – a disappointing attitude toward mad science for a guy whose blog header is a crocodile-duck hybrid. The piece has a lot of errors, the worst of which other people have already discussed – but I want to talk about what I think is its strongest point. Myers writes:

Note [Hsu’s] estimate of the number of genes that contribute to IQ: 10,000. That’s half the human genome! Hmmm. I wonder if any of those genes play a role in other processes in human physiology that might be affected by his plan?

Here’s an analogy for you: let’s say a novice car designer has decided that the one quality of an automobile that is most important is speed, raw speed. He doesn’t know much about cars, so he asks more qualified engineers about what elements of the car contribute to acceleration and velocity, and they start off with the obvious…details of the engine, fuel mixes, etc. Then they’re talking tires. Aerodynamics. Weight. Pretty soon they have to admit that just about everything in the car is going to affect the speed at which it travels.

So our blithe designer decides that making a fast car is simple: we just look at each component of the car one by one, and we pick an available option for it entirely on the basis of which option makes the car go faster. We’ll easily be able to make a car that can rocket along at a thousand miles an hour, he thinks.

But we have to ask whether we would want a car where the seats and steering were optimized for speed, where safety options were discarded, where something like visibility or reliability were jettisoned for the sole virtue of going really fast.

This makes a lot of sense. A car in which every component was optimized for speed would probably be uncomfortable, unsafe, ugly, difficult to maintain, and otherwise not the sort of car you want to drive. A human in which every component was optimized for intelligence might well be unhealthy, ugly, physically weak, antisocial, et cetera.

And we can do more than just hand-wave at an analogy to cars. Natural selection constantly weeds out worse alleles and replaces them with better ones. If an allele increases intelligence enough to improve reproductive fitness, but has no negative effects, it should sweep across all human populations in an amount of time inversely related to its fitness benefit¹. We see genetic evidence of various alleles sweeping various human populations in both the distant and recent past. These do not include the intelligence-boosting alleles Hsu is talking about.
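To get a feel for the timescales, here is a minimal sketch of a textbook deterministic selection model (one allele with a fixed advantage `s`, no drift, no dominance effects). The starting and ending frequencies and the selection coefficients are illustrative assumptions of mine, not numbers from Hsu or Myers.

```python
def generations_to_sweep(s: float, p_start: float = 0.01, p_end: float = 0.99) -> int:
    """Generations for an allele with relative advantage s to rise from
    p_start to p_end under deterministic haploid selection (no drift)."""
    if s <= 0:
        raise ValueError("s must be positive, or the allele never sweeps")
    p = p_start
    generations = 0
    while p < p_end:
        # Standard one-generation update: the allele's odds grow by (1 + s).
        p = p * (1 + s) / (1 + p * s)
        generations += 1
    return generations


for s in (0.001, 0.01, 0.1):
    print(f"s = {s}: about {generations_to_sweep(s)} generations")
```

In this toy model a tenfold larger advantage sweeps roughly ten times faster, so an allele that is still at intermediate frequency after a very long time cannot have had much of a net advantage.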

For example, Hsu cites Rietveld et al’s finding that rs1487441 is linked to cognitive ability (though it only gives you 0.3 extra IQ points, typical of the generally unimpressive contribution of single genes). About 20% of both Europeans and Japanese have the (A:A) variant, which suggests that however many thousands of years it’s been since Europeans and Japanese diverged from each other isn’t long enough for the gene to undergo much selection. That means neither allele can have any overwhelming advantage, which means there must be some reason to have the opposite allele worth as much as 0.3 IQ points. I think this is the rigorous version of what Myers is saying.

Despite the fact that the race car argument makes perfect sense both analogically and rigorously, it seems to be wrong.


What are the sorts of things we might trade off against intelligence? Perhaps fitness, height, attractiveness, health, longevity, social well-adjustedness?

But in fact none of these trade off against intelligence, many are strongly positively associated with it, and in some the link has been proven genetic!

People with high IQ tend to live longer. For example, a person with IQ 115 (85th percentile) is 20% more likely to survive to age 76 than an average person with IQ 100. One can of course posit many possible connections. Maybe high-IQ people are smart enough to eat healthy and exercise. Maybe rich people can afford both good schools and good doctors. Maybe good health behaviors protect the brain as well as the body and increase both IQ and longevity. But further investigation has cast doubt on all of these theories and strongly supports the hypothesis that no, the same genes that give you high intelligence also make you live longer. See for example the International Journal of Epidemiology: The Link Between Intelligence And Lifespan Is Mostly Genetic, which finds genetics explain 95% of the correlation. A few of the genes linking intelligence and longevity may be already known; SSADH seems to be a contributing factor. My favorite study in this area, though, is one that is not yet complete: since all mammals are basically the same [citation needed], some London School of Economics researchers have developed an IQ test for dogs in the hope of checking whether the same correlation applies to them. Since canine intelligence doesn’t affect things like diet, exercise, or tobacco status, a positive correlation in them too would help solidify the finding. We’re still waiting on those results, but even without them the genetic hypothesis is looking pretty strong.

People with high IQ tend to be taller. This is interesting since height is often used as a measure of health and fitness during childhood, and since taller people get a bunch of advantages including being rated as more attractive and earning higher income. Once again we can imagine all sorts of possible confounders; once again studies find that the link is genetic. See for example Common Genetic Variants Explain The Majority Of The Correlation Between Height And Intelligence, The Genetic Correlation Between Height And IQ: Shared Genes Or Assortative Mating, Resolving The Genetic And Environmental Sources Of The Correlation Between Height And Intelligence, On The Height-Intelligence Correlation.

People with high IQ may be more attractive. This is the conclusion of a meta-analysis that finds a positive correlation between intelligence and body symmetry, usually used as a proxy for attractiveness unaffected by things like hairstyle and cosmetics; a second study failed to find this relationship. The jury is out on the positive link, but there certainly isn’t the negative link that Myers’ race car would predict.

People with high IQ commit much less crime – which is going to be our measure for social well-adjustedness here since it’s well studied. Once again it’s easy to think of possible confounders (I’ll add lead levels to the usual lot). Once again the studies show that at least some of the effect is genetic – here’s one on low-IQ/antisocial-behavior correlation in children, and here’s one cleverly linking fathers’ criminal history to sons’ vs. nephews’ IQ and then throwing enough statistics at it to find that the relationship is genetic. Likewise, the relationship between high IQ and low drug abuse seems to be genetic as well.

People with high IQ tend to be more physically fit. This is the conclusion of a study of 1.2 million Swedes. I don’t have any strong evidence that this relationship is genetically mediated (although Gottfredson on the fitness factor may be relevant here), I just want to note that, once again, there is less than zero evidence for Myers’ race car hypothesis.

People with high IQ have lower rates of heart disease, stroke, circulatory diseases, and diabetes. Intelligence may or may not decrease cancer risk, but again contra the race car hypothesis, it certainly does not increase it. Sibling designs suggest that shared family environment during youth is not responsible for the benefits; differential socioeconomic status as an adult may be, but this status is itself likely caused by the intelligence differences.

So despite its apparent plausibility the race car hypothesis crashes and burns.


In one sense, this is bizarre. It’s as if somebody optimized every part of a race car for speed, and found that by coincidence this also made it the safest, most comfortable, and cheapest car on the road.

Yet if we think about it some more maybe it’s not too surprising. Consider Niels Bohr. He was a Nobel Prize winning physicist, professional football player, activist who helped Jews flee the Nazis, loving father of six children, and so healthy he kept doing science well into his seventies. And his talents show every sign of being at least partly genetic – his brother Harald was a leading mathematician, anti-Nazi activist, and Olympic silver medalist; one of Bohr’s children was also a Nobel Prize winning physicist and another was also an Olympic athlete. So it’s obviously possible to design a human with all-around great genes. Why does evolution restrict such designs to the Bohr family?

I can think of a few possibilities, all of which people who know more than I do are welcome to shoot down.

First, some of these all-around-beneficial genes could be good in heterozygosity but bad in homozygosity. We know something similar is true in the case of sickle-cell anaemia, which is mostly good in heterozygosity (protects vs. malaria) and very bad in homozygosity (causes sickle cell). This is exactly the sort of gene that should exist at a constant low frequency in the population, never getting more or less common. If the frequency got too low, then there’s no risk of two carriers mating, so evolution would encourage it as a free disease cure. If it got too high, evolution would discourage it – any carrier would probably marry another carrier and give their kids sickle cell. Suppose there are ten such genes, each of which grants higher intelligence on heterozygosity and has a frequency of 10% in the population. The average person will on average carry one such gene and have a 10% chance of a horrible genetic disease. Maybe Niels Bohr lucked out and carried all ten such genes without going homozygous on any. Maybe some other poor guy who is lost to history got all ten genes homozygous and died at birth of ten horrible genetic diseases at once.
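The ten-gene thought experiment can be checked with a few lines of arithmetic. This is a rough sketch assuming Hardy-Weinberg proportions (random mating), independent genes, and reading “a frequency of 10%” as the allele frequency; all of those are my assumptions, and the variable names are hypothetical.

```python
def genotype_frequencies(q: float) -> tuple[float, float, float]:
    """Hardy-Weinberg genotype frequencies for an allele at frequency q:
    (zero copies, one copy / heterozygous, two copies / homozygous)."""
    return (1 - q) ** 2, 2 * q * (1 - q), q ** 2


N_GENES = 10       # ten such genes, as in the paragraph above
ALLELE_FREQ = 0.10  # "frequency of 10%", read here as allele frequency

_, het, hom = genotype_frequencies(ALLELE_FREQ)

expected_het_genes = N_GENES * het            # genes carried in the "good" single-copy state
p_any_disease = 1 - (1 - hom) ** N_GENES      # homozygous for at least one gene
p_all_het = het ** N_GENES                    # the lucky "Niels Bohr" genotype
p_all_hom = hom ** N_GENES                    # the unlucky guy with all ten diseases

print(f"Heterozygous genes per person (mean): {expected_het_genes:.2f}")
print(f"Chance of at least one genetic disease: {p_any_disease:.3f}")
print(f"Chance of all ten heterozygous: {p_all_het:.2e}")
print(f"Chance of all ten homozygous: {p_all_hom:.2e}")
```

Under this reading the average person carries a bit under two of the genes in single copy and has a little under a 10% chance of at least one disease, close to the round numbers above, while both the all-lucky and all-unlucky genotypes are vanishingly rare.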

This would make Hsu’s gene-editing project very promising; all he would need to do is give everybody one copy of the relevant genes (and then never let them mate). But the hypothesis can’t be quite right: I think it would predict that Niels Bohr’s children would have unusually high rates of genetic diseases. In fact, the children of great men regress to the mean a little bit but show no signs at all of being unusually cursed.

Second, we could be talking not about polymorphisms but about mutational load. That means that there’s some genome that works for humans (plus or minus a few hundred thousand polymorphisms that aren’t too important at this level of analysis) and genetic health is determined by how many detrimental mutations you and your parents randomly accreted. If your mother spent too much time near the local nuclear reactor when she was pregnant, maybe you get a few hundred extra mutations and end up with lower IQ, a worse heart, less attractive features, et cetera. And this is obviously true in the case of a literal nuclear reactor, but I’m having trouble figuring out what plays the reactor role in real life. I know Greg Cochran and others have talked about things like paternal age at conception, climate, et cetera, but he applies these only to differences between populations. I’m not sure whether it would work out to expect a big difference in mutational load between Niels Bohr and his underachieving next-door neighbor. Maybe Bohr came from a long line of people who lucked out and got hit by unusually few cosmic rays? I don’t know if this makes sense or not. Part of my problem might be that I still don’t really understand how mutational load ever decreases – I’ve heard “the most heavily-loaded people are weeded out by natural selection”, but it seems like that should only be able to slow the gradual universal dysgenesis.

This would also bode well for Hsu’s project. In fact, it would make it even easier; it would reduce to the modal genome idea (make a baby whose genome, at each location, has the nucleotide which is most common at that location among all humans worldwide) which could be done without even performing the groundwork to see which genes do or don’t influence intelligence.

Third, maybe all of these other good things are trading off against things that were important in the environment of evolutionary adaptedness but not today. Greg Cochran brings up infectious disease resistance; some commenters bring up calorie requirements. This latter seems especially plausible; the brain uses a lot of energy and energy was a scarce resource through much of evolutionary history. Either of these would explain why evolution kept the seemingly detrimental version around for so long, and why right now in our low-infection high-calorie modern civilization one allele or another seems to be an unalloyed good.

Again, this would bode well for Hsu’s project, although the supergeniuses so produced would probably want to stay well away from any malarial swamps.

Or maybe it is some mixture of all four possibilities – Myers’ trade-offs, heterozygosity, mutational load, and disease burden. The latter three could provide a sufficiently positive effect for intelligence to hide the negative effect of the first. I would be really surprised if something like this wasn’t true – the theoretical argument for the first seems compelling, and it ought to happen at least a little even if it isn’t the main source driving intelligence differences.

I was going to write that in this case we’d have to sort through every intelligence gene one-by-one to make sure we weren’t getting one of the trade-off ones, but maybe this isn’t true – a genius designed by Hsu’s method should have on average the same number of trade-offs versus unalloyed-goods as a genius born normally (right? or am I missing something?). Since most of us would prefer a natural-born baby with IQ 150 to a natural-born baby with IQ 100, it seems whatever trade-offs are necessary to reach that point are widely considered worth it. So unless there’s a difference I’m missing between normal recombination and Hsu’s method, we should be okay with designing an IQ 150 baby as well (from a purely health-related perspective, at least).

Whether this generalizes to creating an IQ 200 or 300 baby depends not just on ethics, but on whether for some reason the costs and trade-offs of intelligence compound more than linearly. It’s possible, for example, that there are ten different genes coding for something that protects heart health, all of which can be traded off against intelligence. If you switch one gene from heart health to intelligence, whatever, you still have nine genes protecting your heart. But switching all of them at once would be a bad idea. I don’t know any reason to think this is true, but it’s a possibility that might give us pause between the IQ 150 and the IQ 300 level.
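To make the "more than linearly" worry concrete, here is a toy sketch of the ten-heart-genes example. Every number in it is invented for illustration: the 50% chance that any single protective gene fails, the 5 IQ points per switched gene, and the assumption that the heart is harmed only when all remaining protective genes fail at once. None of this comes from real genetics; it only shows what a redundant system trading off against a linear benefit might look like.

```python
# Toy model only: every constant below is a made-up assumption,
# not a biological estimate.
TOTAL_GENES = 10     # hypothetical redundant heart-protecting genes
FAILURE_PROB = 0.5   # assumed chance that any one remaining gene fails to protect
IQ_PER_GENE = 5      # assumed IQ points gained per gene switched to intelligence


def heart_risk(switched: int) -> float:
    """Chance that every remaining protective gene fails at the same time."""
    remaining = TOTAL_GENES - switched
    return FAILURE_PROB ** remaining


def iq_gain(switched: int) -> int:
    """IQ benefit, assumed to grow linearly with the number of switched genes."""
    return IQ_PER_GENE * switched


if __name__ == "__main__":
    for switched in range(TOTAL_GENES + 1):
        print(
            f"{switched:2d} switched: +{iq_gain(switched):2d} IQ, "
            f"heart risk {heart_risk(switched):.4f}"
        )
```

Under these assumptions the first switch barely moves the risk, while the last one doubles it from one-half to certainty, even though each switch buys the same number of IQ points. Whether real trade-offs look anything like this is exactly the open question.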

Overall though, I think the race car idea, despite its plausibility, is likely to be less of an impediment to genetic engineering than it might seem.

Only One Footnote, But It’s Really Long

1. This is an assumption I’m granting for the sake of argument, but possibly not true at the margin.

Consider that evolution doesn’t care about intelligence nearly as much as we do. The most recent common ancestor of Europeans and Japanese wasn’t going to use her intelligence to design a mammoth-seeking rocket. In fact, it’s not totally clear why humans did evolve intelligence before the modern age; sure, tools are nice, but early hominids stuck to the same tools for a million years at a stretch; that doesn’t exactly give a tight feedback loop to work with. The most convincing argument I’ve heard is the Machiavellian intelligence hypothesis, which says that our ancestors used intelligence to navigate tribal politics and gain status within a social group.

But this theory would naively predict that the smartest person in high school would be the most popular, which is not usually how it works out. If intelligence is for gaining status, it seems to have diminishing returns beyond a certain point, which would explain why evolution didn’t generally make us more intelligent even though greater-than-average intelligence is clearly possible (eg geniuses).

In the rare cases where evolution did have an incentive to evolve higher intelligence, it did so quickly and effectively. Several highly mercantile societies independently evolved the same set of genes producing higher IQ. The most notable were the Ashkenazi Jews, who have an average IQ 12-15 points higher than their European neighbors and whose genes show strong signatures of recent selection for intelligence; this most likely occurred during the Middle Ages when they were the mercantile class of Europe, since non-Ashkenazi Jews show no such effect. The genes involved tend to produce sphingolipidoses when homozygous, which shows a pretty good reason why evolution didn’t do this kind of thing more often. Myers has previously dismissed this research, but I think wrongly – the paper itself considers and rejects all the criticisms he raises (see pages 15 – 31). The very short summary is that Myers dismisses the genetic pattern as “variations amplified by chance,” but the expected level of chance variations can be calculated and this isn’t it. Ashkenazim have similar heterozygosity to other Europeans in neutral markers – ruling out a simple bottleneck – and the mutations involved are too potentially deleterious in homozygosity to persist for many generations by chance alone. Further, the mutations are all clustered in a few key pathways, many of which are clearly linked to intelligence. For example, Ashkenazim are at high risk for torsion dystonia, which is associated with higher IQ in sufferers.

So in response to the argument that evolution must trade off against something else, I would argue that evolution doesn’t share our exchange rate. Suppose that we could gain 20 IQ points at the cost of having larger heads that are harder to fit through a birth canal (remember, some of the known genes for intelligence are associated with head size, and the obstetrical dilemma used to be a big deal!). For hunter-gatherers, who had little use for IQ but lots of use for getting through birth canals, this was a bad deal and evolution didn’t take it. For moderns, who can use IQ points to cure cancer and explore space, and who have modern obstetric techniques, it’s a lot more attractive. So yes, let’s be cautious, but I think we’d all feel pretty stupid if we avoided bootstrapping our way to superintelligence out of fears of “things man was not meant to meddle with”, only to learn later that the whole problem could have been solved with c-sections.
