Growth Mindset 3: A Pox On Growth Your Houses

[EDIT: The author of this paper has responded; I list his response here.]

Jacques Derrida proposed a form of philosophical literary criticism called deconstruction. I’ll be the first to admit I don’t really understand it, but it seems to have something to do with assuming all texts secretly contradict their stated premise and apparent narrative, then hunting down and exposing the plastered-over areas where the author tries to hide this.

I have no idea whether this works for literature or not, but it’s a useful way to read scientific papers.

Consider a popular field – or, at least, a field where a certain position is popular. For example, we’ve been talking a lot about growth mindset recently. There seem to be a lot of researchers working to prove growth mindset and not a lot working to disprove it. Journals are pretty interested in studies showing growth mindset interventions work, and maybe not so interested in studies showing they don’t. I’ll admit that my strong suspicions of publication bias don’t seem to be borne out by the facts here – see this meta-analysis – but I bet its more sinister cousin “all experimenters believe the same thing and have the same experimenter effects” bias is alive and well.

In a field like that, you’re not going to get the contrarian studies you want, but one way to find the other side of the issue is to look a little more closely at the studies that do get published, the ones that say they’re in support of the thesis, and see if you can find anything incriminating.

Here’s a perfect example: Mindset Interventions Are A Scalable Treatment For Academic Underachievement, by a team of six researchers including Carol Dweck.

The abstract reads:

The efficacy of academic-mind-set interventions has been demonstrated by small-scale, proof-of-concept interventions, generally delivered in person in one school at a time. Whether this approach could be a practical way to raise school achievement on a large scale remains unknown. We therefore delivered brief growth-mind-set and sense-of-purpose interventions through online modules to 1,594 students in 13 geographically diverse high schools. Both interventions were intended to help students persist when they experienced academic difficulty; thus, both were predicted to be most beneficial for poorly performing students. This was the case. Among students at risk of dropping out of high school (one third of the sample), each intervention raised students’ semester grade point averages in core academic courses and increased the rate at which students performed satisfactorily in core courses by 6.4 percentage points. We discuss implications for the pipeline from theory to practice and for education reform.

This sounds really, really impressive! It’s hard to imagine any stronger evidence in growth mindset’s favor.

And then you make the mistake of reading the actual paper.

The paper asked 1,594 students from a bunch of different high schools to take a 45-minute online course.

A quarter of the students took a placebo course that just presented some science about how different parts of the brain do different stuff.

Another quarter took a course that was supposed to teach growth mindset.

Still another quarter took a course about “sense of purpose” which talked about how schoolwork was meaningful and would help them accomplish lots of goals and they should be happy to do it. This was also classified as a “mindset intervention”, though it seems pretty different.

And the final quarter took both the growth mindset course and the “sense of purpose” course.
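For concreteness, the four groups above form a 2×2 design: each student either gets or doesn’t get each of the two courses. Here is a minimal Python sketch of that kind of assignment. The paper’s actual randomization procedure isn’t described in this post, so the shuffle-and-deal scheme, the `assign_conditions` and `arm_sizes` names, and the seed are all assumptions made for illustration.

```python
import random
from collections import Counter
from itertools import product

# Two yes/no factors give the four arms: placebo, growth mindset only,
# sense of purpose only, and both.
CONDITIONS = [
    {"growth_mindset": gm, "sense_of_purpose": sp}
    for gm, sp in product([False, True], repeat=2)
]


def assign_conditions(student_ids, seed=0):
    """Shuffle students, then deal them round-robin into the four arms.

    Hypothetical scheme: it only guarantees roughly equal quarters.
    """
    rng = random.Random(seed)
    ids = list(student_ids)
    rng.shuffle(ids)
    return {sid: CONDITIONS[i % len(CONDITIONS)] for i, sid in enumerate(ids)}


def arm_sizes(assignment):
    """Count students per (growth_mindset, sense_of_purpose) arm."""
    return Counter(
        (c["growth_mindset"], c["sense_of_purpose"]) for c in assignment.values()
    )


assignment = assign_conditions(range(1594))
print(arm_sizes(assignment))
```

The point of crossing the two factors is that, in principle, each course can be evaluated on its own as well as in combination.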

Then they let all students continue taking their classes for the rest of the semester and saw what happened, which was this:

[EDIT: I totally bungled these graphs! See discussion of exactly how on the author’s reply above, without which the information below will be misleading at best]

Among ordinary students, the effect on the growth mindset group was completely indistinguishable from zero, and in fact they did nonsignificantly worse than the control group. This was the most basic test they performed, and it should have been the headline of the study. The study should have been titled “Growth Mindset Intervention Totally Fails To Affect GPA In Any Way”.

Instead they went to subgroup analysis. Subgroup analysis can be useful to find more specific patterns in the data, but if it’s done post hoc it can lead to what I previously called the Elderly Hispanic Woman Effect, after medical papers that can’t find their drug has any effect on people at large, so they keep checking different subgroups – young white men…nothing. Old black men…nothing. Middle-aged Asian transgender people…nothing. Newborn Australian aboriginal butch lesbians…nothing. Elderly Hispanic women…p = 0.049…aha! And the study gets billed as “Scientists Find Exciting New Drug That Treats Diabetes In Elderly Hispanic Women.”
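To put rough numbers on why post hoc subgroup hunting is dangerous, here is a small Python sketch. It assumes the subgroup tests are independent and that the treatment truly does nothing, which is a simplification; the function names are mine, not anything from the paper.

```python
def chance_of_false_positive(num_subgroups: int, alpha: float = 0.05) -> float:
    """Probability that at least one of `num_subgroups` independent tests
    crosses p < alpha when the treatment truly does nothing."""
    return 1.0 - (1.0 - alpha) ** num_subgroups


def bonferroni_threshold(num_subgroups: int, alpha: float = 0.05) -> float:
    """Per-test cutoff that keeps the overall false positive chance at or below alpha."""
    return alpha / num_subgroups


for k in (1, 5, 10, 20):
    print(
        f"{k:>2} subgroups: {chance_of_false_positive(k):.0%} chance of a spurious hit, "
        f"Bonferroni cutoff p < {bonferroni_threshold(k):.4f}"
    )
```

Under these assumptions, checking five subgroups gives about a 23% chance of at least one “significant” result from pure noise, and twenty subgroups gives about 64%. A subgroup chosen in advance for principled reasons, tested once, does not pay this penalty.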

As per the abstract, the researchers decided to focus on an “at risk” subgroup because they had principled reasons to believe mindset interventions would work better on them. In their subgroup of 519 students who had a GPA of 2.0 or less last semester, or who failed one or more academic courses last semester:

Growth mindset still doesn’t differ from zero. And growth mindset does nonsignificantly worse than their “sense of purpose” intervention where they tell children to love school. In fact, the students who take both “sense of purpose” and growth mindset actually do (nonsignificantly) worse than sense-of-purpose alone!

But the control group mysteriously started doing much worse in all their classes right after the study started, so growth mindset is significantly better than the control group. Hooray!

Why would the control group’s GPA suddenly decline? The simplest answer would be that by coincidence the class got harder right after the study started, and only the intervention kids were resilient enough to deal with it – but that can’t be right, because this was done at thirteen different schools, and they wouldn’t have all had their coursework get harder at the same time.

Another possibility is that sufficiently low-functioning kids are always declining – that is, as time goes on they get more and more behind in their coursework, so their grades at time t+1 are always less than at time t, and maybe growth mindset has arrested this decline. This is plausible and I’d be interested in seeing if other studies have found this.

Perhaps aware that this is not very convincing, the authors go on to do another analysis, this one of percent of students passing their classes.

This is the same group of at-risk students as the last one. It’s graphing what percent of these students pass versus fail their courses. The graph on the left shows that a significantly higher percentage of students in the intervention conditions pass their courses than in the control condition.

This is better, but one part still concerns me.

Did you catch that phrase “intervention conditions”? The authors of the study write: “Because our primary research question concerned the efficacy of academic mindset interventions in general when delivered via online modules, we then collapsed the intervention conditions into a single intervention dummy code (0 = control, 1 = intervention).”

We don’t know whether growth mindset did anything for even these students in this little subgroup, because it was collapsed together with the (more effective) “sense of purpose” intervention before any of these tests were done. I don’t know if this is just for convenience, or if it is to obfuscate that it didn’t work on its own.
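Here is a Python sketch of how collapsing arms into a single dummy code can hide what each arm did. The counts are HYPOTHETICAL, invented purely for illustration; they are not the paper’s data, and the helper names are my own.

```python
from typing import Dict, Tuple

Counts = Dict[str, Tuple[int, int]]  # condition -> (students passing, students total)

# HYPOTHETICAL numbers, invented for illustration. They are NOT the paper's data.
EXAMPLE_COUNTS: Counts = {
    "control": (40, 100),
    "growth_mindset": (42, 100),
    "sense_of_purpose": (52, 100),
    "both": (48, 100),
}


def pass_rate(passed: int, total: int) -> float:
    return passed / total


def collapse_to_dummy(counts: Counts, control: str = "control") -> Dict[int, Tuple[int, int]]:
    """Pool every non-control arm into one group: 0 = control, 1 = intervention."""
    treated = [v for name, v in counts.items() if name != control]
    pooled = (sum(p for p, _ in treated), sum(t for _, t in treated))
    return {0: counts[control], 1: pooled}


collapsed = collapse_to_dummy(EXAMPLE_COUNTS)
baseline = pass_rate(*collapsed[0])
print(f"pooled intervention vs control: {pass_rate(*collapsed[1]) - baseline:+.3f}")
for arm in ("growth_mindset", "sense_of_purpose", "both"):
    print(f"{arm} vs control: {pass_rate(*EXAMPLE_COUNTS[arm]) - baseline:+.3f}")
```

With these made-up counts the pooled “intervention” group beats control by about seven percentage points, even though the growth mindset arm alone is only two points ahead. The pooled number by itself cannot tell you which arm is doing the work.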

[EDIT: Scott McGreal looks further and finds in the supplementary material that growth mindset alone did NOT significantly improve pass rates!]

The abstract of this study tells you none of this. It just says: “Mindset Interventions Are A Scalable Treatment For Academic Underachievement…Among students at risk of dropping out of high school (one third of the sample), each intervention raised students’ semester grade point averages in core academic courses and increased the rate at which students performed satisfactorily in core courses by 6.4 percentage points.” From the abstract, this study is a triumph.

But my own summary of these results, as relevant to growth mindset, is as follows:

For students with above a 2.0 GPA, a growth mindset intervention did nothing.

For students with below a 2.0 GPA, the growth mindset interventions may not have improved GPA, but may have prevented GPA from falling, which for some reason it was otherwise going to do.

Even in those students, it didn’t do any better than a “sense-of-purpose” intervention where children were told platitudes about how doing well in school will “make their families proud” and “make a positive impact”.

In no group of students did it significantly increase chance of passing any classes.

Haishan writes:

“If ye read only the headlines, what reward have ye? Do not even the policymakers the same? And if ye take the abstract at its face, what do ye more than others? Do not even the science journalists so?”

Titles, abstracts, and media presentations are where authors can decide how to report a bunch of different, often contradictory results in a way that makes it look like they have completely proven their point. A careful look at the study may find that their emphasis is misplaced, and give you more than enough ammunition against a theory even where the stated results are glowingly positive.

The only reason we were told these results is that they were in the same place as a “sense of purpose mindset” intervention that looked a little better, so it was possible to publish the study and claim it as a victory for mindsets in general. How many studies that show similar results for growth mindset lack a similar way of spinning the data, and so never get seen at all?


254 Responses to Growth Mindset 3: A Pox On Growth Your Houses

  1. Ryu Ken says:

    This study is shockingly bad. How did this even get published?

    Do you think getting abstracts to be written independently of the paper itself as part of the peer review process would improve science communication?

    Btw first time commenting, and I’ve gotta say I love your blog. If I ever get to meet you personally I’m gonna give you a big hug lol.

    • syllogism says:

      You know, in my field (computational linguistics) as a reviewer, you typically *do* write an abstract of the paper!

      The first line or two of a good review summarises your understanding of it, and instructs the area chair/editor whether you buy the claims etc. So it probably wouldn’t be so much work to make this official.

    • suntzuanime says:

      I imagine it would improve the quality of reviews, by forcing the reviewers to actually read the paper. But maybe I’m just bitter. (And maybe actually reading the paper would prove too burdensome and this would bring the peer-review edifice crashing down on our heads.)

      • Ilya Shpitser says:

        You can’t blame reviewers, though, you have to improve the incentive structure for peer review. People always follow incentives, you can’t yell at them for that. People doing that is the problem statement.

        • Carl Shulman says:

          The disposition of bystanders to yell is one source of incentives, sometimes important and sometimes not.

          • vV_Vv says:

            But in this case it is a “tragedy of the commons” scenario:

            Continuous public bitching about the shortcomings of peer reviews erodes the status of academic scientists as a group. Each academic scientist probably doesn’t like that, but individually they can do nothing about it, after all reviewers remain anonymous and can’t gain personal status by doing a good job. Therefore, the individual incentive to slack remains unbalanced.

          • Steve Sailer says:

            Baseball statistics analysis has improved radically over the last 30 years, while most social science has not. Why?

            Baseball statistics was pretty much of a free for all among ambitious amateurs criticizing the old guard and each other.

            Plus, baseball was an ideologically Safe Space where bright white males would have a hard time getting in trouble over gender, sexual orientation, or even race if they practiced a little crimestop. In contrast, social sciences are sedated by the need for “protective stupidity” to avoid mentioning, except in approved fashions, the big, recurrent factors that drive so many results: race, sex, etc.

            Of course, the sabermetricians still did a terrible job on steroids, sticking their heads deep in the sand.

        • Troy says:

          You can’t blame reviewers, though, you have to improve the incentive structure for peer review. People always follow incentives, you can’t yell at them for that. People doing that is the problem statement.

          I’m all for changing incentives, but I sure can yell at people for following incentives, when doing so is morally wrong. It is inexcusable to recommend a paper for publication that you have not read. (It is much more excusable to recommend a paper for rejection that you have only read part of and have found to be incomprehensible.)

          • Ilya Shpitser says:

            Ok, but that’s not productive. A sizable portion of humanity follows incentives, morality be damned. You may not like it, you may yell at them for it, but that does not seem like a productive response.

            Do you think yelling will enact change? There is no bolshevik’s “new man.” There is just “man.”

          • Troy says:

            I can’t change everyone, but I may be able to change some of those I am in direct contact with in academia, e.g., my students.

      • bring the peer-review edifice crashing down on our heads

        You say this like it’s a bad thing.

      • Ryu Ken says:

        You know, it seems like many things in the world are some sort of priesthood or another.

        What I mean is, we attempt to give added credibility to a field by grouping together people who are *supposed to*, through moral force or something, review each other and police each other’s actions, but this typically NEVER happens.

        So the people in these fields get a ‘status rent’ through their inclusion in the priesthood, but this is typically undeserved. Let’s call this a ‘status priesthood’. I think humanities academia, congress, peer reviewers, medicine, law etc. are all places where this occurs. ‘Religious thinking’, in both the reactionary sense and in the Sailerian ‘megaphone’ sense, probably runs amok in these places due to such social mechanisms.

        But this immediately raises the question: what happens in a society with no ‘status priests’ at all? Something like Twitter or tumblr lynchings, I suspect. The worst aspects of egalitarianism or devolving to a common denominator will occur, and everything will degenerate into popularity contests. Which then raises the question of how a true meritocracy can ever transpire in real life. A ‘Raven’s matrices meritocracy’ probably lol.

    • youzicha says:

      In general I’m all in favor of having abstracts written by reviewers instead of authors. But if the situation is like Scott proposed, that all researchers in this field believe growth mindset interventions work and are excited about new research proving how well it works, then probably it would not help that much?

      Like, the paper presumably has an intro section putting the same spin on the result as the abstract does, and presumably the reason the reviewers accepted it is that they bought that spin. Then I would expect that they would just repeat that point of view in their own reviews. (A typical accepting review starts something like “I recommend this paper for acceptance because it makes the following important contributions: …”.) If they were as jaded as Scott, they’d just reject it outright…

  2. E. Harding says:

    Wow. This is the first time I’ve ever seen anyone make something useful out of Deconstruction.

  3. Anonymous` says:

    weirder and weirder subgroups – young white men…nothing. Old black men…nothing. Middle-aged Asian transgender people…nothing. Newborn Australian aboriginal butch lesbians…nothing. Elderly Hispanic women…p = 0.05…aha!

    Heh heh.

    I hate that I have this sense. The quote was perfectly innocent.

    • veronica d says:

      It would have been better to say “groups that become decreasingly numerous” — but worded better. It’s actually hard to find a good way to word that. But yeah, “weird” is a less-than-perfect word for this.

  4. social justice warlock says:

    I skimmed this post.

    • Wrong Species says:

      Based off the first paragraph, I believe he’s saying that deconstructing growth mindset causes chicken pox or something.

  5. John says:

    On the topic of deconstruction, I recommend How to Deconstruct Almost Anything: My Postmodern Adventure.

    • Shieldfoss says:

      On the topic of deconstruction, I recommend How to Deconstruct Almost Anything: My Postmodern Adventure.

      The thing about Deconstruction is that it works, I use it often and it gives me accurate results that allow me to make successful future predictions. I hesitate to claim I know the true and final reason why people disparage it, but my guesses are:

      1) It is commonly used by the Greens, forcing the Blues to deny its efficacy (See: All other subjects)

      2) For the untrained, it is hard to distinguish good and bad deconstruction, enhanced by the fact that there is so much bad deconstruction.

      1+2=3) A common attack against good deconstruction is that it is nonsense. A common counter is that the speaker must be unsophisticated if he thinks it is nonsense. Since deconstruction is a skill that must be learned, this makes it hard for outsiders to see whether any given exchange of nonsense/unsophistication is based in fact or partisan bias. Since arguments are soldiers, even people with the skill will tend to judge along partisan lines.

      • zslastman says:

        I’m sure a lot of people around here would be interested in an elaboration of that. The community kind of dismisses that stuff, for lack of the time to properly investigate it. What does it allow you to predict? Can you demonstrate?

        • Shieldfoss says:

          What does it allow you to predict? Can you demonstrate?

          I am under NDAs on basically every project I work on so not with a real-world example, but much like Scott, I can get away with a composite story made up from individual pieces.

          So:

          Say you’re on a project for a company. The company creates a position for, and hires, a Chief Diversity and Sustainability Officer. Shortly after, it is announced that there will be a course on Business Ethics and a subsequent test. The test is taken online. The course is not mandatory, but the test is.

          So here’s a real question: Do you need to take the course (a significant investment of time during crunch on a big project) to avoid failing the test? If you are skilled at deconstruction, you take the Text (Here: the announcements of the new CDSO, course and test) and find out the real intentions behind the authors – are we going to commit real resources to this or is it just a low-effort signalling game? Will the course have actual new practices we put into effect or just a lot of nice-sounding yes words?

          The cynical answer is “It is always a signalling game.” The idealist answer is that of course they care about X.

          I have worked in places where the cynical assumption would get you (a) dead wrong and (b) no re-hire for future projects and I have also worked places where the idealist answer would be a waste of everybody’s time.

          Really, deconstruction is just an academic formalized version of HPMOR’s Ch. 20: “The import of an act lies not in what that act resembles on the surface, Mr. Potter, but in the states of mind which make that act more or less probable” – deconstruction is the rationalist skill of figuring out the possible mind-states of the Author of the Text instead of just taking the Text at face value, which will often be wrong, or just assuming the Text is a lie, which will also often be wrong.

          Incidentally, the deconstruction of “How to Deconstruct Almost Anything: My Postmodern Adventure” is fairly simple: From the fact that the author spends so much time talking about how silly Deconstructionism is, you can easily infer that the author cares deeply about it – possibly because it has been annoying him for quite some time.

          • Shieldfoss says:

            deconstruction is the rationalist skill of figuring out the possible mind-states of the Author of the Text instead of just taking the Text at face value, which will often be wrong, or just assuming the Text is a lie, which will also often be wrong.

            Expanding on this: A signalling Text is different from an idealist text written in a signalling company is different from an idealist text written in an idealist company. With experience, you learn to tell the difference.

            Not to say Heidegger and Derrida were not, in other ways, absolutely full of it – I do not recommend wasting any time reading them, you can get their actual insights much better in other ways.

            (Bonus homework for the class: Deconstruct this post. There are three paragraphs here. There is at least one and at most two paragraphs for purposes that are half signalling and half an attempt to convey information without stating it outright. What information might that be?)

          • anon says:

            Sounds like if it’s improperly used it can quickly become Bulverism.

          • TomA says:

            You’re being a little obtuse here. Deconstruction methods (like Game Theory methods) are just tools that can improve decision-making if implemented properly. The formalisms of procedure, structure, organization, and mathematics are simply techniques that can be learned and applied. And there is no guarantee of success, just an improved probability.

          • Magicman says:

            This is not even wrong. Deconstruction is not the revealing of some hidden message in the text whether construed as the intention of the author, the social subtext, the historical context etc. In fact this is one of the forms of textual interpretation that Derrida explicitly argued against.
            Perhaps you were being ironic-sarcastic?

          • Mary says:

            How does “deconstruction” in the sense you give here differ from good ol’ “reading between the lines”?

          • 27chaos says:

            I agree with MagicMan. Deconstructionists don’t claim that everyone is signalling, they claim that truth-claims have certain inconsistencies or limitations, and that looking at these is often useful.

            Some deconstructionists might say that inconsistencies like this exist because truth doesn’t exist. But another way of looking at deconstruction would be to say that humans aren’t perfectly objective, so our vocabulary for describing the world is fundamentally broken – the map is not the territory, but also the map is more like an ugly patchwork quilt than a solid sheet of paper. If all the breaks in a sentence or essay line up in a certain way, then that suggests something about what the mistakes of certain ideas are.

            One problem in deconstruction is that what one deconstructionist reads as an assumption of the work or considers to be “inconsistent” can be pretty different than what a reasonable person might. The worst argument in the world runs rampant in the field, for example. Also, like a lot of criticism, it’s in practice more about destroying bad ideas than introducing new good ideas as alternatives – it’s not good at building useful things, and often it’s too quick to sneer at ideas or reject them on a flimsy basis. But deconstructionists are often good at poking holes in mainstream thought-cached ideas despite the method’s weaknesses.

            It’s really more an art than a science. I use deconstruction, personally, in a “Random Noise is Our Most Valuable Resource” kind of way. I don’t take it seriously, but it’s sometimes nice for brainstorming.

          • Desertopa says:

            My understanding is that within the field of literary criticism (where the technique of deconstruction originated,) the intent of the author is *not* considered relevant to the interpretation of the text. Motive analysis is a useful skill to have, but the academic practice is very distinct from motive analysis.

      • Cauê says:

        The thing about Deconstruction is that it works, I use it often and it gives me accurate results that allow me to make successful future predictions.

        (…) 2) For the untrained, it is hard to distinguish good and bad deconstruction, enhanced by the fact that there is so much bad deconstruction.

        Is there a test that a particular example of deconstruction can pass or fail? How do you know it works?

        Do the “trained” agree among themselves on which ones are good and which are bad? How does one recognize who is sufficiently trained?

      • HeelBearCub says:

        Serious question, isn’t deconstruction basically “Debate 101”? You find the weak points in the opponent’s arguments and destroy them.

        If not, how is it different? How (from the outside) can we tell deconstruction from a mere desire to defeat the author’s/speaker’s argument?

    • FullMetaRationalist says:

      I’ve read the link before. I’ve also read and reread the wikipedia article. I still have no idea what Deconstruction entails. If I steelman this, it sort of resembles the rationalist game “Consider the Opposite”. Maybe.

  6. Carl Shulman says:

    You could just email Dweck and ask about the chronology of the study design, i.e. when they decided to collapse all the interventions together and break out subgroups, etc. And her general practices regarding creation/change of analysis plan after seeing the data.

    Gelman’s “Garden of Forking Paths” is relevant here:

    http://www.stat.columbia.edu/~gelman/research/unpublished/p_hacking.pdf

  7. The frustrating thing is, I don’t know how to put a stop to weaselly studies like this. I’m a grad student and I can see myself doing this kind of thing all the time when I write papers. I try to at least be aware of it so that I’m not fooling myself (there’s a mental feeling that goes along with spin, like trying to stretch something into a state it shouldn’t be in, and I’ve learned to recognize and hate it). But it’s hard to avoid altogether – I’m under pressure from my supervisor to produce high-impact results and present them in the best light possible. And sure, I could try to make a stand and stop spinning my results, but what would that do? Probably my papers would just be published in slightly lower-quality journals (Ha! The better I try to make my papers the worse places they end up), or in some cases the paper wouldn’t get published at all, and my supervisor would be annoyed with me, and I might be delayed in finishing my PhD.

    Something something incentives something something Moloch.

    • Troy says:

      (Option 1) If you are under pressure to publish in an ideologically unbalanced field (like social psychology), there is a lot of low-hanging fruit in terms of results that the ideologues of the field don’t want to find (see, e.g., Lee Jussim’s work picking some of this fruit in social psychology). You can pick some of this fruit. Downside: you publish ideologically unpopular results.

      (Option 2) Once you get your Ph.D., get a job that focuses more on teaching than research. Obviously there are still bad incentives in teaching, but if you’re a good teacher you have a lot of freedom here.

      (Option 3) Contribute to what in most fields is a growing literature on methodology. If you do it well, you can get widely cited and recognized. See, for example, this very clever experiment in psychology. You might think this has the same downsides as option 1, but in fact methodological naysayers are often popular, largely because everyone agrees there’s a problem but is convinced it’s not their problem.

      • James D. Miller says:

        Actually the downside of (1) is that you don’t get your stuff accepted for publication.

        • Troy says:

          I think it’s more accurate to say that it’s harder to get your stuff accepted for publication. (There are, after all, people like Jussim who get published.) On the other hand, it’s too easy to get published right now — witness the topic of this post. So perhaps the extra scrutiny is not entirely a bad thing.

    • HeelBearCub says:

      What kind of life do you want?

      Do you want an academic research position so you can have an academic research position? Or do you want an academic research position so that you can do good research?

      That may make it sound like an “easy” decision, or that I am priming you for one answer over the other. I assure you I am not, but rather echoing back to you the essence of what you said. There are legitimate arguments either way.

      If you don’t keep an eye on your ultimate goals, then you end up somewhere you did not want to be.

      • Well actually, I don’t want an academic research position at all. But the same question applies to my PhD, so it’s a fair point. And at this point I would say that, yeah, I am pretty much getting my PhD just to get my PhD. I think I mostly just drifted into going to grad school by default, because it seemed like the thing to do at the time (to use the LW-favoured terms, which I think are unfairly maligned, I was acting more NPC-ish than PC-ish). Now that I’m here, though…well, I get paid a livable wage, I’m not overworked at all, I have a very flexible schedule. So it’s not a bad gig, really. Of course, I’m quite cynical about the value of my research (not that it’s worthless, per se – more overhyped). But if you want to get published you have to play the spin game, and so I play the spin game.

        • HeelBearCub says:

          So, get your PhD, but don’t sell your soul to do so.

          But of course, there is also the “what then” question…

  8. Steve Sailer says:

    A lot of academics want a piece of what you might call the Motivational Writing and Speaking industry. Malcolm Gladwell, who repackages social science studies for employees, largely, in sales and marketing, really opened a lot of eyes among academics to how much money you can make speaking at corporate events. Gladwell’s fee for addressing, say, a corporate sales conference is routinely in the $50,000 and up range.

    A large part of the appeal of Gladwell’s speeches is his assertion that what he’s telling you is Science with a capital S. So, if you are a social scientist, why not try to cut out the Gladwell-like middlemen and get into the greater Motivational industry yourself?

    For example, here’s Dweck’s page on the website of her agent, the All American Speakers Bureau:

    Carol Dweck
    Leading Researcher in the Field of Motivation; Professor of Psychology at Stanford University; Author of “Mindset”
    Categories: Authors, Mental Health, Motivation, Empowerment,

    Booking Fee Range: $10,001 – $20,000

    Speaker Travels From: California – CA

    Carol S. Dweck, Ph.D., is one of the world’s leading researchers in the field of motivation and is the Lewis and Virginia Eaton Professor of Psychology at Stanford. Her research focuses on why people succeed and how to foster their success. More specifically, her work has highlighted the critical role of mindsets in business, sports, and education, and for self-regulation and persistence on difficult tasks in general. In addition, she has shown how praise for ability or talent can undermine motivation and learning.

    http://www.allamericanspeakers.com/speakers/Carol-Dweck/9233

    • Steve Sailer says:

      Dweck’s 10k to 20k speaker’s fee range is pretty low, by the way. I only recognize the names of about 1/4th of the celebs in that range, and most of them are over the hill, like Alan Bean, Alan Keyes, Alice Rivlin, Amber Rose, Andrew Fastow (I guess he’s out of prison for Enron by now), Andy Dick, Angela Davis, Bart Starr, and Bela Karolyi.

      In contrast, in the $30,000 to $50,000 range, the bureau features such luminaries as Adam Carolla, Art Laffer, Boomer Esiason, Carmen Electra, Clayton Christensen (of Harvard BS), etc.

      So, there’s a lot of upside if you can become a little more famous than Dr. Dweck is right now.

    • Steve Sailer says:

      Here are some academics listed on the All American Speakers Bureau:

      Cornel West
      Dr. West’s Writing, Speaking, and Teaching Weaves Together the American Traditions of the Black Baptist Church, Progressive Politics, and Jazz
      Fee Range: $30,001 – $50,000 About Fees
      Travels From: New Jersey – NJ

      Richard Florida
      Urbanist and Commentator on Creativity and Innovation; Author of “The Rise of the Creative Class”
      Fee Range: $30,001 – $50,000 About Fees
      Travels From: District of Columbia – DC

      Laura J. Snyder
      Science Historian, Philosopher and Author of “The Philosophical Breakfast Club”; TED Talker
      Fee Range: $5,001 – $10,000 About Fees
      Travels From: New York – NY

      Michael Beschloss
      Michael Beschloss is an award-winning historian of the Presidency and the author of eight books, including his most recent work, the acclaimed New York Times bestseller The Conquerors: Roosevelt, Truman and the Destruction of Hitler’s Germany, 1941-1945.
      Fee Range: Please Contact About Fees
      Travels From: District of Columbia – DC

      Ray Kurzweil
      Inventor, Entrepreneur, Author and Futurist
      Fee Range: $50,001 and above About Fees
      Travels From: Boston – MA

      Adam Grant
      Professor, The Wharton School of Business at the University of Pennsylvania and Author, “Give and Take: A Revolutionary Approach to Success”
      Fee Range: $30,001 – $50,000 About Fees
      Travels From: Pennsylvania – PA

      Alice M. Rivlin
      Founding director of the Congressional Budget Office, served at the Department of Health
      Fee Range: $10,001 – $20,000 About Fees
      Travels From: Washington – DC

    • Troy says:

      Steve: perhaps I’m just not in tune with all of the professional incentives in social science right now, but in most fields — certainly my own — popular publishing is (I think wrongly) very much frowned upon, and contributes almost nothing towards tenure. So you’re balancing financial incentives in terms of speaker’s fees etc. against what strike me as much larger academic disincentives (not to mention the fact that most academics are not great public speakers).

      • John Schilling says:

        Someone is making $30,000 for one or two hours of speechifying, and you think saying “…and so now we will not let you be a Tenured Professor of Somethingology” is a large disincentive?

        • Troy says:

          Academics aren’t choosing between making $30,000 an hour on speaking or becoming tenured. They’re choosing between focusing their energies on making $30,000 an hour with a 1% chance of succeeding vs. focusing their energies on getting tenure with an 80% chance of succeeding.

          Perhaps those percentages are off, but even if it’s, say, 5% vs. 50%, working towards tenure still seems the rational course of action.

          • HeelBearCub says:

            There is essentially zero chance that 1% is too low an estimate. There are too many professors and not enough high-profile speaking engagements.

            It was projected that the increase in number of professors in the US between 2004 and 2014 would be 524,000.

          • Whatever happened to Anonymous says:

            I’d say achieving tenure is also significantly below 50%, wasn’t this discussed extensively in a former post?

          • Troy says:

            Could well be. It of course depends also on your field and personal skills/qualifications.

          • Other says:

            Tenured academics remain tenured, and pretty much all influential academics are tenured. Practically no one can go the academic route to becoming a public speaker without first making it through the tenure track. So what actually happens in terms of incentives is that prior to receiving tenure, people have a strong incentive to do whatever maximizes their chances of progressing through the next bottleneck towards tenure if that’s what they are aspiring to do. College students maximize their chances of getting into a PhD program. PhD candidates maximize their chances of getting a good post-doc fellowship. Post-docs maximize their chances of becoming associate professors. Associate professors maximize their chances of getting tenured. And then tenured professors maximize their chances of becoming influential either within their field or by writing for more mainstream audiences. (Becoming prominent in one’s field before writing for a popular audience seems like a more-winning strategy than first aiming to write for a popular audience. Chomsky, Levitt, Christensen, Ariely, Pinker, Hawking, and most of the other big-name authors who are also academics were already tenured professors who were well-known within their field before they wrote a book that gained more mainstream popularity. The only noteworthy exception to this rule I can think of is Dawkins, who fell off the tenure track and was merely a lecturer/reader by the time The Selfish Gene made him famous. Even Gould, who is pretty much the iconic example of a pop-science scientist that gives pop-science a bad name among people who take scholarship seriously, became one of the biggest names within paleontology long before he wrote anything for a mainstream audience.)

            I don’t know what your cutoffs are for picking your numbers… but if you are counting PhD candidates who hope to someday become professors as academics (which thepenforests’ comment that you replied to was), they seem pretty far-fetched to me.

            After getting into a PhD program, academia becomes much more selective… I wouldn’t be terribly surprised if 50% of people who manage to become associate professors eventually get tenure, but that’s already pretty far along the path to becoming an academic. But that number does seem pretty high to me.

            By my estimations, fewer than 5% of the people who go to grad-school hoping to eventually become tenured professors end up becoming tenured professors, and less than 1% of the tenured professors hoping to become celebrity researchers end up making it to the level that allows them to get enough speaking gigs that the speaking brings in more income a year than their day-job as a professor.

            In math and physics at my university (which are the only departments I interacted with enough to have any data), there were many more PhD candidates and post-docs hoping to eventually become professors than there were total professors (associate+tenured), and there were more associate professors than there were tenured professors, and the average amount of time people spent tenured once tenured was at least three times as long as the average PhD candidate/post-doc spent in those two phases of their careers, so that gives a 1/(2*2*3) = 8.3% likelihood of someone who had managed to become a PhD candidate eventually becoming a tenured professor… that’s back of the envelope from a limited sample, but all of my rounding is generous to the prospect of getting tenure and all of the included facts would apply to most universities.

            It might not be fair to your comment to count all PhD candidates who hope to become professors as academics, even though most of them would describe themselves as such. My general impression of the weed-out process in academia is that it is practically non-existent until post-doc fellowships. 10/11 of my friends from undergrad that applied to get into a PhD in physics got into a program (all 10 of them hope to eventually become professors). I don’t think this is non-representative of people applying to PhD programs in hard sciences from top-tier universities. (I can’t compare to other universities, but the same heuristic holds for people interested in other hard sciences that I knew from my university. My friend group was drawn disproportionately from the hard sciences, but it was not drawn disproportionately from the people in the top of the class within those disciplines, I don’t think.)
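The back-of-the-envelope tenure estimate in this comment can be written out as a short calculation. This is only a sketch of the commenter's arithmetic: the factors 2, 2, and 3 are the rounded ratios stated in the comment, and the function and parameter names are made up for illustration.

```python
def tenure_odds(trainees_per_professor=2,
                professors_per_tenured=2,
                tenured_to_trainee_years=3):
    """Rough chance that a PhD candidate ends up tenured.

    All three defaults are the rounded figures from the comment,
    not measured data:
      - at least 2 trainees (PhD candidates + post-docs) per professor
      - at least 2 professors in total per tenured professor
      - tenured careers last at least 3x as long as the trainee phase
    """
    return 1 / (trainees_per_professor
                * professors_per_tenured
                * tenured_to_trainee_years)


print(f"{tenure_odds():.1%}")  # 8.3%
```

Raising any of the three ratios lowers the estimate, which is why the commenter calls the rounding generous to the prospect of getting tenure.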

          • Troy says:

            Thanks for the thoughtful comment. I think you (and Steve) are right that the incentives change after tenure, and that the overall incentive structure is more like you described: work towards tenure until you get it, then work towards influence, which may come in the form of being a public intellectual but usually even then goes through prestige in one’s field first. And as I reflect on my own field, it does seem right that the big “public intellectuals” became so post-tenure.

            I don’t know what your cutoffs are for picking your numbers… but if you are counting PhD candidates who hope to someday become professors as academics (which thepenforests’ comment that you replied to was), they seem pretty far-fetched to me.

            I was thinking mainly of pre-tenure professors with those numbers. I suppose I was assuming that Ph.D. candidates aren’t going to be at a place in their career where they would pursue the popular media option. But upon reflection it seems to me to actually be anecdotally more likely for Ph.D. candidates than pre-tenure professors to go the popular media route – albeit on a much smaller scale than Carol Dweck et al. – perhaps partly because it looks like a better gig than actually finishing their Ph.D.

            In math and physics at my university (which are the only departments I interacted with enough to have any data), there were many more PhD candidates and post-docs hoping to eventually become professors than there were total professors (associate+tenured), and there were more associate professors than there were tenured professors, and the average amount of time people spent tenured once tenured was at least three times as long as the average PhD candidate/post-doc spent in those two phases of their careers, so that gives a 1/(2*2*3) = 8.3% likelihood of someone who had managed to become a PhD candidate eventually becoming a tenured professor… that’s back of the envelope from a limited sample, but all of my rounding is generous to the prospect of getting tenure and all of the included facts would apply to most universities.

            I think your numbers are probably more accurate than my earlier numbers, but doesn’t this presume a fixed size, i.e., that academia is not growing?

          • One small quibble. It isn’t $30,000 an hour, it’s $30,000 a day, because you have to allow for the time spent getting to and from an event that is probably a fair distance from where you live. Still pretty good pay, however.

          • I don’t know if Other is basing his picture of the academic hierarchy on a country other than the U.S., but he seems to have left out one step—assistant professor. That’s the position the post-doc is trying for. The next important step is tenure, which may come either with promotion to associate professor or, later, to full professor.

          • DES3264 says:

            “In math and physics at my university … there were many more PhD candidates and post-docs hoping to eventually become professors than there were total professors ”

            But many institutions have no Ph. D program at all, and all of their professors need to come from somewhere. Your numbers are reasonable for the odds of becoming a professor at an institution that pursues graduate research, but should be raised significantly when you include all the undergraduate-only institutions. (That said, thanks for trying to get some numbers in the conversation! Such numbers are very hard to find.)

          • gwern says:

            It was projected that the increase in number of professors in the US between 2004 and 2014 would be 524,000.

            No. It was projected, back in 2007, that there would be an increase of 524,000 ‘postsecondary teachers’, which at an average salary of $64k in 2006, is not 524,000 professors. As the BLS comments of the ‘postsecondary teacher’ category involved in that projection, “Many jobs are expected to be for part-time or adjunct faculty.” (http://www.bls.gov/ooh/education-training-and-library/postsecondary-teachers.htm) Good luck making a speaking career on the strength of being an adjunct.

          • HeelBearCub says:

            @gwern

            Good luck making a speaking career on the strength of being an adjunct.

            The question proposed by Troy was what percentage of those pursuing an academic career focus on pursuing a speaking career, rather than optimizing their chance at tenure.

            524,000 is the right number to use. That is the overall increase in the number of people pursuing an academic career. I assure you, all of those adjunct faculty members would love to have tenure, and are far more likely to get tenure than get on the speaking circuit.

            The speaking circuit rewards people who are impressively credentialed in some way. Either they have an impressive array of academic honors, or an impressive number of sales of books, an impressive number of op-eds published in impressive news outlets, etc.

            None of those sets of impressive credentials has a readily available path for most academics. Tenure, on the other hand, while hard to get, has a readily available path. Apply for a tenure track job, be chosen for the tenure track job, do work that your university peers find to be worthy of tenure.

      • Douglas Knight says:

        What is your point?

        Regardless of whether Dweck made a reasonable gamble, the fact that she has a speaking agent shows that she is interested in speaking. Leaving aside the public speaking and the money, popular books can generate a lot of press coverage, which professors and deans also want. Writing a popular book may be a bad gamble, but it doesn’t take a lot of academics making that gamble for the people we hear about to be dominated by them.

        It is very strange for you to emphasize tenure. Once you have tenure, popular writing is still bad for promotion, but who cares?

        Dweck is almost 70. She spent decades working her way up through the system, moving from school to school, obtaining press coverage, until finally, at age 55, she moved to Stanford, wrote a popular book, had her press coverage explode, and finally had a shot as a speaker. Stanford may well have recruited her for this new phase in her career.

        (I’m not sure what Steve’s point was, either.)

        • Troy says:

          My point was that Steve seemed to me to be overemphasizing the extent to which “[a] lot of academics want a piece of what you might call the Motivational Writing and Speaking industry.” Obviously I don’t deny that some academics (like Dweck) are interested in that. I was just observing that academic incentives still favor not getting very involved in “popular” writing/speaking. I mentioned tenure because it is one of the biggest carrots the academic world offers, but obviously there are other incentives too.

          • Steve Sailer says:

            There are a lot of tenured professors out there. Even ones battling for tenure have career incentives to publish papers announcing results that Malcolm Gladwell and Company would find comforting and usable.

      • Steve Sailer says:

        Carol Dweck is 68. She came to Stanford when she was about 57, presumably with tenure. Public speaking allows her to sock away a nice nest egg. It also allows her to have more influence on her time by getting her ideas (and face) out there in front of more people.

        • Scott Alexander says:

          I’m a little creeped out by this morphing into a discussion on Dweck’s personal qualities. Let’s keep this about the study.

    • Steve Sailer says:

      Dweck’s career got a nice little push in 2002 from Gladwell’s article in The New Yorker: “The Talent Myth: Are Smart People Overrated.”

      http://www.newyorker.com/magazine/2002/07/22/the-talent-myth

  9. Anonymous says:

    Good Guy Scott Alexander:
    “You can continue to expect less blogging” Posts a new blog post every day for five days in a row.

    • Randy M says:

      I think the problem is he expects to have less writing time while he plans on working more, but on the job his mind is working on ideas that he feels compelled to type up. And research, footnote, annotate, etc.

    • stillnotking says:

      He’s making a meta-point about expectations, obviously.

    • Scott Alexander says:

      I really am really busy. For some reason I become more productive when I have less time. Either it has to do with me being so stimulated by working hard all day that I can’t turn off, or with time management being easier when you have so little of it – ie I don’t have to choose when to do the thing, I just do it in the only slot of free time that I have.

  10. While we’re discussing statistics, I’ve had an idea on how to help alleviate the replication crisis floating around in my head for the last few days:

    Rather than forcing journals to publish a truckload of negative results alongside the positive ones, just have researchers record each time they test a p-value. Then whenever they submit a paper, they get scrutinized based on how many p-value tests they did. Did the last 19 come up negative? Then no publication for you. This seems like it would perfectly eliminate the Elderly Hispanic Woman Effect. Obviously you need to drill this in people’s heads as a norm such that not recording your p-value tests is as big a sin as outright falsifying data, but other than that it seems like an extremely cheap and easy way of improving things.

    It also seems so simple that I have a bad feeling someone else has thought of it before and found a reason why it won’t work.
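The worry behind "did the last 19 come up negative?" can be made concrete with a quick calculation. The sketch below assumes every test is independent and that no real effect exists in any of them (both are simplifying assumptions, not claims from the comment); the function name is made up for illustration.

```python
def chance_of_any_false_positive(n_tests, alpha=0.05):
    """Probability that at least one of n independent tests comes out
    'significant' at level alpha when every null hypothesis is true."""
    return 1 - (1 - alpha) ** n_tests


for n in (1, 5, 20):
    print(n, round(chance_of_any_false_positive(n), 3))
```

Under those assumptions, 20 tests at the 0.05 level give roughly a 64% chance of at least one spurious hit, which is why a log of how many tests were run matters when judging a single reported result.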

    • Carl Shulman says:

      What Gelman describes as the forking path problem is that researchers don’t run that many tests: they ‘eyeball’ the data and make only a few tests they can see are likely to pass, then convince themselves they didn’t really engage in multiple testing that has to be corrected for (or just knowingly engage in the practice knowing that it misleads).

      • Steve Sailer says:

        Personally, I don’t think the specific practices that Gelman complains about are the heart of the problem. In the Big Picture, what’s really missing is a critical attitude and a culture of doing quick reality checks on assertions in papers.

        And a lot of that is due to a culture of what Orwell called “crimestop” or “protective stupidity.” People murkily grasp that asking a series of hard questions might lead you to thoughts that could get you in big career trouble. So, why bother?

        • suntzuanime says:

          I think people do badly enough on politically-non-radioactive scientific questions that political considerations can’t be the main source of the problem.

          • JK says:

            I agree. What Sailer says is just a small part of the problem. See Greg Francis’s work, the P-curve approach, or the Replication Index, for example. Social science research on topics with no clear political implications is as biased as any other.

          • HeelBearCub says:

            Steve’s got an axe, and by gum he will grind it…

            When 1) professors/researchers are expected to self fund via grants, 2) grant money is far more likely to go towards “successful” research because 3) the broad populace doesn’t want to hear about money being “wasted” on unsuccessful research, and 4) money available for research is under downward stress, what outcome should we expect?

          • Steve Sailer says:

            But look how much better amateurs have done in the politically safe field of baseball statistics than professionals in the social sciences. You don’t need to cultivate Crimestop in baseball stats (except on steroids), but it is deleterious to science: “Crimestop refers to the ability to stop short of any thought that might be heretical or unorthodox before it is even thought, as if by instinct. It is the ability to misunderstand analogies, fail to perceive logical errors, and be repelled or bored by any train of thought or conversation that might be inimical to Ingsoc.”

          • ddreytes says:

            @ Steve Sailer: I’m not sure how I feel about that analogy, because I think you’re overestimating how well amateurs have actually done in baseball statistics.

            I think baseball stats look pretty good from the outside, in large part, because the state of the field was so dismal for so long – when you were in a situation where the conventional wisdom was totally and fundamentally wrong about the basic nature of the game of baseball, it’s not that hard to look good. And amateur work has been pretty decent at finding good measures for one specific part of baseball (offense).

            On the other hand, offense is perhaps the easiest thing to measure in baseball, and amateur work has been much worse at measuring the harder things – and this has become more and more true as there has developed more and more of an actual professional field of baseball statistics. At this point, if you’re talking about things like defense or a catcher’s ability to frame pitches, the statistics that are good at measuring that are almost certainly going to be professional and non-public. We don’t actually have good public ways of measuring that.

            And what’s more, I think that there are actually a lot of similar dynamics in the field that we’re talking about here – when you look at (for instance) Fangraphs and UZR, there’s certainly a narrative you can tell there about things with shitty outcomes being promoted anyway. UZR is frankly a crummy stat for measuring defense, and yet it gets a lot of play because it’s publicly available, and specifically associated with a well-known blog. It’s not ‘crimestop’ maybe but it’s not a great situation either.

            Or…. maybe I just want to talk about baseball a bunch. Could be.

          • John Schilling says:

            I think baseball stats look pretty good from the outside, in large part, because the state of the field was so dismal for so long – when you were in a situation where the conventional wisdom was totally and fundamentally wrong about the basic nature of the game, it’s not that hard to look good

            Certainly. Might not the social sciences be another field which has long been in a dismal state because conventional wisdom was totally and fundamentally wrong about the basic nature of the game?

          • ddreytes says:

            @ john schilling

            Uh? Maybe. I don’t really know and I’m kind of confused about where you’re going with the line of argumentation.

            I want to clarify the point of my comment, which was not to make a judgment on social science as it exists one way or the other. It was in the context of the argument over why the social sciences often seem to do badly, and the argument as to whether that was because of political bias in social sciences experimentation and reporting. Now, it may or may not be because of political bias – but my point was, I don’t think a comparison to baseball statistics proves anything either way, because (publicly available) baseball statistics aren’t as good as you might think and seem to have many of the same issues despite being apolitical. And the point I was making in passing was that one of the reasons, in my view, that baseball statistics seem to be so successful is because they were starting from such a low point. Whereas we’re able to see the flaws of social sciences much more easily, I think, because they’ve not had as easy a time of it.

            Your point seems (and correct me if I’m misreading here) to be making an argument about the merits of the social sciences as against their critics – and whatever opinion I have about that issue, it’s not really all that connected to the point that I was originally making, which was pretty specifically about whether baseball statistics were more successful because apolitical. Which is fine – it just kind of confused me for a second, so lmk if I am misreading you here.

            All of those caveats said… it’s certainly possible that there are basic and fundamental errors in the social sciences comparable to the thing. I don’t think I’m qualified to say either way. It does seem to me unlikely that there’s going to be anything comparable in the study of people, because I think human society is fundamentally substantially more complex than baseball. But I would certainly not say anything definite either way.

          • 27chaos says:

            Baseball has better data and is a simpler game.

          • haishan says:

            (oooh a chance to argue about baseball statistics on SSC)

            I think baseball statistics are very good! Take the correlation between team WAR and wins. It’s not perfect, but r = .86, and when you factor in how much of a team’s win-loss record is pure luck (sequencing of hits, winning and losing close games, etc.), there’s not a whole lot of room to improve. Defense in particular lags behind, and catcher framing is sort of included in WAR but is misattributed to pitchers… but these are relative nitpicks compared to the clusterfuck that is much of social science.

            That said, I will respectfully disagree with Steve here; the salient difference is that there is a vast trove of public data about baseball; to a certain granularity we know everything that happened on a major league field in the past four decades. Furthermore baseball’s structure makes it unusually amenable to statistical analysis; compare basketball and American football. Social science has vastly more variables, interacting in vastly more complex ways.

            I do wish there were something analogous to the sabermetric community — a community of statistically-savvy amateurs, interested in social science but not concerned with whether their work has The Wrong Implications, ideologically. Inasmuch as Steve does this kind of thing, he’s doing the Lord’s work, but it’s not clear whether this approach can scale in the current political environment.

    • suntzuanime says:

      Or, rather than forcing them to ignore one result for every 19 failed tests, just correct for the number of failed tests by making the p-value necessary to establish significance harder and harder for each one.
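One standard way to make the significance bar "harder and harder" with each extra test is the Bonferroni correction. The comment does not name a method, so treating it as Bonferroni is an assumption; the sketch below simply divides the overall error budget by the number of tests recorded so far.

```python
def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test p-value cutoff that keeps the overall chance of any
    false positive at or below alpha across n_tests tests."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def is_significant(p_value, n_tests, alpha=0.05):
    """True if p_value clears the corrected cutoff."""
    return p_value <= bonferroni_threshold(n_tests, alpha)


# A p-value of 0.03 passes as a lone test but not as the 20th test.
print(is_significant(0.03, n_tests=1))   # True
print(is_significant(0.03, n_tests=20))  # False
```

Bonferroni is conservative; less strict corrections exist, but the direction of the adjustment is the same.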

      • Tom Womack says:

        Do you lose much by insisting on a p-value of 0.000001? It only roughly doubles the size of the study group you need, because the normal distribution drops off so fast; you can sort of argue that this is unethical if the study group is made of mice and you have to blend their brains to figure out what’s happening, but not if your cost is giving $25 amazon vouchers to undergraduates.

        • Peter says:

          I remember that the particle physicists like their p-values nice and small. Let’s see… ah yes, 0.003 = “evidence of”, 0.0000003 = “discovery”. I played around with some random Gaussians and t-tests, and I think “roughly doubles” is a bit of an understatement – I think it’s more like a factor of four or five.

          Standard error scales with the square root of the sample size – I think that’s it.

          Quibbles aside, I like the suggestion.
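The "factor of four or five" can be checked with the textbook sample-size formula for a z-test, where the required sample size for a fixed effect is proportional to (z for the significance level + z for the power) squared. The sketch below assumes a two-sided test and 80% power, neither of which is specified in the thread, and uses only the Python standard library.

```python
from statistics import NormalDist


def sample_size_ratio(alpha_strict, alpha_base=0.05, power=0.8):
    """How many times larger the sample must be to detect the same
    effect at alpha_strict instead of alpha_base (two-sided z-test)."""
    nd = NormalDist()
    z_power = nd.inv_cdf(power)

    def relative_n(alpha):
        # Required n is proportional to this squared sum of z-scores.
        return (nd.inv_cdf(1 - alpha / 2) + z_power) ** 2

    return relative_n(alpha_strict) / relative_n(alpha_base)


print(round(sample_size_ratio(0.000001), 1))
```

With these assumptions the ratio comes out a little above 4, closer to the "four or five" in this comment than to "roughly doubles". A different power target shifts the number somewhat.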

        • suntzuanime says:

          If you think people in the social sciences are hostile to attempted replications now, just imagine how bad it would get if they didn’t have the 1/20 fluke rate to fall back on when their effects failed to replicate!

          • Dude Man says:

            Well, if you use a familywise error rate and only one of your hypothesis tests is significant, you could still say that there was a 1/20 fluke rate that one of them wouldn’t fail overall and it just so happened that the significant result is whatever the other person tried to replicate.

          • suntzuanime says:

            Yes; this was specifically in response to the idea of just insisting on a p-value of 0.000001.

  11. magicman says:

    One study will show an effect for only low GPA, another for only high GPA, one for minority students, one for non-minority, one for women, one for men etc. Sooner or later someone will come along and say hey it works for everybody.

  12. Timothy says:

    If the interventions helped students with low GPA’s, but had a neutral effect overall, does that mean they made good students do worse?

    • Anonymous says:

      It wasn’t neutral overall. It was neutral on “students not at risk for dropping out of high school” (or “ordinary students”), the complement of the population on which it was positive.

    • Peter says:

      The overall effect was not statistically significant. Doesn’t mean it didn’t exist, just that if there was one, it was small enough to be lost in the noise.

      Statistical significance is weird and defies a lot of standard arithmetic. There can be no statistically significant difference between groups A and B, no statistically significant differences between groups B and C, and yet there can be a statistically significant difference between groups A and C.
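A small numerical illustration of the A/B/C point: the group means and the standard error below are invented purely for the example, and the comparison uses a simple two-sided z-test.

```python
from statistics import NormalDist

# Hypothetical group means and a common standard error for any
# difference between two groups (made-up numbers for illustration).
MEANS = {"A": 0.0, "B": 1.5, "C": 3.0}
SE_OF_DIFFERENCE = 1.0


def two_sided_p(z):
    """Two-sided p-value for a z statistic."""
    return 2 * (1 - NormalDist().cdf(abs(z)))


def p_for(group_1, group_2):
    """p-value for the difference between two groups' means."""
    z = (MEANS[group_2] - MEANS[group_1]) / SE_OF_DIFFERENCE
    return two_sided_p(z)


for pair in (("A", "B"), ("B", "C"), ("A", "C")):
    print(pair, round(p_for(*pair), 4))
```

Each adjacent pair differs by 1.5 standard errors, which misses the 0.05 cutoff, while A and C differ by 3 standard errors, which clears it easily.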

  13. TomA says:

    I think you are ignoring the elephant in the room. This study title (and the revealed manipulation of the analysis) are deliberate and intentional acts of deception as opposed to inadvertent incompetence. Being misled about the efficacy of growth mindset is a minor level of harm compared to the damage done to the integrity of the scientific method. The former is malfeasance, the latter is criminal.

  14. Julie K says:

    How can any 15-minute intervention have an effect, when they tell us that what a parent does over the course of 18 years hardly makes a difference?
    (Irony alert: I’m sitting at the breakfast table with my daughter, ignoring her while I write this. Just think how great an effect I could have if I stopped reading SSC to give her a 15-minute pep talk on growth mindset!)

    • Harald K says:

      It’s usually different people who say those things, though. And while it’s easy to cheat and squint and peek at the data to find “evidence” by e.g. looking at cherry picked subgroups, it’s even easier to NOT find evidence if that’s your goal.

    • Scott Alexander says:

      This subtlety tends to get lost, but when people say parents can’t affect their kids, what they mean is “can’t affect their kids’ personalities in a way that lasts past their contact with their children and carries over into different domains of life.”

      So there might be some wiggle room to say that an intervention at school affects performance at school, at least in the short-term and at the same school where it was given.

      You’re right that this is a big contradiction, and I am getting increasingly frustrated with twin studies showing completely opposite results to everything else, but I still don’t have a good solution.

  15. I was rather puzzled when I saw that the title of the paper was about “academic overachievement” – like over-achievement is a bad thing?? But then I read the paper and noticed the title uses the word “underachievement”, which makes more sense.

    On to more serious matters. The whole business of collapsing the interventions into one analysis for satisfactory course completion seems incredibly dodgy, especially as they report results for each of the three interventions when GPA was the dependent variable. However, the results for each intervention regarding satisfactory course completion are reported in the supplementary materials. The truth is not pretty. I’ll just quote a chunk of text describing the results for the benefit of the statistically inclined:

    We focused on the “collapsed treatment” regression because this analysis is less statistically powerful than the analysis of semester GPA due to the smaller sample (only at-risk students) and the less sensitive binary outcome. However, we also conducted a “by condition” regression; this was equivalent except that it used individual treatment contrasts in place of collapsed treatment effects. The model revealed a significant Time x Purpose interaction, OR=1.58, Z=1.97, p=0.048, 95% CI [1.00, 2.48], a trending Time x Mindset interaction, OR=1.38, Z=1.52, p=0.13, 95% CI [0.91, 2.10], and a marginal Time x Combined interaction, OR=1.52, Z=1.78, p=0.075, 95% CI [0.95, 2.41].

    The takeaway message: the sense of purpose intervention produced a statistically significant result; the growth mindset and the combined interventions each did not. The result for growth mindset was not even close to significance, yet they describe p = .13 as trending. Come on, really! The combined intervention was closer to a conventional level of significance with p = .075.
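For anyone who wants to check the arithmetic, here is a minimal sketch (not taken from the paper or its supplement) of how the quoted p-values and confidence intervals relate to the reported odds ratios and Z statistics. It assumes a standard two-sided Wald test on the log odds ratio, and the function names are made up for illustration. Because the reported OR and Z values are rounded, the output should land close to the quoted figures rather than match them to the last digit.

```python
import math

def two_sided_p(z):
    """Two-sided p-value for a standard normal test statistic z."""
    return math.erfc(abs(z) / math.sqrt(2))

def wald_ci(odds_ratio, z, crit=1.96):
    """Approximate 95% CI for an odds ratio.

    Assumption: the standard error of the log odds ratio is recovered
    from the reported Z as SE = ln(OR) / Z.
    """
    log_or = math.log(odds_ratio)
    se = log_or / z
    return math.exp(log_or - crit * se), math.exp(log_or + crit * se)

# Values as quoted from the supplement: (label, OR, Z)
reported = [
    ("Time x Purpose", 1.58, 1.97),
    ("Time x Mindset", 1.38, 1.52),
    ("Time x Combined", 1.52, 1.78),
]

for label, odds_ratio, z in reported:
    lower, upper = wald_ci(odds_ratio, z)
    print(f"{label}: p = {two_sided_p(z):.3f}, 95% CI = [{lower:.2f}, {upper:.2f}]")
```

The point of the exercise: only the purpose contrast sits under the .05 line, and only barely, with a confidence interval whose lower end is essentially 1.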

    So what can be seen in this paper is that a result for growth mindset intervention specifically is reported in the body of the paper when it is statistically significant, yet shunted off into the supplementary material (which fewer people will be inclined to read as it usually contains the more boring stuff) when it is not significant. I suppose this is not technically dishonest, yet it does present a distorted and misleading picture of what was actually found.

    • Scott Alexander says:

      Thanks for that. It was just what I was looking for and I should have known there’d be a supplement somewhere.

  16. Markus Ramikin says:

    People calling everything a “deconstruction” is the only thing in the modern world more annoying than MLP fans.

    • ShardPhoenix says:

      At this point the reflexive snarky backlash is worse than the original overuse IMO.

      • James says:

        Will I be ahead of the curve if I complain here about the reflexive backlash against the reflexive backlash embodied by, say, ShardPhoenix’s comment?

    • Deiseach says:

      Derrida is actually better than his fanboys; the tiny, tiny exposure to his work that I had* showed that there is real mental activity, real thinking, going on.

      You don’t need to agree with it, but he does show evidence of having an intellect and using it 🙂

      *Through a book on the uncanny that I made the misfortune of purchasing. Written by an obviously third-rate** English professor of English at an English red-brick university, who fanboyed all over the place about Derrida and Derridaism until I was ready to burn Derrida – about whom I knew nothing but his name and what this prat was telling me he and his chums understood Derrida to be doing – at the stake from sheer annoyance, and whose version of what Derrida was on about struck me as incoherent and trite.

      He then made the error of quoting some actual Derrida, rather than “Here’s my second-hand version”, and the difference was astounding. Derrida may be a typical French intellectual, but it’s undeniable that he is an intellectual. His fanboys? That’s a different matter.

      ** It may be that I am very, very stupid; the reviews laud this to the skies, but I think it’s dreadful and not half as clever as it thinks it is.

      • James says:

        > there is real mental activity, real thinking, going on

        Which is, crucially, different from being correct. I’ve no doubt that Derrida is a genius of some kind, but it’s possible to be a genius and for almost all of your claims to still be wrong. Compare theologians making brilliantly reasoned, impeccably-cited arguments about how many angels can dance on the head of a pin. Or, for a more concrete and more SSC-familiar example, Chesterton.

        • Jaskologist says:

          I’m going to have to be pedantic here and point out that there’s no evidence people actually seriously debated angels dancing on pins.

          • Nisan says:

            As your link points out, Aquinas did seriously consider whether multiple angels could occupy the same point in space. It’s pretty clear to me that “Scholars debated how many angels may dance on the head of a pin” is a colorful or pop-theology way of saying “Scholars studied and debated questions like whether multiple angels can occupy the same point in space”. Am I the only one who thinks this?

            It’s kind of like how cosmologists today can be said to debate whether an astronaut falling into a black hole would turn into spaghetti or encounter a wall of fire. They aren’t really talking about what happens to astronauts, they’re talking about the properties of black holes. But as a layperson I’m happy to use the more colorful language.

          • Anonymous says:

            As your link points out, Aquinas did seriously consider whether multiple angels could occupy the same point in space. It’s pretty clear to me that “Scholars debated how many angels may dance on the head of a pin” is a colorful or pop-theology way of saying “Scholars studied and debated questions like whether multiple angels can occupy the same point in space”. Am I the only one who thinks this?

            It’s kind of like how cosmologists today can be said to debate whether an astronaut falling into a black hole would turn into spaghetti or encounter a wall of fire. They aren’t really talking about what happens to astronauts, they’re talking about the properties of black holes. But as a layperson I’m happy to use the more colorful language.

          • HeelBearCub says:

            @anonymous: “Philosophers of yesteryear debated such bizarre arcana as whether pushing a fat man from a bridge would stop a trolley and whether this was moral…”

          • Nornagest says:

            TIL: angels are bosons.

          • James says:

            For the record, I’m aware that the angels-on-the-head-of-a-pin line lies somewhere between ‘egregious oversimplification’ and ‘made up’. But I went for it anyway because it expressed what I wanted.

            (Believing you might get away with using a pop simplification as a figure of speech on a forum full of rationalists? Talk about naive…)

          • Anonymous says:

            @Nornagest, on the contrary, Aquinas concludes that two angels can’t occupy the same place and therefore are fermions.

            Honestly, after reading that passage from Aquinas this seems like a perfectly fair thing to mock.

        • XerxesPraelor says:

          But if angels existed, that would be one of the first things that rationalists would investigate – is there some resource of “space” that angels inhabit, or is it a completely different plane.

          The actual question there is valid and relevant.

          • James says:

            Right. But my point is exactly that it’s all wasted effort because the underlying axiom is false; angels don’t exist. (Fleshing out the rest of the analogy to the current case is left as an exercise for the reader.)

        • Deiseach says:

          I’ve tried tracking down the “angels on pinhead” quote, and the nearest I’ve come to an answer is that it may originate with, or at least was popularised by, Benjamin Disraeli’s father, Isaac Disraeli, as one of the Victorian progressive jibes about the musty old Middle Ages – though it may date as far back as the 17th century.

          Alas, it is less a “pop culture version” of a real question and more one of those “everybody knows” type things (you know: everybody knows Columbus set sail to prove the world was round, etc.)

          As to Chesterton, I can but answer by getting Hilaire Belloc to speak for me, in the verse he wrote on this very topic – but before we start, we do all know what a don is, don’t we? (Have patience with me; I often see a lack of knowledge about things I’d consider commonplace, but then I’m one of the dinosaurs) :

          Lines to a Don
          By Hilaire Belloc

          Remote and ineffectual Don
          That dared attack my Chesterton,
          With that poor weapon, half-impelled,
          Unlearnt, unsteady, hardly held,
          Unworthy for a tilt with men —
          Your quavering and corroded pen;
          Don poor at Bed and worse at Table,
          Don pinched, Don starved, Don miserable;
          Don stuttering, Don with roving eyes
          Don nervous, Don of crudities;
          Don clerical, Don ordinary,
          Don self-absorbed and solitary;
          Don here-and-there, Don epileptic;
          Don puffed and empty, Don dyspeptic,
          Don middle-class, Don sycophantic,
          Don dull, Don brutish, Don pedantic;
          Don hypocritical, Don bad,
          Don furtive, Don three-quarters mad;
          Don (since a man must make an end),
          Don that shall never be my friend.

          Don different from those regal Dons!
          With hearts of gold and lungs of bronze,
          Who shout and bang and roar and bawl
          The Absolute across the hall,
          Or sail in amply billowing gown
          Enormous through the Sacred Town,
          Bearing from College to their homes
          Deep cargoes of gigantic tomes;
          Dons admirable! Dons of Might!
          Uprising on my inward sight
          Compact of ancient tales, and port
          And sleep — and learning of a sort.
          Dons English, worthy of the land;
          Dons rooted; Dons that understand.
          Good Dons perpetual that remain
          A landmark, walling in the plain —
          The horizon of my memories —
          Like large and comfortable trees.

          Don very much apart from these,
          Thou scapegoat Don, thou Don devoted,
          Don to thine own damnation quoted,
          Perplexed to find thy trivial name
          Reared in my verse to lasting shame.
          Don dreadful, rasping Don and wearing,
          Repulsive Don — Don past all bearing.
          Don of the cold and doubtful breath,
          Don despicable, Don of death;
          Don nasty, skimpy, silent, level;
          Don evil; Don that serves the devil.
          Don ugly — that makes fifty lines.
          There is a Canon which confines
          A Rhymed Octosyllabic Curse
          If written in Iambic Verse
          To fifty lines. I never cut;
          I far prefer to end it — but
          Believe me I shall soon return.
          My fires are banked, but still they burn
          To write some more about the Don
          That dared attack my Chesterton.

        • “but it’s possible to be a genius and for almost all of your claims to still be wrong. … Or, for a more concrete and more SSC-familiar example, Chesterton.”

          At least one of GKC’s claims was wrong—his refutation of evolution depended on his not understanding it. But I don’t think almost all of them were. Compared to his more in fashion opponents, Shaw and Wells, he looks pretty good.

  17. Shmi Nux says:

    Clearly the best mindset interventions are universal love and transcendent joy.

  18. ShardPhoenix says:

    Mostly thanks to this blog I’ve largely stopped believing in “social science” altogether. The field seems riddled with intellectual dishonesty, conscious or not.

    • Carl Shulman says:

      The preregistered audits that show most social science studies are false also show a large fraction are true, enough that you should still update significantly off the existence of social science claiming X, even though you shouldn’t yet think it likely.

      And for old well-established findings most are true (at least in that the original experiments replicate). See the Reproducibility Project and Many Labs Project.

      • Steve Sailer says:

        I’ve been a social science aficionado since 1972. I love the social sciences.

        You just have to read social science papers very carefully, just like while watching POW Jeremiah Denton explain to the TV camera how nicely his North Vietnamese captors were treating him, you had to notice he was blinking in an odd way.

        • Peter says:

          I sometimes compare a research paper to a sandwich; the introduction and conclusion are the bread, the methods and results are the filling. There seem to be quite a few papers which resemble prime steak between two bits of stale Chorleywood-process “bread”.

        • Murphy says:

          I remember when I realized how distorted things could get talking to a grad student in my old uni when we were out drinking.

          She was complaining about her supervisor being a dick and doing things like sharing her data with others before she could publish it herself under her own name. Her thesis was on risk taking behavior in extreme sports and she’d spent many months collecting data. I remember a conversation along the lines of

          “So are you thinking of switching supervisors?”

          “I would in a heartbeat, except the only other lecturer who could be my supervisor in the department is [academic’s name] and the only way I’d *ever* be able to get anything whatsoever published with her would be to make it all about how women are discriminated against in extreme sport.”

          “Oh, so your data showed women were discriminated against?”

          *snort of laughter* “Nope, but if I want to get away from [dickhead academic’s name] that’s what it’s going to have to show”

      • Steve Sailer says:

        For example, Raj Chetty’s big Harvard project on “Where are the lands of opportunity” has gotten tons of national publicity. Hillary Clinton is begging Raj for his insights.

        But the more I’ve looked at Chetty’s famous map and its supporting data, the more apparent it is that practically nobody has thought hard about what Chetty has found.

        For example, a big effect on his map is that being a blue collar teenager on the northern Great Plains in 1996 correlated with making a lot more money in 2011 than your parents made in 1996. Well, when I dug into Chetty’s online databases, it became obvious that large parts of his map are dominated by sparsely populated sections of the northern part of the center of the country from which a lot of early 30s blue collar guys who can stand working outdoors in North Dakota winters have been recently recruited to make big bucks in the fracking boom.

        In contrast, their equivalents in the Southeast, who typically had their best years working construction in the exurbs, got killed by the slowdown in construction after 2008. (Another factor is that the Southeast is filling up with illegal aliens who will work construction cheap, so white blue collar guys in the Southeast are getting their wages pounded down by Latin Americans who find the Southeastern climate acceptable, while North Dakota’s fracking fields are mostly worked by white guys used to frigid weather.)

        So, Chetty’s attempts to come up with some enduring explanations for some of his patterns — the Legacy of Segregation? Sprawl? — that don’t mention race are largely pointless because many of the effects he observed are temporary: North Dakota isn’t quite as booming today as in 2011 and North Carolina home construction isn’t as dead in the water anymore either.

        It mostly just takes some critical reading skills to turn the current morass of social science papers into something useful and informative.

    • Shieldfoss says:

      “Mostly thanks to a site that deliberately points out bad examples, I have come to believe that there are only bad examples.”

      I’m not sure that’ll work out for you. What are you going to go on instead of peer reviewed science? Your gut feeling?

      • ShardPhoenix says:

        If I had the impression that this kind of thing was a rare exception obviously I’d feel differently, but I don’t. And most reasonably intelligent people seem to get on fine mostly ignoring all of social science and relying on tradition, anecdotal evidence, and instinct. It’s not that absolutely all social science studies are wrong, but that they’re wrong enough often enough that there’s low-to-negative value in paying attention to them, especially at the pop-science headline level.

        • haishan says:

          “especially at the pop-science headline level.”

          This is key. Read the papers. Read the papers. Read the papers. Not the articles about the papers, not the abstracts of the papers, read the whole thing. Or at least the Methods section and the most important numbers and tables in the Results section.

          Even our esteemed host fails this sometimes; a couple months back he linked to a parenting-doesn’t-matter study which had the problem that its authors apparently forgot that power was a thing.

          • randy m says:

            Does it matter if we read the papers? Policy is made based on headlines, and executive summaries if we’re lucky. The real skill in social science is abstract writing to get a conclusion which connects to the data if you squint a bit. The research is nearly busy work.

          • haishan says:

            “For if ye read only the headlines, what reward have ye? Do not even the policymakers the same?

            “And if ye take the abstract at its face, what do ye more than others? do not even the science journalists so?”

          • Randy M says:

            Well, the point was that knowing the truth is nice, but only really matters if it is actionable. Some of it might be, especially for Scott or someone in a similar profession, but mostly would only matter in terms of wide scale policy, and few have much impact on such.
            Perhaps one could push for more or less growth mindset mantras in the local school district, and more power to you, but these are increasingly centralized anyway.

        • Troy says:

          I think avoiding social science altogether is the wrong reaction. There are important, robust results in social science that people should know about. Some, such as Milgram’s obedience experiments, are widely (and rightly) trumpeted. Others, especially those offending progressive “sacred values,” are not.

          So long as you have a critical eye, I think a good introductory textbook in, say, social psychology, can be very informative — you just have to be able to test claims against prior plausibility and where you would expect researchers’ bias to lie. Also informative can be reading the writing of contrarian social scientists, such as Lee Jussim or Jonathan Haidt.

          • JK says:

            Milgram’s experiments and the way he portrayed and interpreted them have been strongly criticized, and Milgram’s work may suffer from the same kind of problems as many other social science experiments.

          • Troy says:

            Fair enough. I’ve read some of the criticisms, though not the book being discussed here. I am confident that Milgram’s findings are by and large correct because of repeated replications of the experiment. But I’m quite willing to grant that some of his particular claims were ill-supported, based on bad experimental design, or both.

    • Scott Alexander says:

      https://slatestarcodex.com/2014/12/17/the-toxoplasma-of-rage/.

      I mostly just write about social science findings I’m skeptical of – the good ones get a links post if they get mentioned at all. I think overall there’s a lot of good there, especially if you read between the lines as is mentioned elsewhere on this thread.

  19. Aleph says:

    Note that this is only the fuckery that you see. There’s probably more fuckery going on that you don’t see. So update in the opposite direction of these people’s claims.

  20. Steve Sailer says:

    Personally, I don’t think the Motivational Speaking/Writing Industry is wholly a hoax.

    For example, there are a lot of high-earning salesmen out there who will spend their own money to hear motivational speakers. (Granted, a lot of salesmen want to get into the motivational speaking business themselves because it beats what they’re currently doing for a living, so a lot of salesmen are at motivational seminars to study the pros so they can get cushy careers giving motivational seminars … But, still …)

    But if we think of what Dweck is trying to do as being part of the venerable American tradition of motivational speaking/writing, we can get a more realistic perspective on its potential even when well executed. I have no idea if Dweck herself is good at motivating students, but I also don’t doubt that some people out there are, on the whole, pretty good and some are pretty bad.

    The problem with Dweckism is that people are hoping she’s going to discover some eternal scientific truth that can be used to make students more motivated forever and ever, like Newton’s law of gravity keeps working and working.

    But if we look at the motivational speaker industry, we see constant churn and little agreement. Sure, there are classics like Dale Carnegie and Napoleon Hill, but there is a huge amount of effort put into generating some level of novelty. Motivational speakers don’t want to be too original, but they also don’t want to be too repetitious of what everybody else is doing.

    Another thing we see is that different motivational speakers work best for different people. For example, my first roommate in college was devoted to playing Zig Ziglar tapes at all hours. I got a different roommate. On the other hand, I find Paul Graham’s essays on how to get rich in Silicon Valley admirable. Granted, Graham hasn’t yet inspired me to get rich in Silicon Valley, but, still …

    So, what we see is that motivating people isn’t like coming up with Maxwell’s equations; motivating people is like the motivational speaker business: a constant tumult of small successes dogged by declining impact as boredom and skepticism set in, offset by new fads and new personalities. In fact, that sounds a lot like the educational research business, as well.

    • Deiseach says:

      Yeah, the problem is that various bodies are going to seize on this as “Let the classes do two online sessions and bob’s your uncle – no need for teachers, support or intervention, the kids can monitor themselves and do it all online!”

  21. JK says:

    Aside from what Scott said, it’s also suspicious that they report all their effect sizes after adjustment for various covariates. “Zero-order” effects are not reported. They also use data only from “core courses”, and admit in a footnote that including data from all courses would have weakened the results. It’s not clear how they decided which courses were “core.”

    • JK says:

      It certainly seems that way. There are red flags all over that paper.

    • HeelBearCub says:

      Core courses are identified by the school system as such. The core curriculum is the required curriculum. Absent some reason to think otherwise, I don’t find that aspect of the study to be unclear.

      • Douglas Knight says:

        Nope, that isn’t their definition of “core.”

        And by providing an alternative definition, you demonstrate that there is room to wiggle, proving JK’s point.

        • HeelBearCub says:

          Nope, that isn’t their definition of “core.”

          The only sentence I can find that further describes the word core in the paper is the following:
          “We calculated each student’s end-of-semester GPA in core academic courses (i.e., math, English, science, and social studies)”

          In what way does that make you think the core courses are not the courses that the school has identified as core courses?

          To my knowledge, every US public school system has a set of courses that they refer to as core courses. Those are typically in the subjects mentioned. Different school systems use different curricula for these courses, but they are broadly similar, so much so that the “common core” curricula standard has been developed by a confederation of states at the national level (which has now become a political football).

  22. Deiseach says:

    I have to say, I am extremely sceptical of the results they are claiming. I’ve worked in a school designated as underprivileged under the criteria of a scheme from our Department of Education (a DEIS school https://www.education.ie/en/Schools-Colleges/Services/DEIS-Delivering-Equality-of-Opportunity-in-Schools-/) and in an associated programme for early-school leavers, which is exactly the target of this study.

    Low grades and eligibility for school lunches in half the schools surveyed are exactly the kind of school where I have experience, and I find it very, very hard to believe that two 45-minute online courses, a fortnight apart, had such a significant effect on the “at-risk” students. If an intervention like this could work wonders on some of the gurriers little darlings I encountered in Youthreach, it would be nothing short of miraculous.

    Because low grades (I’m taking it that the GPA of 2.0 and below corresponds to C and below) aren’t the only problem; they’re probably a symptom. There’s behavioural problems, learning difficulties, psychological problems, difficult family backgrounds, drugs, petty crime, and all kinds of upheaval. Truancy and absenteeism are real problems. Risk of suicide is a real problem (the example that comes to mind is that of two young sisters, lovely girls, to look at you would have no idea of any problems, and they were being monitored by social workers because both of them had made suicide attempts at ages 14 and 15). One of the students, aged 15, died because of a one-time bout of solvent abuse.

    Some students want to get on in school and will work hard, and will take advantage of any help they get. Often they haven’t been given any help until they start attending secondary school.

    Others don’t want anything to do with school and don’t care if they drop out and don’t care about “You are unprepared for work”. They’re on the fringes of petty crime and they’re not one bit concerned about that; in the Youthreach programme they were blatant about smoking weed right outside the main door at break times. It was not uncommon for lawyers to argue in court when they were up on charges “Your honour, my client is a young man/woman from difficult circumstances who has been offered a place on Youthreach blah blah blah” – in other words, the programme was being used as an alternative to a jail sentence or fine. I got accustomed to reading the court reports in the local paper and recognising the names of our ‘graduates’ turning up time after time for escalating seriousness of crime.

    I really, really would be fascinated to find out how two 45-minute sessions (and even there I’m amazed they got the kids to sit still and pay attention for 45 minutes at a time) could make such a change.

    I’m not saying that change is impossible; the single biggest thing that made the largest difference in helping raise literacy levels, involving pupils, getting them interested in class work and homework, and helping retention, was the Library Programme where we got a proper school library and a trained librarian on a pilot programme which our principal begged, cajoled and politicked to get. There are students who, despite their disadvantages, will probably get benefit from a study like this one, where there are sessions telling them “you’re not stupid, getting low grades does not mean you’re stupid, working hard means you can improve and staying in school means you can do things like ‘make your family proud’ and help yourself and others”. There are, as I’ve mentioned, others who don’t care and are only passing the time until they can get out as early as possible.

    The study seems to be aware that the results are a little cherry-picked; they mention that not all the students followed through or completed the two sessions and that not all the schools were able to comply with all the conditions, and they do point out that it’s a little odd that the students who went through the combined-interventions element did worse.

    So I would guess that this study is getting its good results from the more motivated, more able element of the students; the ones able to sit down and pay attention for a prolonged period, who completed both sessions, and who took the self-improvement message seriously and tried the ‘hard work, growth mindset’ lessons for themselves.

    • Irrelevant says:

      That finishing the study was functionally a test of which at-risk students were open to being motivated is plausible as a mechanism by which this intervention “works” at all, but then you have to explain why the control group does worse.

    • Kaj Sotala says:

      Now I’m curious. What exactly did that library program do? The link was uninformative.

      • Deiseach says:

        Basically it got kids reading who would not think of looking at a book; once they stopped treating books (including textbooks) as horrible devices of torture, and saw that they could find out information about things they liked and enjoyed, and that reading for pleasure was a thing, schoolwork stopped being this incomprehensible mass of demands and started making some sense to them.

        Irish schools don’t tend to have dedicated libraries; what generally happens is that one room gets designated “the library” and a spare teacher is roped into occasionally overseeing it, but then it starts getting used as a classroom and the library aspect is dropped.

        Having a dedicated library, with a trained librarian, who was participating in a pilot programme aimed at increasing literacy, in a school where the intake was for students of lower academic level (I’m not saying stupid, I’m saying that ours was slanted more towards vocational education – it used to be the Technical School), students from disadvantaged backgrounds, and students with learning difficulties/behavioural problems made a huge difference.

        Getting to use the library during the lunch hour became a privilege. You could lose that privilege by bad behaviour or not doing your work. Kids wanted to get into the library, so they could see the advantages of not time-wasting and messing when they were in there, and they pulled their socks up (so to speak). The librarian arranged things like getting Darren Shan in to give a talk (because of course they were devouring all the Cirque du Freak books), things like animal visits, storytelling sessions, art contests, etc.

        The kids could see how using books and online resources fit in with what they were doing in classes – looking for that motorbike or car logo you wanted to use in art class? history resources? videos and demos online? – and since it was all done in a “with a spoonful of sugar” way, it didn’t feel like “now you are learning and developing useful skills”. And of course, it made the idea of using the main library in town less scary, now that they knew the ropes (as it were) and knew what they wanted to do and what might be available.

        It was also tied in closely with the JCSP programme and our resource teacher worked hand-in-hand with the librarian. Stuff like using Dragon voice-recognition software with the headsets and computer equipment for pupils with learning difficulties may be routine in American schools (I have no idea) but this was all fruits of the largesse of the Celtic Tiger years when there was money flowing and the government was investing in all these initiatives.

        Getting the parents involved was a huge part of it also. Many of the parents had low-literacy or other problems stemming from their schooldays, so a culture of learning at home was not really on the cards. The librarian and resource teacher and other teachers involved in the JCSP programme would have the awards day (when the portfolios of work assembled by the students were awarded certificates) and they consciously made A Big Deal out of it; the library space was set up for the students and their parents, the students decorated the library with samples of their work, there were speeches (not too long!) and presentations of the certificates for completing the modules, the Home Economics class prepared the goodies for the buffet-style lunch afterwards, the whole nine yards 🙂

        Setting up a school library as a working resource is more than just having a room of books and/or maybe a few PCs the kids can access. You really do need a trained librarian who knows their stuff, who knows how to work in education, and who is collaborating with the rest of the teachers.

  23. Alsadius says:

    Public service announcement: This is exactly how global warming skeptics feel.

    • Wesley says:

      Except there’s actual *physics* involved in the best-guess theories for how global warming is happening. And data. Reams and reams of data. Binders full of [data]!

      And the skeptics against global warming, in a science sense (that is, trained and competent scientists who actually know the research) are a tiny, tiny minority. Less than 3%. And even those are mostly in agreement that human-generated pollution (largely CO2 and sulfates) are influencing climate; they just disagree that the warming is causally linked to the pollution, or disagree that warming is happening at all (although all evidence points to a significant rise in mean surface temperature over the last 150 years).

      At this point, there’s such a preponderance of evidence *for* global warming that the only things left to figure out are: a) the exact mechanisms that are linking the various climate systems, and b) accurate estimates of just how bad it’s going to be (which estimates, admittedly, are currently very rough).

      Plus, the risk factors for agreeing with AGW are … what, exactly? If we transition to a renewable energy economy before we run out of fossil fuels and clean up the atmosphere in the process, and it turns out that the warming observed was actually due to tiny green space aliens hiding under the oceans … so what? We still leave a cleaner, more sustainable world for our children and our children’s children.

      • Tom Womack says:

        “We still leave a cleaner, more sustainable world for our children and our children’s children.”

        Cleaner, more sustainable, and *poorer*; a world which is clean, sustainable, and in which electricity is sufficiently horribly expensive that all Web sites have to charge subscriptions to pay for the electricity to run their servers, is not obviously superior to a slightly dirtier one with electricity too cheap to meter.

        • Onshore wind is cheaper than coal, gas or nuclear energy when the costs of ‘external’ factors like air quality, human toxicity and climate change are taken into account, according to an EU analysis. The report says that for every megawatt hour (MW/h) of electricity generated, onshore wind costs roughly €105 (£83) per MW/h, compared to gas and coal which can cost up to around €164 and €233 per MW/h, respectively. Nuclear power, offshore wind and solar energy are all comparably inexpensive generators, at roughly €125 per MW/h. “This report highlights the true cost of Europe’s dependence on fossil fuels,” said Justin Wilkes, the deputy CEO of the European Wind Energy Association (EWEA). “Renewables are regularly denigrated for being too expensive and a drain on the taxpayer. Not only does the commission’s report show the alarming cost of coal but it also presents onshore wind as both cheaper and more environmentally-friendly.”

          • John Schilling says:

            …when the costs of ‘external’ factors like air quality, human toxicity and climate change are taken into account

            Applying some of that “deconstruction” stuff Scott was talking about, it is probably reasonable to assume that the EU analysis would conclude that on-shore wind power is more expensive if we don’t take climate change into account. Otherwise they’d have made the broader, stronger claim.

            Since the question at hand is, “what’s the cost if we assume the AGW consensus is real and it turns out not to be?”, the EU study seems to suggest that the cost is we wind up needlessly shifting to more expensive forms of power generation – a perfect example of what Tom Womack was claiming.

            I’m also inherently suspicious of any study that needs to invoke “air quality” and “human toxicity” to assert superiority over gas and nuclear power, and then there’s the bit where you come to us bearing one study. Part of the danger of overreliance on the AGW consensus is that it encourages scientists and activists to conduct bogus studies when they find that the honest ones aren’t persuasive enough to avert the predicted catastrophe.

      • DrBeat says:

        Plus, the risk factors for agreeing with AGW are … what, exactly? If we transition to a renewable energy economy before we run out of fossil fuels and clean up the atmosphere in the process, and it turns out that the warming observed was actually due to tiny green space aliens hiding under the oceans … so what? We still leave a cleaner, more sustainable world for our children and our children’s children.

        If you ever find yourself saying “The only cost of going along with my side is having more of a universally good thing!” then you have drastically fucked up your reasoning along the way.

      • Alsadius says:

        There’s good observational science involved in the growth-mindset stuff too. It’s not physics, but then climatology is almost as complex as psychology, so while physical effects underlie both, neither can really be called “physics”.

        And you’ll note one of Scott’s points was “I bet its more sinister cousin ‘all experimenters believe the same thing and have the same experimenter effects’ bias is alive and well.”

        FWIW, I generally believe in the climatology of global warming (though I think they have an annoying habit of omitting any mention of possible benefits, like the fact that the planet’s forests and crop yields are going up significantly due to the increased CO2), I just find the economics to be completely insane. The correct response to AGW is a small carbon tax to fund things like levees for low-lying areas, and business as usual otherwise. Every study that says it’s a big deal is based on assumptions that should get it laughed out of the room. People have a tendency to assume that there are no ongoing costs to green energy, just an expensive transition, but that’s patently false – fossil fuels are vastly cheaper on an ongoing basis, and doubling or tripling the cost of energy for all of civilization is the sort of thing that causes worldwide depressions. That is not to be brushed off as “It’s progress, what’s the big deal?”.

      • Wrong Species says:

        Do you know what made me incredibly skeptical about catastrophic global warming? “There is no pause, don’t be stupid.” “Oh yeah, we figured out why there is a pause, now that we have an ad hoc explanation for why our predictions failed to materialize it’s obvious we’re right. Man, aren’t those climate deniers stupid and evil.” I guess you could say that the estimates are “rough”.

        • James Picone says:

          This is not nearly as mutually contradictory as you claim.

          So, first off: there is no statistical evidence of a change in trend at any point after the 1970s. There are several ways of demonstrating this, but Tamino had the most elegant, in my opinion. The perception of a pause is caused almost entirely by starting from a high year in a noisy dataset, and usually also by picking a satellite dataset that responds more strongly to El Nino/La Nina, making the high points higher.

          But, scientists do sometimes care about that noise – so you get people doing research and going “Hmm, 2011 was probably lower than average because of a strong La Nina” or what-have-you.

          Finally, we can get a better picture of what the actual warming-due-to-GHGs signal is by compensating for various ‘noise’ influences that aren’t expected to vary on large timescales – like ENSO, TSI changes, vulcanism, etc. If you do that, you get something like Foster & Rahmstorf 2011, discussed here by Tamino/Foster https://tamino.wordpress.com/2011/12/06/the-real-global-warming-signal/, which notes that various noise influences tend to reduce the trend to present at the moment.

          And, finally, temperature to present is still very much within the model envelope, partially because the model envelope is huge (I’m not sure you’ve noticed, but surface temperature is crazy-noisy).

          But hey, let’s not let any actual statistics get in the way of a great meme for beating scientists over the head with.

          • “The perception of a pause is caused almost entirely by starting from a high year in a noisy dataset”

            That’s a good reason not to start with 1998. Instead, after eyeballing the data to see when the trend seemed to change, I fitted global temperature from a NASA web page from 2002 through 2013. The slope of the line was very slightly negative.

            I’m not sure you are allowing for how easy it is, if you don’t want to find a pause, to think of some way of measuring it that won’t show one. Consider how many judgment calls Tamino gets to make in constructing his test.

          • James Picone says:

            2002 is another higher-than-average point. You’re making the exact same mistake as starting at 1998, just with a different year. I can confirm that your 11-year-long trend is slightly negative, though, sure. Error bars include the trend from the mid-1970s, though. And seriously, look at it, it’s meaningless. Especially compared to the prior trend.

            Tamino’s visual demonstration doesn’t even have that many variables to fiddle – I count maybe four. Starting year, year the ‘pause’ begins, dataset used, and confidence interval to show. For those values, Tamino chooses a value that close-to-maximises the pre-‘pause’ trend (which is the opposite of what he wants if he wants to give a false impression), or takes the start of the dataset if it’s satellite (because they only start in 1979). For the pause start date, he uses the value that’s been shouted off the rooftops for years, and besides it works equally well for any pause start date. He’s done it with all four major datasets, and it works equally well (although the trend is different, of course. I think the original post was prior to the most recent serious cooling bias in RSS, too). And he uses 95% confidence interval, which is arbitrary but doesn’t seem particularly Evil here.

            There’s just no statistical evidence of a change in temperature trend after the mid 1970s. I’ve never seen any analysis showing one – just “Well it looks like it pauses” or “A linear trend from here is negative!” which aren’t the same thing.
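The point about short windows and error bars can be sketched on synthetic data. This is only an illustration, not a reanalysis of any real temperature record: the series is a constant trend plus white noise, the noise level `NOISE_SD` and the seed are made-up assumptions, and unlike the tests described in this thread it does not compensate for autocorrelation.

```python
import math
import random

def ols_trend(xs, ys):
    """Return (slope, standard error of the slope) for a least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residual_ss = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    std_err = math.sqrt(residual_ss / (n - 2) / sxx)
    return slope, std_err

# Synthetic series: fixed trend plus white noise. NOT real temperatures.
random.seed(0)
TRUE_TREND = 0.016  # per year, i.e. the ~0.16 per decade figure quoted in the thread
NOISE_SD = 0.1      # assumed year-to-year noise, chosen only for illustration
years = list(range(1975, 2014))
temps = [TRUE_TREND * (y - 1975) + random.gauss(0, NOISE_SD) for y in years]

def window(start):
    """Fit a trend from `start` to the end of the synthetic series."""
    xs = [y for y in years if y >= start]
    ys = temps[len(years) - len(xs):]
    return ols_trend(xs, ys)

long_slope, long_se = window(1975)
short_slope, short_se = window(2002)

# 1.96 * SE is a rough 95% interval (normal approximation, no autocorrelation).
for label, slope, se in [("1975-2013", long_slope, long_se),
                         ("2002-2013", short_slope, short_se)]:
    print(f"{label}: {slope:+.4f} +/- {1.96 * se:.4f} per year")
```

On a series like this the short-window interval is typically much wider than the long-window one, which is why a slightly negative 2002-2013 slope on its own says little about whether the underlying trend changed.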

          • Wrong Species says:

            I agree that cherry picking one year is a bad idea. But are you going to look at this picture of the 5 year running average and tell me that there is absolutely no pause?

            http://upload.wikimedia.org/wikipedia/commons/a/ab/Warming_since_1880_yearly.jpg

          • James Picone says:

            Yes. Yes I am. Because I’ve seen the statistics, and because I’m entirely aware that eyeball-mark-1 is terrible at time series analysis.

            (also running averages do weird things to trends at the endpoints. How much data went into the last point on the graph?)
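A small sketch of the endpoint issue, assuming one common convention (a centred window that simply shrinks when it runs off the end of the data). How the linked chart actually handles its endpoints is not stated here, so treat this as an illustration of the general problem only.

```python
def centered_running_mean(values, width=5):
    """Centred running mean; near the ends the window shrinks to what exists.

    Returns a list of (mean, number_of_points_used) pairs.
    """
    half = width // 2
    out = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        window = values[lo:hi]
        out.append((sum(window) / len(window), len(window)))
    return out

# A perfectly steady rise: 0, 1, 2, ..., 9 (made-up numbers).
series = [float(v) for v in range(10)]
for value, (mean, count) in zip(series, centered_running_mean(series)):
    print(f"value={value:.1f}  running mean={mean:.2f}  points used={count}")
```

With a steadily rising input, the final smoothed point is built from only three values and sits below the final raw value, so the smoothed curve appears to flatten at the end even though the underlying series never stops rising.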

          • Wrong Species says:

            @James

            Let’s say that the five-year running average stays at roughly the same place it is now. How long would it have to stay there before you would agree that there is indeed a pause? 10 years? 15? 30?

          • James Picone says:

            If a linear trend between any two points after the 1970s, in any surface temperature dataset (i.e., not satellite), has a 95% uncertainty range (compensating for autocorrelation and also the number of tests being run) that is entirely below the 1970s-to-pause-point trend, I would consider that extremely strong evidence of a change in surface temperature trend. I don’t know what that would be offhand in terms of five-year moving averages, especially as it would depend on the actual temperature value. Probably a lot.

            There are probably other things that could convince me – the scenario I just sketched is very strong evidence. A similar test looking at whether the 1975-onwards trend is different to the 1940-1975 trend doesn’t bite until around 1990 at 2-sigma confidence, probably not until 2000s at higher confidence. For another example, I would consider a changepoint analysis that found a changepoint after the 1970s in a surface-temperature dataset pretty good evidence.

            Another rough one is that the trend 1970s->1998 is ~0.16c/decade. There’s ~0.6 Celsius variability from year to year. So I would expect T(year) > T(year – 25) for every year after 2000.

            Here’s a graph showing trend from year N to present with 95% uncertainty range (autocorrelation-compensated). Notice how large the uncertainty ranges get close to the present? That’s the problem. I’d suggest a test based on a lesser-variation dataset – like one of the Foster&Rahmstorf ones I’ve been mentioning – but I’m not certain how to pin down the properties of the natural-variation-removed dataset with sufficient specificity. You can pretty much assume that any of the three tests I’ve mentioned above applied to something like Foster&Rahmstorf would be evidence for me, given that I had no problems with the underlying variation-removed dataset.

          • Troy says:

            James Picone: May I just say that as someone with little background in climate science, I feel like I have learned a lot from your comments in this and other threads.

        • Sewing-Machine says:

          If for some reason I had to defend the shabby growth mindset papers being criticized by Scott recently, I would do it exactly thus: with an opaque discussion of “compensating for various noise factors” followed by a sarcastic “don’t let actual statistics get in the way!”

          • James Picone says:

            You don’t actually need to try and remove the various short-term effects to demonstrate there’s no statistical evidence of a ‘pause’ though – that’s my point. Any analysis more sophisticated than calculating a linear trend from a recent high point (usually without the error bars that would make it obvious why that is a Bad Idea) just plain says there’s no change in trend.

            I’ll try to be less opaque. There are lots of things that affect how warm a given year is – how much sunlight there was (Total Solar Irradiance), how much vulcanism there’s been for the last few years (even small volcanoes spit out enough particulates to have some influence), and some weird metastable oscillator things like El Nino/La Nina (‘ENSO’), where, broadly speaking, heat is dumped into either the ocean or the atmosphere and the system varies erratically between each.

            All three of those are expected to be trendless over the (human-scale) long-term. Over the actual-long-term, TSI trends upwards but has large cyclic variations that are going to trend downwards over the next few thousand years, and ENSO is a feature of Earth’s current geography and climate that will probably change or move or do something in response to the next glaciation, but that’s not very relevant for projecting global warming.

            CO2, meanwhile, has a seasonal pattern but is pretty consistently going upwards, and the warming trend caused by that doesn’t vary on a year-to-year basis.

            So if you’re trying to determine what the warming trend actually is, trying to remove the effects of TSI, vulcanism, and ENSO from the dataset can make for a better estimate by reducing the year-to-year fluctuations without changing the underlying trend. Foster&Rahmstorf did it by multivariate regression on indices for those three variables, so it’s not ad-hoc.

            (Another relevant factor – the Arctic is interpolated by basically all the temperature datasets, because getting data from there is hard. The Arctic is also warming faster than the rest of the planet, because there is less water vapour there, so the marginal effect of additional CO2 is larger (the same thing happens in Antarctica, IIRC). There’s some evidence that the interpolation is underestimating how much warmer the Arctic is getting, which would reduce the underlying warming trend; Cowtan & Way got a paper out of it.)

        • Scott Alexander says:

          I’m not that skeptical about global warming, but I admit the way the media covered “the pause”, with alternating articles of “there was never a pause” and “we’ve explained the pause” and “the pause is going to stop soon”, was a gigantic disaster area.

          The “starting from a high point in the noise” explanation is one of the better ones I’ve heard, but then other people say there is a real pause but it’s explained by the oceans soaking up CO2 (or heat, I can’t remember), and I think some other people say something else.

          • Anonymous says:

            Sure, but isn’t that pretty much the way the media covers *all* science? (Most issues are less volatile, so those instances get less attention.)

          • James Picone says:

            Heat. The oceans soak up ~50% of our CO2 emissions, and have been roughly since we started emitting, because we’re increasing the partial pressure of CO2 in the atmosphere faster than we’re increasing the partial pressure of CO2 in the ocean (by increasing its temperature).

            El Nino/La Nina can be kind of thought about as shifting heat between the atmosphere and the oceans. During El Nino, heat that was in the oceans gets dumped into the atmosphere. During La Nina, the inverse happens.

            1998 was a particularly warm year because one of the strongest El Ninos ever recorded happened during it. A fair few of the warmer-than-average years have El Nino events in them, because it’s such a big swing (imagine rolling d20 + d8 + n where n is the number of rolls you’ve made – whenever you roll a new highest value, it probably has a high roll on the d20). To an extent, the noise that visually masks the underlying trend is a function of El Nino/La Nina, so heat going into the ocean is part of it.
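The dice analogy above is easy to simulate. A minimal sketch, with the number of rolls and the seed as arbitrary assumptions: roll d20 + d8 + n, note every roll that sets a new record total, and compare the average d20 on those record rolls with the average d20 overall.

```python
import random

def simulate(num_rolls, seed=0):
    """Roll d20 + d8 + n for n = 1..num_rolls.

    Returns (every d20 result, the d20 results on rolls that set a new record).
    """
    rng = random.Random(seed)
    best = float("-inf")
    all_d20 = []
    record_d20 = []
    for n in range(1, num_rolls + 1):
        d20 = rng.randint(1, 20)  # the big noisy swing (the El Nino stand-in)
        d8 = rng.randint(1, 8)    # smaller noise
        total = d20 + d8 + n      # n is the steady upward trend
        all_d20.append(d20)
        if total > best:
            best = total
            record_d20.append(d20)
    return all_d20, record_d20

all_d20, record_d20 = simulate(10_000)
print("mean d20 over all rolls:   ", sum(all_d20) / len(all_d20))
print("mean d20 on record rolls:  ", sum(record_d20) / len(record_d20))
```

The record-setting rolls are drawn disproportionately from high d20 results, even though the trend term n is what keeps pushing the records upward over time.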

            And yes, the media is fucking /awful/ at covering global warming (and probably every science. Also maybe everything else?)

          • Wrong Species says:

            I don’t know enough about global warming to comment on the idea of CO2 being stuck in the oceans. But looking at the running 5-year average, it’s pretty clear the pause is real.

            http://upload.wikimedia.org/wikipedia/commons/a/ab/Warming_since_1880_yearly.jpg

      • “And even those are mostly in agreement that human-generated pollution (largely CO2 and sulfates) are influencing climate”

        I just want to point out the way in which your rhetoric assumes your conclusion. CO2 isn’t automatically “pollution.” Increased CO2 has both good and bad consequences. Labeling it “pollution” and assuming without argument that warming is bad are ways of avoiding the hard arguments over policy.

        “We still leave a cleaner, more sustainable world for our children and our children’s children.”

        I like to point at the cartoon version of this argument as a reason not to trust people arguing for policies against global warming—because they are, as you are, admitting that they are policies they would still be in favor of if they didn’t believe in AGW. Most of us are pretty good at convincing ourselves of things we want to believe.

    • James Picone says:

      It’s also how creationists feel.

      • Alsadius says:

        Less so, because evolution is more of an observational science than an experimental one. But fair. I suspect anti-vaccine types feel the same way too.

        • Randy M says:

          Isn’t “observational science” an oxymoron?

          • Anonymous says:

            No. Experimental astronomy or cosmology would be terrifying.

          • Alsadius says:

            Anonymous’ answer was better than mine, but I’ll just say that the important aspect of science is “Look at the world to find truth”. You can do experiments to help you find it faster, but the root of it is the looking, not the doing.

          • Ilya Shpitser says:

            Can’t learn causality just by looking.

          • Jaskologist says:

            Disagree strongly. The Scientific Method of actually running experiments to test your hypothesis was science’s major breakthrough and distinguishing feature.

            Yes, not all fields lend themselves to this, and they probably shouldn’t be properly classed as “science.” This is a shame, but we already accepted inferior methods of proof when we abandoned mathematical certainty for mere statistically significant empirical backup. What harm will sliding a little further down that slope do?

          • Troy says:

            Can’t learn causality just by looking.

            You can’t learn it at all, if by “learn” you mean “discover with certainty.” If by “learn” you mean “get evidence for it,” then yes, you can get evidence for causality just by looking.

            Yes, not all fields lend themselves to this, and they probably shouldn’t be properly classed as “science.” This is a shame, but we already accepted inferior methods of proof when we abandoned mathematical certainty for mere statistically significant empirical backup. What harm will sliding a little further down that slope do?

            Because the term “science” denotes prestige, and so emphasizing controlled experiments as the pinnacle of science pushes psychologists and medical researchers (say) to use them even when they’re not appropriate or not necessary, in the hopes of being called scientists.

            Don’t misunderstand either of the above two comments: I think experiments are great. But their value is contingent, depending on the circumstances and the object under study, and when they are more valuable than observation, the difference between experiments and observations is one of degree, not of kind. Rather than holding them up as the only way to do Science, I want scientists (especially in the soft sciences) to think in a more nuanced way about when and how to use them.

          • suntzuanime says:

            You can, in fact, learn causality just by looking. Experimentation is basically the art of producing situations where causality is especially clear to look at, that’s all. Sometimes those situations will arise on their own, where they are termed “natural experiments”.

            Eliezer Yudkowsky was right, if you want to understand causality, you’ve gotta read Causality.

        • It's Funny says:

          >I suspect anti-vaccine types also feel the same way too.

          Reminder that the CIA conducted a fake vaccination campaign in Pakistan to harvest DNA.

          • FJ says:

            The CIA used a *real* vaccination program *as cover* for a covert DNA-collection program. They didn’t inject people with saline and say, “Now you’re immune to polio.” That really would undercut the efficacy of vaccines!

          • Anonymous says:

            They didn’t inject saline, but neither did they complete the schedule of 3 doses, which is pretty close. (hep B, not polio)

        • I am more sympathetic to creationists and the like than most people who believe they are wrong, because I think figuring out what is true is harder than most people believe. The world is a complicated place, most of us get most of our beliefs not by looking at evidence and arguments but by believing sources of information we trust, and it isn’t easy to know which sources to trust. One of the implications of a good deal of Scott’s pieces is that the “official authorities,” i.e. academics writing papers, quite often should not be trusted.

          I suggest, as a useful exercise for almost anyone, considering some area of your own expertise, comparing what you believe to be true with the generally accepted view of the subject, and thinking about how easy it would be for someone with only a casual interest in the question to figure out which is correct.

          My standard example is the economics of trade. Most public discussion implicitly assumes a theory, absolute advantage, and associated implications (such as the term “unfavorable balance of trade”), that have been obsolete for two hundred years.

          My guess is that of people who are familiar with the term “comparative advantage,” only a minority actually know what it means—and that the same is true for “the theory of relativity.” It’s too easy to assume that labels are self-explanatory. It’s the same mistake GKC made in assuming that “the theory of evolution” was the theory that organisms gradually change—without, so far as I could tell, any idea of the mechanism that is the central element of the theory.

        • 27chaos says:

          My perception is that creationism is more intellectually rigorous than anti-vaccinationism, but I admit I’ve never read any anti-vaccination studies.

      • suntzuanime says:

        Is it? This feeling is about engaging with the literature and finding it wanting, while I feel like the usual behavior of creationists is to invalidate the literature with impossibility arguments. And the anti-vaxxers have their own literature they engage with positively rather than being driven by flaws in pro-vaccine studies.

        • James Picone says:

          It was the fundamental argument behind Expelled – there’s a conspiracy of scientists keeping creationism out of biological sciences.

          • suntzuanime says:

            Is anybody positing a conspiracy of scientists keeping anti-growth mindsetters out of social sciences? I’m confused.

          • James Picone says:

            An explicit conspiracy is the strong version of ‘everyone in the field agrees and so nobody publishes against some idea’.

          • suntzuanime says:

            I took the original comment to be referring to “we keep looking at the literature and it keeps being deeply flawed” not just “everyone in the field agrees on the basic premise”.

    • Will says:

      The difference here is that for global warming to not be happening, the laws of physics would have to be incorrect, which is a much bigger deal than such-and-such social science experiment being flawed.

      • Sewing-Machine says:

        Some of the laws of physics inform all climate models, but if those models are mistaken it would not cast any doubt on the laws of physics at all.

        • Will says:

          The core of global warming is that CO2 causes warming, and burning more stuff that releases CO2 increases CO2.

          That is the big picture. Everything else is fiddly details. The second of these isn’t really in doubt (or shouldn’t be) and repealing the first would require repealing the laws of physics.

          • Ilya Shpitser says:

            Ok, but “fiddly details” matter a lot for policy implications.

            Compare: in the long run we are all dead due to heat death, the rest is “fiddly details.”

            Do you think physics or social science matters more for understanding the AGW community?

          • James Picone says:

            Lukewarmers argue climate sensitivity – raw CO2 forcing is ~1c temperature increase for a doubling of CO2 content, which isn’t incredibly significant (if that was all we expected, we could get to >1000ppmv CO2 content without getting more than 2c warming over preindustrial temperatures!)

            IPCC’s climate sensitivity range as of AR5 is 1.5-4.5 Celsius for a doubling. Lukewarmers will argue it’s in the low end, which is allowed by the laws of physics (although I think the evidence that had them re-expand the lower end is questionable – it’s almost entirely energy-balance calculations on instrumental records, which are rather sensitive to the chosen starting/ending points).
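The "per doubling" arithmetic above can be worked through in a few lines. This is a sketch under stated assumptions: warming is taken as sensitivity times the number of doublings, and the preindustrial baseline of 280 ppmv is a commonly cited figure that is not stated in this thread. The function names and the 3.0 value are purely illustrative.

```python
import math

PREINDUSTRIAL_PPMV = 280.0  # assumed baseline, not given in the thread

def warming(ppmv, per_doubling, baseline=PREINDUSTRIAL_PPMV):
    """Warming in Celsius if each doubling of CO2 adds `per_doubling` degrees."""
    return per_doubling * math.log2(ppmv / baseline)

def concentration_for(target_warming, per_doubling, baseline=PREINDUSTRIAL_PPMV):
    """CO2 concentration (ppmv) at which `target_warming` is reached."""
    return baseline * 2 ** (target_warming / per_doubling)

# 1.0 is the raw-forcing figure from the comment above, 1.5 the low end it
# quotes, and 3.0 an illustrative mid-range value.
for sensitivity in (1.0, 1.5, 3.0):
    print(
        f"{sensitivity:.1f} C/doubling: "
        f"{warming(1000, sensitivity):.2f} C at 1000 ppmv, "
        f"2 C reached at {concentration_for(2.0, sensitivity):.0f} ppmv"
    )
```

Under these assumptions, 1 C per doubling puts the 2 C mark at two doublings (1120 ppmv), which is consistent with the ">1000ppmv" remark; a higher sensitivity moves that threshold to a much lower concentration.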

          • Jaskologist says:

            The laws of physics don’t dictate that there be no feedback taking CO2 back out of the atmosphere. Suppose, for example, that we had a large number of organisms that eat CO2…

          • Whether or not CO2 causes warming, climate is a complicated system. If it turned out that something else was going on that more than outweighed the effect of CO2, no laws of physics would be contradicted.

            Take a look at the graph of global temperature for the mid-20th century. There was a period of about thirty years when temperature was constant to falling—although CO2 was rising. That demonstrates that there are other factors strong enough to outweigh AGW, at least for a while.

            By your account, that means we have to repeal the laws of physics.

            The reason to be confident of warming is not the laws of physics but the data.

      • Wrong Species says:

        Or just maybe climate modelling is a tad bit more complicated than that.

    • Anonymous says:

      The global warming debate is NOT about SCIENCE — it is about POLICY. The science is being used to build a very shaky scaffolding for policy implementation. The problem is with the interpretation of and response to the science.

      e.g. pollution is bad. global warming occurs and is caused by some pollution. therefore, global warming is casus belli for anti-pollution regulation

      There’s a great deal of money in global warming — tax revenue, carbon offsets, handouts for “green” energy, regulatory capture, …

      Those who say “Of course there’s global warming! It’s science!” miss the point by about four miles.

  24. jseliger says:

    I’ll be the first to admit I don’t really understand it,

    Fortunately, no one understands deconstruction: that’s why it’s such a handy way to write papers and books to get tenure!

    In other words, not understanding deconstruction is a feature, not a bug.

    • Irrelevant says:

      I’d argue people do understand deconstruction, and that the mistake comes from miscategorizing it and thinking there’s more to it than there is. Just as “postmodernism” will be misunderstood by anyone who fails to place it within rather than apart from modernism, deconstruction is misunderstood if you don’t consider it contained by structuralism.

      This sort of misunderstanding is, of course, convenient if you want to found your own discipline rather than end up a footnote.

      • Peter says:

        Sometimes I’ve been known to write “(post)structuralism” to mean “either poststructuralism or structuralism or both; the features that are relevant (usually: that I’d like to complain about) are ones that poststructuralism has inherited from structuralism”.

      • Anonymous says:

        What does categorization have to do with it? The supposed “claims” and “methods” of deconstruction make no sense, period. Why think hard about its historical context when this is the case?

        • Irrelevant says:

          What does categorization have to do with it? Why think hard about its historical context?

          NOW YOU’RE GETTING IT! 😀

      • Anonymous says:

        >“postmodernism” will be misunderstood by anyone who fails to place it within rather than apart from modernism

        waaaat

        It’s misunderstood if people fail to place it within modernism? Rather than misunderstood if people fail to place it apart from modernism???

        • Protagoras says:

          Sounds right to me. Like most movements that are reactions to something else, there’s a lot of over-reaction and attempts to emphasize and exaggerate differences, as well as a lot of shared assumptions that just aren’t mentioned (because they’re shared, so why talk about them? The differences are surely what matter!) So post-modernism looks a lot crazier if you look at it in isolation, and don’t realize the extent and the ways it is building on modernism.

        • Anonymous says:

          Never mind my comment, I had trouble parsing the sentence but understand it now.

        • Irrelevant says:

          i.e. postmodernism = late modernism.

          Not sure what actually happens after modernism, we’re too close to the subject. If the people who argue culture has accelerated to the point of singularity and we’ll no longer have well-defined artistic movements are right, then maybe the term postmodernism will be salvaged to describe that, but I think they’re wrong.

  25. Kiya says:

    What if the control intervention actively lowered the at-risk-of-dropping-out kids’ grades? I could see taking “Different parts of the brain do different things” as a message that “your parts of the brain responsible for being good at school obviously suck, maybe you should focus on other stuff that you are good at.” Then the “control” intervention isn’t actually a no-op, and everything else does nothing.

    lol post-hoc hypotheses

    • Irrelevant says:

      Then you’re in the awkward position of explaining why 2 hours of technobabble total has an effect versus the 30 hours a week of instruction.

    • weareastrangemonkey says:

      There should have been a control group where students didn’t have any treatment i.e. no sessions.

  26. shemtealeaf says:

    I’m a bit confused as to why we’re just ignoring the ‘sense of purpose’ intervention, and only focusing on the growth mindset intervention. If we take the stated purpose of the study at face value, they’re interested in the efficacy of mindset interventions in general.

    While I have some trouble believing that a 45 minute intervention has that much of an effect, I’m not really seeing any huge flaws in this study. They did a decent job of providing evidence in favor of the idea that mindset interventions have some positive effect on the performance of students who are at risk for dropping out.

    I don’t think it’s quite fair to bash this as a bad growth mindset study when it actually does a decent job of supporting its actual thesis. If we actually used Scott’s proposed title of “Growth Mindset Intervention Totally Fails To Affect GPA In Any Way”, we’d be ignoring half the data.

    Also, I think it’s pretty harsh to accuse this study of the ‘Elderly Hispanic Woman Effect’. They broke their results down into two groups based on totally relevant criteria. It’s completely reasonable to think that the effect of these interventions might be systematically different in the most at-risk students.

    • Scott Alexander says:

      First of all, thanks a lot for this comment. The glowing uncritical praise from everyone else was starting to make me feel weird.

      To answer your question, I personally am interested in growth mindset and have written about it before here and here, so I focused on the result relevant to my interests and to previous discussions.

      If someone wants to push sense-of-purpose-mindset, they should expand it into a gigantic multi-million dollar industry that influences education around the world, and then I’ll look into it more closely.

      • shemtealeaf says:

        Scott,

        I understand and share your interest in growth mindset. I share your suspicion that growth mindset is not all it’s purported to be. I also agree that this study does not provide much evidence in favor of growth mindset.

        However, I feel like you’re using that to suggest that the study is being disingenuous or misleading, and that’s where I disagree. If I take the claims in this study at face value, it doesn’t make me think that there’s tremendous evidence in favor of growth mindset. It makes me think that these 45 minute interventions seem to have a surprisingly large effect, and we should be investigating exactly which type of interventions have the biggest effect.

        If this were just a growth mindset study, I would agree that it’s a bad one. I just think that viewing it through that lens ignores their most interesting conclusion, which is the one that they state right up front in the title and the abstract.

        • Scott Alexander says:

          The study is called “Mindset Interventions Are A Scalable Treatment For Academic Underachievement”. If one of the two mindset interventions they tried, which was the only one that had “mindset” in the name, wasn’t a scalable treatment for academic underachievement, I feel like they’re making dubious claims. I also think a bunch of people will use this paper as evidence that growth mindset works, and that the paper was geared to support doing that, so if it doesn’t work there’s something dishonest about that.

          • shemtealeaf says:

            You might be right that the paper is geared to support growth mindset, but I don’t think I would have really noticed that without being primed to think about the study as a potential piece of evidence in favor of growth mindset.

            If it were me, I might have titled the study “SOME Mindset Interventions Are A Scalable Treatment For Academic Underachievement”, but I don’t have a huge issue with their omission of the qualifier ‘some’. Even if they had tested five different interventions and only one of them worked, I still think that would be decent evidence in favor of the general concept of mindset interventions.

            Of course, I’m giving them the benefit of the doubt and hoping that the next study from these researchers investigates the potential benefits of the ‘sense of purpose’ intervention. If we just get more growth mindset research, then your suspicion will be largely vindicated.

          • Anonymous says:

            One of the interventions has mindset in the name. The other doesn’t.

    • Did I miss it, or did they explain how they identified the most at-risk students? Who are the students who are at-risk of dropping out? Normally, unless they say otherwise, this means the Black kids, regardless of their academic success or family education levels and the kids who are thought to be poor.

  27. Christopher Mullins says:

    “Old black men…nothing. Middle-aged Asian transgender people…nothing. Newborn Australian aboriginal butch lesbians…nothing. Elderly Hispanic women…p = 0.049…aha!”

    Really like your essay on the elderly Hispanic women effect. Is this the same thing as “data dredging”? [1]

    [1] http://en.wikipedia.org/wiki/Data_dredging
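
    For a rough sense of why hunting through subgroups eventually turns up a p = 0.049, here is a minimal sketch. It is not from the post or the paper: the function name `familywise_error_rate` is made up for illustration, and it assumes every subgroup test is independent and that no real effect exists in any of them.

```python
def familywise_error_rate(num_tests: int, alpha: float = 0.05) -> float:
    """Chance of at least one p < alpha across num_tests independent
    tests when every null hypothesis is actually true."""
    return 1.0 - (1.0 - alpha) ** num_tests


if __name__ == "__main__":
    # Hypothetical counts of subgroups someone might check.
    for k in (1, 4, 20):
        print(k, round(familywise_error_rate(k), 3))
```

    Under those assumptions, checking 20 unrelated subgroups gives a better-than-even chance of at least one "significant" result by luck alone.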

  28. Joe says:

    Why would the intervention automatically show up in the academic results?

    Convince a disinterested schoolkid that they can do anything if they put their minds to it and they are just as likely to not come to school tomorrow but instead travel the world or something, no?

    Or is the claim that growth mindset can be whittled down to just academic achievement through the power of words?

    That cannot be right, either way. Either growth mindset is real and will transcend the school environment, or growth mindset can be limited deliberately to a particular environment just because an academic wants it to be that way for a study, in which case growth mindset is bullshit on its own terms.

    • Steve Sailer says:

      Or maybe “growth mindset” propagandizing works for interactions of certain students and certain teachers and certain school cultures etc.

      Think about the careers of football coaches. A big part of their job is motivational speaking. Some have great success everywhere, some succeed at one level and then fail at the next level. Some have success initially but then their players get bored and burned out with Coach’s mind tricks. This isn’t the General Theory of Relativity that is valid forever, this is more like entertainment, and entertainment wears off.

      A lot of social scientists are trying to get in on the big money in the marketing business with their “priming” studies. Marketers love “priming” because that’s what they do and they’re hoping that if some Scientists study it, they’ll discover unchanging laws of marketing and make their jobs easier. But customers get bored, so the struggle to come up with effective marketing goes on forever.

      • Joe says:

        The thing that strikes me as I read your “football coaches” explanation is this –

        Just like football, the academic world is a competition. If you could design an intervention that would let all current students pass the GPA, then you’d have to redesign the GPA so that mostly, they couldn’t.

        That’s because the purpose of the GPA (like most academic exams) isn’t to let kids do things, it’s to block them from doing things.

        In football, it’s obvious – the process is win/lose. In academia, they have to pretend there are values other than “you cannot come in based on arbitrary criteria”

        Growth mindset is a blame the victim game. “Oh, I’m sorry little johnny, this plywood door to the land of opportunity is closed to you, but if you believe real hard it will open. Honest.”

        “No doors opened for you? I guess you just think like a loser then. That’s your own fault, don’t even think about redistribution or fairness. p.s. my toilet needs cleaning”

        Looks to me like just another “just world” fallacy with the twist that it’s for the losers, not the winners in life. How American!

        • GPA isn’t an exam. It’s short for “grade point average,” and represents the average of grades in some earlier stage of schooling—most often high school.

          • Joe says:

            Let’s add the word “system” to wherever I wrote GPA. The point will then make sense, hopefully……:)

            Sidenote – beer. 😀

  29. Albatross says:

    GPA?

    I’m very suspicious of GPA in any study. If a student has a high GPA they have nowhere to go but down. Even if growth mindset turned one student into Spiderman a 4.0 is a 4.0. And if growth mindset takes a student who got 5 out of 100 on a test to 45 out of 100, the student still fails the class.

    And yes, expect GPAs to fall. In math you start with a review of prior year’s lessons and gradually introduce more complicated concepts. In English you write about what you did over the summer like you do every year and tackle progressively more difficult assignments. GPA also falls over the years, 3rd graders have better GPAs than 12th graders… and that isn’t even wrong since standardized test scores also fall. The material gets harder.

    Even worse, teachers might challenge high achieving students or pity at risk students and grade accordingly.

    I expect GPA to relate to growth mindset roughly the same as height: not at all.

    • Irrelevant says:

      I expect GPA to relate to growth mindset roughly the same as height.

      But you should expect GPA to be related to height, since IQ is related to height.

    • Scott Alexander says:

      Would you expect the study’s actual finding – percent of at-risk students who passed core classes – to relate to growth mindset?

        Keep in mind, the “at-risk” students may be high achieving well behaved children of heart surgeons. All it takes to be labeled at risk is to be Black. Did they define at risk? If not, this is what they mean. Black, or I think they are poor. We always ask why they think students are poor, now that the federal laws on sharing lunch status data are being enforced in most school districts. (They always had laws, but never enforced them because so many academic programs were designed for poor kids, and they used the “free lunch” list to target students.) Educators are now guessing who is poor. This makes it worse for minorities than when the laws for data sharing were not enforced. They tell us they use race and things they see, like the bus the kid rides, the mother’s pocketbook quality, etc.

        • Anonymous says:

          Yes, they did define “at risk,” and, no, it doesn’t mean black or poor.

          But of course you just keep saying exactly the same things about this study as you said about everything else, even though you claim to have read this study.

  30. Alex Mennen says:

    > Why would the control group’s GPA suddenly decline?

    This is really weird; the opposite is supposed to happen. http://en.wikipedia.org/wiki/Regression_toward_the_mean
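
    A toy simulation makes the expectation concrete. This is only a sketch under invented assumptions: the ability and noise distributions are made up, nothing here uses the study’s data, and GPAs are not clipped to a real grading scale. Students are picked for a low fall GPA, and the same students are then measured again in spring with no intervention at all.

```python
import random
import statistics


def simulate_regression_to_mean(num_students: int = 10_000, seed: int = 0):
    """Return (fall_mean, spring_mean) for the bottom third by fall GPA.

    Each GPA is a stable 'ability' plus independent semester noise,
    so nothing changes between semesters except luck.
    """
    rng = random.Random(seed)
    ability = [rng.gauss(2.8, 0.5) for _ in range(num_students)]
    fall = [a + rng.gauss(0.0, 0.4) for a in ability]
    spring = [a + rng.gauss(0.0, 0.4) for a in ability]

    # Select the "at-risk" group purely on the noisy fall measurement.
    cutoff = sorted(fall)[num_students // 3]
    selected = [i for i in range(num_students) if fall[i] <= cutoff]

    fall_mean = statistics.mean(fall[i] for i in selected)
    spring_mean = statistics.mean(spring[i] for i in selected)
    return fall_mean, spring_mean


if __name__ == "__main__":
    fall_mean, spring_mean = simulate_regression_to_mean()
    print(f"fall: {fall_mean:.2f}  spring: {spring_mean:.2f}")
```

    In this toy setup the selected group’s spring average comes out above its fall average, which is why a declining control group looks odd.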

  31. Mary says:

    Hmmm. . . was thinking. . . one thing I recommend to aspiring writers is reading lots and lots of primary sources and (because sources are limited) reading between the lines to try to figure out the opposing side.

    For instance, early works on courtly love insist that it must be outside marriage; marriage ends courtly love. But they insist so at great length and vehemently, with many arguments. Which shows it was not generally accepted.

    However, that’s a far stretch from deconstruction.

  32. Emily says:

    Want to wade through another highly politicized research subfield? Could be fun? http://www.theatlantic.com/politics/archive/2015/02/using-pseudoscience-to-undermine-same-sex-parents/385604/

  33. Corey says:

    ‘Because our primary research question concerned the efficacy of academic mindset interventions in general when delivered via online modules, we then collapsed the intervention conditions into a single intervention dummy code (0 = control, 1 = intervention).’

    As a practicing statistician, I license this analysis choice as reasonable given the bar plot of GPA residuals of at-risk students. What I see when I look at that plot is three interventions that have about the same effect size, and that effect size is statistically discernable vs the control treatment, and possibly vs zero.

    Is this a cherry-picking post hoc judgment? Nope. My default model for studies like this is a hierarchical model. This kind of model adaptively interpolates between the no-pooling model Scott Alexander is implicitly assuming and the complete-pooling model Dweck et al. decided to use for the pass-fail analysis. What I’m doing is making an educated guess as to where the hierarchical model would fall on that scale; my guess is that it would be pretty close to complete-pooling.
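
    To illustrate what “interpolates between no-pooling and complete-pooling” means, here is a minimal sketch of normal-normal shrinkage with a known between-group standard deviation. The function `partial_pool`, the parameter `tau`, and the example numbers are hypothetical, chosen only for illustration; this is not the analysis in the paper, and a real hierarchical model would estimate `tau` from the data rather than take it as given.

```python
def partial_pool(estimates, std_errors, tau):
    """Shrink each group's estimate toward the precision-weighted grand mean.

    tau is the assumed between-group standard deviation:
    tau = 0 gives complete pooling, a very large tau gives no pooling.
    """
    weights = [1.0 / (se ** 2 + tau ** 2) for se in std_errors]
    grand_mean = sum(w * e for w, e in zip(weights, estimates)) / sum(weights)

    pooled = []
    for est, se in zip(estimates, std_errors):
        # Fraction of the group's own deviation that is kept.
        keep = tau ** 2 / (tau ** 2 + se ** 2)
        pooled.append(grand_mean + keep * (est - grand_mean))
    return pooled


if __name__ == "__main__":
    # Hypothetical effect estimates for three intervention arms.
    effects = [0.10, 0.12, 0.14]
    errors = [0.05, 0.05, 0.05]
    for tau in (0.0, 0.02, 1.0):
        print(tau, [round(x, 3) for x in partial_pool(effects, errors, tau)])
```

    When the arms’ estimates sit close together relative to their standard errors, a fitted `tau` tends to be small, so the result lands near the complete-pooling end.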

    • Douglas Knight says:

      The passage you quoted contains the word “because.” They gave their putative decision process and it is not the one you endorse.

      • Corey says:

        Yup: they don’t know what they’re doing — their justification is bogus. The procedure they arrive at and the result it shows remain reasonable.

  34. Of course, a “pox” IS a growth.

  35. I know this post is about deconstructing the research, but I am going to put it in context for you. This is education/social science research. You folks here seem to not live in this social science world. This world, especially education, is never rigorous with science. They value credentials, not rigor of the study. So, how did it get published? It got published in a social science journal. That is how.

    For some context to understand that the study is probably more meaningless than you realize… First of all, what does at-risk mean? It usually means the students are minority (but not Asian or Indian), or they ride a bus that goes near a low-income neighborhood. I do educational program evaluations for a living. My clients often want me to evaluate the success of an intervention for “at-risk” students. When I ask them how they targeted students for interventions, it generally boils down to they guessed who was at risk by their race or something that made them think they might be poor. (How is that for scientific?)

    In addition to having no quantifiable definition for “at risk,” course grades are completely subjective. I never use them in program evaluations to measure student learning. Some teachers grade on effort, some on behavior, some on demonstrating what you’ve learned. Some kids are given below grade level work and then graded as if it is on grade level. In subjects where tracking exists, like math and language arts, an A in one track is equivalent to a D in another track as far as content mastered. So, grades are meaningless. I realize in this study, they measured the GPA of the rest of the semester, so there should be some consistency even though completely meaningless.

    I also have a problem with “at risk of dropping out.” I did a study for a large school district on who is “at risk of dropping out.” They had believed that students who were members of subgroups with the highest dropout rates were “at risk of dropping out.” So, a straight A Hispanic girl whose father is an MD is targeted for dropout interventions, while an illiterate white male is not. This is how they understand “at risk of dropping out.” When I see educators use the term “at risk of dropping out,” I assume they mean “member of a subgroup that I believe has a higher rate of dropping out than high income white students.” This is usually what they mean.

    Our K-12 education system is structured on the fixed mindset. Teachers believe that some kids are better learners and they are selected at about 3rd grade, and then tracked into a more enriched rigorous curriculum. We don’t need to change the kids, we need to change the teachers’ beliefs and the system structure. Many kids already do believe they are very capable, but they get no opportunities from the schools because the school believes they are “at risk of dropping out,” and generally at risk of all things bad.

    And, education goes through fads where experts make a name and people make lots of money. This is like a fad in education. They treat it as if it is a temporary thing that will pass, but right now they are all getting their mindset training and some people are making lots of money from it.

    The world of education is very difficult to understand if you are not in it. It is so different from the world outside education.

  36. Oh my. I just read the paper. It doesn’t matter that it makes no sense and the research is questionable. You folks are not the audience. The audience is educators. Here is how education works:
    1. Their online 45 min course will now be considered “research-based” because they published this paper.
    2. What Works Clearinghouse will review this study and say it does not meet standards for being done correctly. But that is how nearly all education research of products is classified and it doesn’t keep anyone from using it.
    3. The authors of this paper will start selling their online course to school districts.
    4. School districts and non-profits that are trying to get federal grant money for dropout prevention, or getting more minorities in the STEM pipeline need a research-based intervention to provide. They’ll write the purchase of this online course into their grant proposals. This raised the math grades of 7th graders, so it is STEM related.

    That federal grant money pays a lot more bills than paying for the intervention. It pays overhead, salaries, purchase of other things, like computers to take the course on, etc. So, they don’t really care if the program works. They just care if they can get a grant by using it.

    What I noticed is that they do not define in a quantitative way what they mean by “at risk of dropping out.” We don’t know anything about the grading policies of the schools in the study. They could be in North Carolina where Mindset training is all the rage right now. I know in NC, the State Board of Ed changed the grading system this year, mid year, to be a 10 point scale instead of a 7 point scale. Also, some schools we work with in the same school systems have “no zero” policies, and some do not. That means some kids get 0 for missing work and others do not. Generally, our high income schools have “no 0” policies because the parents demand it and the schools are afraid of high income parents.

    As is true for all other interventions like this, it can’t have any impact on school success if low-income and minority students have no access to rigorous courses. Here in NC, we have adopted Common Core, which is supposed to be the rigorous curriculum for all students. We cannot deliver that. So, some schools take what was Algebra, Geo, and Alg II and offer them as two-courses each, watered down and remedial. They put kids who have no advocates into these courses because we don’t have teachers for giving all kids rigor. So, they will put these “at-risk” kids (meaning kids with no advocates who we can screw over and get away with it) into the Mindset online courses, to make it look like they are trying to help them. Then they will continue to enroll them into remedial courses no matter what.

    Another caution I have is about their use of “residuals.” We have worked with school systems that have invented their own statistical methods for computing residuals, that included adjusting predictions for different subgroups. This resulted in positive residuals for low-income students who lost ground. They explained this by saying that low-income students are expected to do poorly, so when they are not, they adjust their expected scores down to where they think they would be. Then if they lose ground, and are still ahead of their computed expectation, the residual is positive. They make adjustments based on the number of poor kids in the school and the individual’s poverty level.
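
    The arithmetic behind that can be sketched in a few lines. All numbers and names below are hypothetical, made up only to show how lowering the expected score for a subgroup can flip the sign of a residual for a student who lost ground.

```python
def residual(actual: float, expected: float) -> float:
    """Residual as usually defined: actual score minus expected score."""
    return actual - expected


# Hypothetical student who lost ground between two test administrations.
prior_score = 70.0
current_score = 65.0

# Hypothetical downward adjustment applied to the expectation for a subgroup.
subgroup_adjustment = -10.0

unadjusted = residual(current_score, prior_score)
adjusted = residual(current_score, prior_score + subgroup_adjustment)

print(unadjusted)  # -5.0: the student did worse than before
print(adjusted)    # 5.0: the same drop now counts as beating expectations
```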

    But, as I said, this study is about now selling the online intervention as a dropout prevention program, and a STEM promoting program for poor kids.

    • Anonymous says:

      What I noticed is that they do not define in a quantitative way what they mean by “at risk of dropping out.”

      Yes, they did. They defined it as a fall GPA of 2.0 or a failed core class.

    Reading your deconstructing comments makes me think about some of the programs we have evaluated, because we tend to deconstruct as we consider how to report on them. One that comes to mind is one that my writer considers the winner when we deconstruct. She has to write our reports and we never lie in our reports. Here is what she had to work with on this project:

    A university got a big federal grant based on these premises:
    1. Children of military are not successful in school because they feel “disconnected.” (No evidence was provided that children of military were doing poorly in school or that they felt disconnected.)
    2. So, to change their feelings of disconnectedness, the grant would fund after-school programs and academic support in school for children who go to schools in areas where we have military bases.
    3. The programs would use a research-based curriculum designed for middle school kids to make them feel “more connected.” It came with pre- and post surveys to measure “feelings of connectedness.” These surveys included questions about your connectedness to God and Jesus. (These are public schools.)
    4. They were also going to use an online program from about the 1950s (I know we had no online robots then, but as early as we did) that taught kids to move a robot by typing in commands. They were going to give deployed parents computers so that they could do these online robot-moving lessons with their kids.

    When they hired us to evaluate the effectiveness of this, they had already given the pre-survey for connectedness. But they forgot to ask the kids if their parents were in the military. They couldn’t find any middle schools that wanted the program, so they were in an elementary school. When we went to interview staff, they told us that the curriculum for improving connectedness was so bad that they made up their own activities to do with the kids. They had never given the parents computers while deployed, because they had no relationship with the military base. So, kids were just commanding robots to move without input from their parents.

    We were to use the participants whose parents were not in the military as a control on the “connectedness” scale. So, they re-administered the pre-survey at the end of the program, a few days before the post-survey, and asked the kids if their parents were in the military. Fewer than half of the kids served had parents in the military. We omitted the religion questions because I am not going to do that and have my name on the report. We didn’t ask. We figured no one would notice. And they didn’t. The children of military had higher feelings of connectedness than the other kids, on this invalid measure of who knows what. No one had any information about whether any of the kids were successful in school or not, which was the main objective of this grant.

    I can see that in the near future, I will be hired to evaluate programs where “at risk” kids are being given a 45 min online lesson that will help them have a growth mindset and now they are less likely to drop out, and more likely to take advanced math. I see this becoming very popular. Right now, only adult educators are being trained. It looks like they’ve developed a product to sell on a per-pupil price, and federal grants will pay for it because it is for “at risk” kids.

    There is probably a research paper on the program I described. The university was doing research on this project. And it probably got published in a peer reviewed education journal. If so, they wouldn’t tell all these things that went wrong, like pre-survey given at the end, they didn’t use the curriculum, and they served the wrong age kids. Not to mention they never knew anything about participants’ success in school.

    • Yow. That is pretty awful.

      I feel bad for N.C., but I realize it’s probably not much better anywhere else in the U.S. And how-to-educate-kids is not an easy problem.

      I really appreciate seeing comments from someone like yourself who is actually out there, dealing with these issues and this crappy research in the real world.

  38. C ariss says:

    Reading that was a waste of time.

    • Aaron Brown says:

      The comments policy is that comments should be at least two of true, kind, and necessary. Your comment might be true in the sense that you didn’t like the post, but it’s not kind, and it can’t be necessary considering you didn’t even say why you thought the post was a waste of time.

  39. MW says:

    Despite the sense to prove that the growth mindset is invalid, I think this is a good opportunity to dive deeper into the lives of these students. In order to actualize in the growth mindset, one must rise above the duality of good/bad, aka the fixed mindset. This sense of good/bad starts when we are young, being told we are good and bad based on behaviors that inconvenience our parents/caregivers (i.e. spilled milk, etc). How many of us had healthy experiences where we spilled milk and our parents took a step back, showed us how to clean it up and allowed us to practice on how to pour out of the same container, in a tone that was not angry or resentful? Now I’m curious on how deep rooted the fixed mindset is with the students, seems to me the deeper the fixed, less likelihood of understanding of growth.