<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Slate Star Codex &#187; science</title>
	<atom:link href="http://slatestarcodex.com/tag/science/feed/" rel="self" type="application/rss+xml" />
	<link>http://slatestarcodex.com</link>
	<description>In a mad world, all blogging is psychiatry blogging</description>
	<lastBuildDate>Fri, 24 Jul 2015 02:59:17 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=4.2.3</generator>
	<item>
		<title>Trouble Walking Down The Hallway</title>
		<link>http://slatestarcodex.com/2015/04/15/trouble-walking-down-the-hallway/</link>
		<comments>http://slatestarcodex.com/2015/04/15/trouble-walking-down-the-hallway/#comments</comments>
		<pubDate>Thu, 16 Apr 2015 03:43:40 +0000</pubDate>
		<dc:creator><![CDATA[Scott Alexander]]></dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[psychology]]></category>
		<category><![CDATA[race/gender/etc]]></category>
		<category><![CDATA[science]]></category>

		<guid isPermaLink="false">http://slatestarcodex.com/?p=3613</guid>
		<description><![CDATA[Williams and Ceci just released National Hiring Experiments Reveal 2:1 Faculty Preference For Women On STEM Tenure Track, showing a strong bias in favor of women in STEM hiring. I&#8217;ve previously argued something like this was probably the case, so &#8230; <a href="http://slatestarcodex.com/2015/04/15/trouble-walking-down-the-hallway/">Continue reading <span class="pjgm-metanav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<p>Williams and Ceci just released <A HREF="http://www.pnas.org/content/early/2015/04/08/1418878112.long">National Hiring Experiments Reveal 2:1 Faculty Preference For Women On STEM Tenure Track</A>, showing a strong bias in favor of women in STEM hiring. I&#8217;ve previously argued something like this was probably the case, so I should be feeling pretty vindicated.</p>
<p>But a while ago I wrote <A HREF="http://slatestarcodex.com/2014/12/12/beware-the-man-of-one-study/">Beware The Man Of One Study</A>, in which I wrote that there is such a variety of studies finding such a variety of contradictory things that anybody can isolate one of them, hold it up as <i>the</i> answer, and then claim that their side is right and the other side are &#8216;science denialists&#8217;. The only way to be sure you&#8217;re getting anything close to the truth is to examine the literature of an entire field as a gestalt.</p>
<p>And here&#8217;s something no one ever said: &#8220;Man, I&#8217;m so glad I examined the literature of that entire field as a gestalt, things make much more sense now.&#8221;</p>
<p>Two years ago Moss-Racusin et al released <A HREF="http://www.pnas.org/content/109/41/16474.full">Science Faculty&#8217;s Subtle Gender Biases Favor Male Students</A>, showing a strong bias in favor of men in STEM hiring. The methodology was broadly similar to that of the current study, but it returned the opposite result.</p>
<p>Now everyone gets to cite whichever study accords with their pre-existing beliefs. So <i>Scientific American</i> writes <A HREF="http://blogs.scientificamerican.com/unofficial-prognosis/2012/09/23/study-shows-gender-bias-in-science-is-real-heres-why-it-matters/">Study Shows Gender Bias In Science Is Real</A>, and any doubt has been deemed unacceptable by blog posts like <A HREF="http://feministing.com/2015/01/09/breaking-some-dudes-on-the-internet-refuse-to-believe-sexism-is-a-thing/">Breaking: Some Dudes On The Internet Refuse To Believe Sexism Is A Thing</A>. But the new study, for its part, is already producing headlines like <A HREF="http://www.cnn.com/2015/04/13/opinions/williams-ceci-women-in-science/">The Myth About Women In Science</A> and <A HREF="http://motls.blogspot.com/2015/04/cornell-study-in-pnas-women-stem.html">blog posts</A> saying that it is &#8220;enough for everyone who is reasonable to agree that the feminists are spectacular liars and/or unhinged cranks&#8221;.</p>
<p>So probably we&#8217;re going to have to do that @#$%ing gestalt thing.</p>
<p>Why <i>did</i> these two similar studies get such different results? Williams and Ceci do something wonderful that I&#8217;ve never seen anyone else do before &#8211; they include in their study a supplement admitting that past research has contradicted theirs and speculating about why that might be:</p>
<p><b>1.</b> W&#038;C investigate hiring tenure-track faculty; MR&#038;a investigate hiring a &#8220;lab manager&#8221;. This is a big difference, but as far as I can tell, W&#038;C don&#8217;t give a good explanation for why there should be a pro-male bias for lab managers but a pro-female bias for faculty. The best explanation I can think of is that there have been a lot of recent anti-discrimination campaigns focusing on the shortage of female faculty, so that particular decision might activate a cultural script where people think &#8220;Oh, this is one of those things that those feminists are always going on about, I should make sure to be nice to women here,&#8221; in a way that just hiring a lab manager doesn&#8217;t. </p>
<p>Likewise, hiring a professor is an important and symbolic step that&#8230;probably doesn&#8217;t matter super-much to other professors. Hiring a lab manager is a step without any symbolism at all, but professors often work with them on a daily basis and depend on their competency. That might make the first decision Far Mode and the second Near Mode. Think of the Obama Effect &#8211; mildly prejudiced people who might be wary at the thought of having a black roommate were very happy to elect a black President and bask in a symbolic display of tolerance that made no difference whatsoever to their everyday lives.</p>
<p>Or it could be something simpler. Maybe lab work, which is very dirty and hands-on, feels more &#8220;male&#8221; to people, and professorial work, which is about interacting with people and being well-educated, feels more &#8220;female&#8221;. In any case, W&#038;C say their study is more relevant, because almost nobody in academic science gets their start as a lab manager (they polled 83 scientists and found only one who had). </p>
<p><b>2.</b> Both W&#038;C and MR&#038;a ensured that the male and female resumes in their study were equally good. But W&#038;C made them all excellent, and MR&#038;a made them all so-so. Once again, it&#8217;s not really clear why this should change the direction of bias. But here&#8217;s a hare-brained theory: suppose you hire using the following algorithm: it&#8217;s very important that you hire someone at least marginally competent. And it&#8217;s <i>somewhat</i> important that you hire a woman so you look virtuous. But you secretly believe that men are more competent than women. So given two so-so resumes, you&#8217;ll hire the man to make sure you get someone competent enough to work with. But given two excellent resumes, you know neither candidate will accidentally program the cyclotron to explode, so you pick the woman and feel good about yourself.</p>
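<p>If it helps, here&#8217;s that hare-brained algorithm as a toy model &#8211; a minimal sketch in which every threshold and number is invented purely for illustration:</p>
<pre>
# Toy model of the hare-brained hiring algorithm: insist on a competence
# floor first, and prefer the woman only once both candidates clear it.
# The floor and the size of the secret bias are invented numbers.

def choose(man_quality, woman_quality, floor=0.5, secret_bias=0.1):
    perceived_woman = woman_quality - secret_bias  # secretly discount her
    if perceived_woman >= floor:
        return "woman"   # both clearly competent -> pick her, feel virtuous
    if man_quality >= floor:
        return "man"     # only he seems "safely" above the bar
    return "neither"

print(choose(0.55, 0.55))  # so-so resumes -> 'man'
print(choose(0.90, 0.90))  # excellent resumes -> 'woman'
</pre>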
<p>And here are some other possibilities that they didn&#8217;t include in their supplement, but which might also have made a difference.</p>
<p><b>3.</b> W&#038;C asked &#8220;which candidate would you hire?&#8221;. MR&#038;a said &#8220;rate each candidate on the following metrics&#8221; (including hireability). Does this make a difference? I could <i>sort of</i> see someone who believed in affirmative action saying something like &#8220;the man is more hireable, but I would prefer to hire the woman&#8221;. Other contexts prove that even small differences in the phrasing of a question can lead to major incongruities. For example, <A HREF="http://www.cbsnews.com/news/support-for-gays-in-the-military-depends-on-the-question/">as of 2010</A>, only 34% of people polled strongly supported letting homosexuals serve in the military, but half again as many &#8211; a full 51% &#8211; expressed that level of support for letting &#8220;gays and lesbians&#8221; serve in the military. Ever since reading that I&#8217;ve worried about how many important decisions are being made by the 17% of people who support gays and lesbians but not homosexuals.</p>
<p><center><IMG SRC="http://slatestarcodex.com/blog_images/protest_sign.png"></p>
<p><i>For all we know maybe this is the guy in charge of hiring for STEM faculty positions</i></center></p>
<p><b>4.</b> Williams and Ceci asked participants to choose between &#8220;Dr. X&#8221; (who was described using the pronouns &#8220;he&#8221; and &#8220;him&#8221;) and &#8220;Dr. Y&#8221; (who was described using the pronouns &#8220;she&#8221; and &#8220;her&#8221;). Moss-Racusin et al asked participants to choose between &#8220;John&#8221; and &#8220;Jennifer&#8221;. They said they checked to make sure that the names were rated equal for &#8220;likeability&#8221; (whatever that means), but what if there are other important characteristics that likeability doesn&#8217;t capture? We know that names have big effects on our preconceptions of people. For example, <A HREF="http://qz.com/81807/the-shorter-your-first-name-the-bigger-the-paycheck/">people with short first names earn more money</A> &#8211; each additional letter costs an average of $3600. If we trust this study (which may not be wise), then John, at four letters to Jennifer&#8217;s eight, already has a 4 &#215; $3600 = $14,400 advantage on Jennifer, which goes a lot of the way to explaining why the participants offered John higher pay without bringing gender into it at all!</p>
<p>Likewise, independently of a person&#8217;s gender they are more likely to succeed in a traditionally male field if they <A HREF="http://www.abajournal.com/files/NamesNLaw.pdf">have a male-sounding name</A>. That means that one of the&#8230;call it a &#8220;prime&#8221; that activates sexism&#8230;might have been missed by comparing Dr. X to Dr. Y, but captured by pitting the masculine-sounding John against the feminine-sounding Jennifer. We can&#8217;t claim that W&#038;C&#8217;s subjects were rendered gender-blind by the lack of gender-coded names &#8211; they noticed the female candidates enough to pick them twice as often as the men &#8211; but it might be that gendered pronouns alone activated the idea of gender from a different direction than gender-coded names would have.</p>
<p><b>5.</b> Commenter Lee <A HREF="http://slatestarcodex.com/2015/04/15/trouble-walking-down-the-hallway/#comment-197878">points out that</A> MR&#038;a tried to make their hokey hypothetical hiring seem a little more real than W&#038;C did. MR&#038;a suggest that these are real candidates being hired&#8230;somewhere&#8230;and the respondents have to help decide whom to hire (although they still use the word &#8220;imagine&#8221;). W&#038;C clearly say that this is a hypothetical situation and ask the respondents to imagine that it is true. Some people in the comments are arguing that this makes W&#038;C a better signaling opportunity whereas MR&#038;a stays in near mode. But why would people not signal on a hiring question being put to them by people they don&#8217;t know about a carefully-obscured situation in some far-off university? Are sexists, out of the goodness of their hearts, urging MR&#038;a to hire the man out of some compassionate desire to ensure they get a qualified candidate, but when W&#038;C send them a hypothetical situation, they switch back into signaling mode?</p>
<p><b>6.</b> Commenter Will <A HREF="http://slatestarcodex.com/2015/04/15/trouble-walking-down-the-hallway/#comment-197915">points out</A> that MR&#038;a send actual resumes to their reviewers, but W&#038;C send only a narrative that sums up some aspects of the candidates&#8217; achievements and personalities (this is also the concern of <A HREF="https://feministphilosophers.wordpress.com/2015/04/14/new-study-shows-preference-for-women/">Feminist Philosophers</A>). This is somewhat necessitated by the complexities of tenure-track hiring &#8211; it&#8217;s hard to make up an entire fake academic when you can find every published paper in Google Scholar &#8211; but it does take them a step away from realism. They claim that they validated this methodology against real resumes, but it was a comparatively small validation &#8211; only 35 people. On the other hand, even this small validation was <A HREF="https://feministphilosophers.wordpress.com/2015/04/14/new-study-shows-preference-for-women/#comment-139182">highly significant for pro-female bias</A>. Maybe for some reason getting summaries instead of resumes heavily biases people in favor of women?</p>
<p>Or maybe none of those things mattered at all. Maybe all of this is missing the forest for the trees.</p>
<p>I love stories about how scientists set out to prove some position they consider obvious, but unexpectedly end up changing their minds when the results come in. But this isn&#8217;t one of those stories. Williams and Ceci have been vocal proponents of the position that science isn&#8217;t sexist for years now &#8211; for example, their article in the New York Times last year, <A HREF="http://www.nytimes.com/2014/11/02/opinion/sunday/academic-science-isnt-sexist.html?_r=0">Academic Science Isn&#8217;t Sexist</A>. In 2011 they wrote <A HREF="http://www.pnas.org/content/108/8/3157">Understanding Current Causes Of Women&#8217;s Underrepresentation In Science</A>, which states:</p>
<blockquote><p>The ongoing focus on sex discrimination in reviewing, interviewing, and hiring represents costly, misplaced effort: Society is engaged in the present in solving problems of the past, rather than in addressing meaningful limitations deterring women&#8217;s participation in science, technology, engineering, and mathematics careers today. Addressing today&#8217;s causes of underrepresentation requires focusing on education and policy changes that will make institutions responsive to differing biological realities of the sexes.</p></blockquote>
<p>So they can hardly claim to be going into this with perfect neutrality.</p>
<p>But the lead author of the study that <i>did</i> find strong evidence of sexism, Corinne Moss-Racusin (whose name is an anagram of &#8220;accuser on minor sins&#8221;) <i>also</i> has a long history of pushing the position she coincidentally later found to be the correct one. A look at <A HREF="http://www.skidmore.edu/psychology/faculty/CorrineMossRacusinCV-July2014.pdf">her resume</A> shows that she has a bunch of papers with titles like &#8220;Defending the gender hierarchy motivates prejudice against female leaders&#8221;, &#8220;&#8216;But that doesn&#8217;t apply to me:&#8217; teaching college students to think about gender&#8221;, and &#8220;Engaging white men in workplace diversity: can training be effective?&#8221;. Her symposia have titles like &#8220;Taking a stand: the predictors and importance of confronting discrimination&#8221;. This does not sound like the resume of a woman whose studies ever find that oh, cool, it looks like sexism isn&#8217;t a big problem here after all.</p>
<p>So what conclusion should we draw from the people who obviously wanted to find a lack of sexism finding a lack of sexism, but the people who obviously wanted to find lots of sexism finding lots of sexism?</p>
<p>This is a <i>hard question</i>. It doesn&#8217;t necessarily imply the sinister type of bias &#8211; it may be that Drs. Williams and Ceci are passionate believers in a scientific meritocracy simply because that&#8217;s what all their studies always show, and Dr. Moss-Racusin is a passionate believer in discrimination because that&#8217;s what <i>her</i> studies find. On the other hand, it&#8217;s <i>still</i> suspicious that two teams spend lots of time doing lots of experiments, and one always gets one result, and the other always gets the other. What are they doing differently?</p>
<p>Problem is, I don&#8217;t know. Neither study here has any egregious howlers. In my own field of psychiatry, when a drug company rigs a study to put their drug on top, usually before long someone figures out how they did it. In these two studies I&#8217;m not seeing anything.</p>
<p>And this casts doubt upon those six possible sources of differences listed above. None of them look like the telltale sign of an experimenter effect. If MR&#038;a were trying to fix their study to show lots of sexism, it would have taken exceptional brilliance to do it by using the names &#8220;John&#8221; versus &#8220;Jennifer&#8221;. If W&#038;C were trying to fix their study to disguise sexism, it would have taken equal genius to realize they could do it by asking people &#8220;who would you hire?&#8221; rather than &#8220;who is most hireable?&#8221;.</p>
<p>(the only exception here is the lab manager. It&#8217;s <i>just</i> within the realm of probability that MR&#038;a might have somehow realized they&#8217;d get a stronger signal asking about lab managers instead of faculty. The choice to ask about lab managers instead of faculty is surprising and does demand an explanation. And it&#8217;s probably the best candidate for the big difference between their results. But for them to realize that they needed to pull this deception suggests an impressive ability to avoid drinking their own Kool-Aid.)</p>
<p>Other than that, the differences I&#8217;ve been considering in these studies are the sort that would be very hard to purposefully bias. But the fact that both groups got the result they wanted suggests that the studies were purposefully biased <i>somehow</i>. This reinforces my belief that experimenter effects are best modeled as some sort of mystical curse incomprehensible to human understanding.</p>
<p>(now would be an excellent time to re-read <A HREF="http://slatestarcodex.com/2014/04/28/the-control-group-is-out-of-control/">the horror stories in Part IV of &#8220;The Control Group Is Out Of Control&#8221;</A>)</p>
<p>Speaking of horror stories. Sexism in STEM is, to put it mildly, a hot topic right now. Huge fortunes in grant money are being doled out to investigate it (Dr. Moss-Racusin alone received nearly a million dollars in grants to study STEM gender bias) and thousands of pages are written about it every year. And yet somehow the entire assembled armies of Science, when directed toward the problem, can&#8217;t figure out whether college professors are more or less likely to hire women than men.</p>
<p>This is not like studying the atmosphere of Neptune, where we need to send hundred-million dollar spacecraft on a perilous mission before we can even begin to look into the problem. This is not like studying dangerous medications, where ethical problems prevent us from doing the experiments we really need. This is not like studying genetics, where you have to gather large samples of identical twins separated at birth, or like climatology, where you hang out at the North Pole and might get eaten by bears. This is a <i>survey of college professors</i>. You know who it is studying this? <i>College professors</i>. The people they want to study are <i>in the same building as them</i>. The climatologists are getting eaten by bears, and the social psychologists can&#8217;t even settle a question that requires them to <i>walk down the hallway</i>.</p>
<p>It&#8217;s not even like we&#8217;re trying to detect a subtle effect here. Both sides agree that the signal is very large. They just disagree what direction it&#8217;s very large in!</p>
<p>A recent theme of this blog has been that Pyramid Of Scientific Evidence be damned, our randomized controlled trials suck so hard that a lot of the time we&#8217;ll get more trustworthy information from just looking at the ecological picture. Williams and Ceci have done this (see Part V, Section b of <A HREF="http://www.pnas.org/content/suppl/2015/04/08/1418878112.DCSupplemental/pnas.1418878112.sapp.pdf">their supplement</A>, &#8220;Do These Results Differ From Actual Hiring Data&#8221;) and report that studies of real-world hiring data confirm women have an advantage over men in STEM faculty hiring (although far fewer of them apply). It also matches the anecdotal evidence I hear from people in the field. I&#8217;m not necessarily saying I&#8217;m ambivalent between the two studies&#8217; conclusions. Just that it bothers me that we have to go to tiebreakers after doing two good randomized controlled trials.</p>
<p>At this point, I think the most responsible thing would be to have a joint study by both teams, where they all agree on a fair protocol beforehand and see what happens. Outside of <A HREF="http://www.richardwiseman.com/resources/staring1.pdf">parapsychology</A> I&#8217;ve never heard of people taking such a drastic step &#8211; who would get to be first author?! &#8211; but at this point it&#8217;s hard to deny that it&#8217;s necessary.</p>
<p>In conclusion, I believe the Moss-Racusin et al study more, but I think the Williams and Ceci study is more believable. And the best way to fight sexism in science is to remind people that it would be hard for women to make things any more screwed up than they already are.</p>
]]></content:encoded>
			<wfw:commentRss>http://slatestarcodex.com/2015/04/15/trouble-walking-down-the-hallway/feed/</wfw:commentRss>
		<slash:comments>292</slash:comments>
		</item>
		<item>
		<title>Debunked And Well-Refuted</title>
		<link>http://slatestarcodex.com/2014/12/13/debunked-and-well-refuted/</link>
		<comments>http://slatestarcodex.com/2014/12/13/debunked-and-well-refuted/#comments</comments>
		<pubDate>Sat, 13 Dec 2014 12:08:44 +0000</pubDate>
		<dc:creator><![CDATA[Scott Alexander]]></dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[science]]></category>

		<guid isPermaLink="false">http://slatestarcodex.com/?p=3435</guid>
		<description><![CDATA[I. As usual, I was insufficiently pessimistic. I infer this from The Federalist&#8217;s article on campus rape: A new report on sexual assault released today by the U.S. Department of Justice (DOJ) officially puts to bed the bogus statistic that &#8230; <a href="http://slatestarcodex.com/2014/12/13/debunked-and-well-refuted/">Continue reading <span class="pjgm-metanav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<p><b>I.</b></p>
<p>As usual, I was insufficiently pessimistic.</p>
<p>I infer this from <i>The Federalist</i>&#8217;s <A HREF="http://thefederalist.com/2014/12/11/new-doj-data-on-sexual-assaults-college-students-are-actually-less-likely-to-be-victimized/">article on campus rape</A>:</p>
<blockquote><p>A new report on sexual assault released today by the U.S. Department of Justice (DOJ) officially puts to bed the bogus statistic that one in five women on college campuses are victims of sexual assault. In fact, non-students are 25 percent more likely to be victims of sexual assault than students, according to the data. And the real number of assault victims is several orders of magnitude lower than one-in-five.</p></blockquote>
<p>The article compares the older Campus Sexual Assault Survey (which found 14-20% of women were raped since entering college) to the just-released National Crime Victimization Survey (which found that 0.6% of female college students are raped per year). They write &#8220;Instead of 1 in 5, the real number is 0.03 in 5.&#8221;</p>
<p>So the first thing I will mock <i>The Federalist</i> for doing is directly comparing per year sexual assault rates to per college career sexual assault rates, whereas obviously these are very different things. You can&#8217;t <i>quite</i> just divide the latter by four to get the former, but that&#8217;s going to work a heck of a lot better than <i>not</i> doing it, so let&#8217;s estimate the real discrepancy as more like 0.5% per year versus 5% per year. </p>
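<p>For what it&#8217;s worth, the proper conversion treats each year as carrying its own risk. A minimal sketch, assuming independent and identical risk every year &#8211; which real data certainly won&#8217;t satisfy &#8211; looks like this:</p>
<pre>
# Converting between per-year and per-college-career (4-year) rates,
# assuming independent, identical risk each year (a rough approximation).

def cumulative_from_annual(annual, years=4):
    """Chance of at least one event over `years` years."""
    return 1 - (1 - annual) ** years

def annual_from_cumulative(cumulative, years=4):
    """Per-year rate implied by a cumulative rate over `years` years."""
    return 1 - (1 - cumulative) ** (1 / years)

print(cumulative_from_annual(0.006))   # NCVS-style 0.6%/year -> ~2.4%/career
print(annual_from_cumulative(0.20))    # CSA-style 20%/career -> ~5.4%/year
</pre>
<p>The 20% per-career figure works out to roughly the 5% per year estimated above.</p>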
<p>But I can&#8217;t get too mad at them yet, because that&#8217;s still a pretty big discrepancy.</p>
<p><i>However,</i> faced with this discrepancy a reasonable person might say &#8220;Hmm, we have two different studies that say two different things. I wonder what&#8217;s going on here and which study we should believe?&#8221;</p>
<p><i>The Federalist</i> staff said &#8220;Ha! There&#8217;s an old study with findings we didn&#8217;t like, but now there&#8217;s a new study with different findings we <i>do</i> like. So the old study is debunked!&#8221;</p>
<p><b>II.</b></p>
<p>My last essay, <A HREF="http://slatestarcodex.com/2014/12/12/beware-the-man-of-one-study/">Beware The Man Of One Study</A>, noted that one thing partisans do to justify their bias is selectively acknowledge studies from only one side of a complicated literature.</p>
<p>The reason it was insufficiently pessimistic is that there are also people like the Federalist staff, who acknowledge the existence of opposing studies, but only with the adjective &#8220;debunked&#8221; in front of them. By &#8220;debunked&#8221; they usually mean one of two things:</p>
<p>1. Someone on my side published a study later that found something else<br />
2. Someone on my side accused it of having methodological flaws</p>
<p>Since the Federalist has so amply demonstrated the first failure mode, let me say a little more about the second. Did you know that <i>anyone</i> with a keyboard can just <i>type up</i> any of the following things?</p>
<p>&#8211; &#8220;That study is a piece of garbage that&#8217;s not worth the paper it&#8217;s written on.&#8221;<br />
&#8211; &#8220;People in the know dismissed that study years ago.&#8221;<br />
&#8211; &#8220;Nobody in the field takes that study seriously.&#8221;<br />
&#8211; &#8220;That study uses methods that are laughable to anybody who knows statistics.&#8221;<br />
&#8211; &#8220;All the other research that has come out since discredits that study.&#8221;</p>
<p>They can say these things <i>whether they are true or not</i>. I&#8217;m kind of harping on this point, but it&#8217;s because it&#8217;s something <i>I</i> didn&#8217;t realize until much later than I should have.</p>
<p>There are many &#8220;questions&#8221; that are pretty much settled &#8211; evolution, global warming, homeopathy. But taking these as representative <A HREF="http://slatestarcodex.com/2014/04/15/the-cowpox-of-doubt/">closes your mind</A> and gives you a skewed picture of academia. On many issues, academics are just as divided as anyone else, and their arguments can be just as acrimonious as anyone else&#8217;s. The arguments usually take the form of one side publishing a study, the other side ripping the study apart and publishing their own study which they say is better, and the first side ripping the second study apart and arguing that their study was better all along.</p>
<p>Every study has flaws. No study has perfect methodology. If you like a study, you can say that it did the best it could on a difficult research area and has improved upon even-worse predecessor studies. If you don&#8217;t like a study, you can say &#8220;LOOK AT THESE FLAWS THESE PEOPLE ARE IDIOTS THE CONCLUSION IS COMPLETELY INVALID&#8221;. All you need to do is make enough <A HREF="http://slatestarcodex.com/2014/08/14/beware-isolated-demands-for-rigor/">isolated demands for rigor</A> against anything you disagree with.</p>
<p>And so if the first level of confirmation bias is believing every study that supports your views, the second layer of confirmation bias is believing every supposed refutation that supports your views.</p>
<p>See for example <A HREF="http://www.xenosystems.net/moron-bites-2/">this recent Xenosystems post</A> about a Twitterer claiming <i>The Bell Curve</i> has been &#8220;well-refuted&#8221;. There are definitely a lot of people who have written books, articles, and papers arguing that <i>The Bell Curve</i> is wrong, often in very strong terms. There are also a lot of people who have written books, articles, and papers saying that the first set of books, articles, and papers are wrong and <i>The Bell Curve</i> is right, also in very strong terms. To say that the first set is a &#8220;refutation&#8221; or &#8220;debunking&#8221; is as basic a mistake as saying that the new rape study is a &#8220;refutation&#8221; or &#8220;debunking&#8221; of the earlier rape study.</p>
<p>(albeit a mistake likely to be made by exactly the opposite people)</p>
<p>There are certainly things that have been &#8220;well-refuted&#8221; and &#8220;debunked&#8221;. Andrew Wakefield&#8217;s study purporting to prove that vaccines cause autism is a pretty good example. But you will notice that it had multiple failed replications, journals published reports showing he falsified data, the study&#8217;s co-authors retracted their support, the journal it was published in retracted it and issued an apology, the General Medical Council convicted Wakefield of sixteen counts of misconduct, and Wakefield was stripped of his medical license and barred from practicing medicine ever again in the UK. The <i>British Medical Journal</i>, one of the best-respected medical journals in the world, published an editorial concluding:</p>
<blockquote><p>Clear evidence of falsification of data should now close the door on this damaging vaccine scare &#8230; Who perpetrated this fraud? There is no doubt that it was Wakefield. Is it possible that he was wrong, but not dishonest: that he was so incompetent that he was unable to fairly describe the project, or to report even one of the 12 children&#8217;s cases accurately? No.</p></blockquote>
<p>Meanwhile, <i>The Bell Curve</i> was lambasted in the popular press and by many academics. But it also got fifty of the top researchers in its field to sign <A HREF="http://en.wikipedia.org/wiki/Mainstream_Science_on_Intelligence">a consensus statement</A> saying it was pretty much right about everything and the people attacking it were biased and confused. Three years later, they re-issued their statement saying nothing had changed and more recent findings had only confirmed their opinion. The American Psychological Association launched a task force to settle the issue which stopped short of complete agreement but which given the circumstances was pretty darned supportive. There are certainly a lot of smart people with very strong negative opinions, but each one is still usually met by an equally ardent and credentialed proponent.</p>
<p>One of these two things has been &#8220;well-refuted&#8221;. The other has been &#8220;argued against&#8221;.</p>
<p><b>III.</b></p>
<p>I saw this same dynamic at work the other day, looking through the minimum wage literature. </p>
<p>The primordial titanomachy of the minimum wage literature goes like this. In 1994, two guys named Card and Krueger published a study showing the minimum wage had if anything positive effects on New Jersey restaurants, convincing many people that minimum wages were good. In 1996, two guys named Neumark and Wascher reanalyzed the New Jersey data using a different source and found that it showed the minimum wage had very bad effects on New Jersey restaurants. In 2000, Card and Krueger responded, saying that their analysis was better than Neumark and Wascher&#8217;s re-analysis, and also they had done a re-analysis of their own which confirmed their original position.</p>
<p>Let&#8217;s see how conservative sites present this picture:</p>
<p><i>&#8220;The support for this assertion is the oft-cited 1994 study by Card and Krueger showing a positive correlation between an increased minimum wage and employment in New Jersey. Many others have thoroughly debunked this study.&#8221;</i> (<A HREF="http://mises.org/library/welfare-minimum-wages-and-unemployment">source</A>)</p>
<p><i>&#8220;I was under the impression that the original study done by Card and Krueger had been thoroughly debunked by Michigan State University economist David Neumark and William Wascher&#8221;</i> (<A HREF="http://www.amatecon.com/blog/2002_08_04_archive.html">source</A>)</p>
<p><i>&#8220;The study &#8230; by Card and Krueger has been debunked by several different people several different times. When other researchers re-evaluated the study, they found that data collected using those records &#8216;lead to the opposite conclusion from that reached by&#8217; Card and Krueger.&#8221;</i> (<A HREF="http://being-classical-liberal.blogspot.com/2014/02/john-green-is-heroon-left.html">source</A>)</p>
<p><i>&#8220;It was only a short time before the fantastic Card-Krueger findings were challenged and debunked by several subsequent studies&#8230;in 1995, economists David Neumark and David Wascher used actual payroll records (instead of survey data used by Card and Krueger) and published their results in an NBER paper with an amazing finding: Demand curves for unskilled labor really do slope downward, confirming 200 years of economic theory and mountains of empirical evidence</i> (<A HREF="http://www.aei.org/publication/obamas-chief-econ-adviser-once-made-an-amazing-discovery-demand-curves-slope-upward/print/">source</A>)</p>
<p>And now let&#8217;s look at how lefty sites present this picture:</p>
<p><i>&#8220;&#8230;a long-debunked paper [by Neumark and Wascher]&#8221;</i> (<A HREF="http://politicalhotwire.com/economics/88594-%2410-minimum-wage-would-push-more-than-half-working-poor-out-poverty-78.html#post2600348">source</A>)</p>
<p><i>&#8220;Note that your Mises heroes, Neumark and Wascher are roundly debunked.&#8221;</i> (<A HREF="http://www.politics.ie/forum/economy/199017-should-minimum-wage-rates-abolished-10.html">source</A>)</p>
<p><i>&#8220;Neumark&#8217;s living wage and minimum wage research have been found to be seriously flawed&#8230;based on faulty methods which when corrected refute his conclusion.&#8221;</i> &#8211; (<A HREF="http://www.nelp.org/page/-/Justice/2010/AnalysisofNewYorkCityWageStudyTeam.pdf?nocdn=1">source</A>)</p>
<p><i>&#8220;&#8230;Neumark and Wascher, a study which Elizabeth Warren debunked in a Senate hearing&#8221;</i> (<A HREF="http://community.runnersworld.com/topic/the-recovery?reply=55902332404903103#55902332404903103  ">source</A>)</p>
<p>So if you&#8217;re conservative, Neumark and Wascher debunked Card and Krueger. But if you&#8217;re liberal, Card and Krueger debunked Neumark and Wascher.</p>
<p>Both sides are no doubt very pleased with themselves. They&#8217;re not men of one study. They look at <i>all</i> of the research &#8211; except of course the studies that have been &#8220;debunked&#8221; or &#8220;well-refuted&#8221;. Why would you waste your time with <i>those?</i></p>
<p><b>IV.</b></p>
<p>Once again, I&#8217;m not preaching radical skepticism.</p>
<p>First of all, some studies are <i>super-debunked</i>. Wakefield is a good example.</p>
<p>Second of all, some studies that don&#8217;t quite meet Wakefield-level of awfulness are indeed really bad and need refuting. I don&#8217;t think this is beyond the intellectual capacities of most people. I think in many cases it&#8217;s easy to understand why a study is wrong; you should try to do that, and once you have, you can safely discount the study&#8217;s results.</p>
<p>I&#8217;m not against pointing out when you disagree with studies or think they&#8217;re flawed. I&#8217;d be a giant hypocrite if I was.</p>
<p>But &#8220;debunked&#8221; and &#8220;refuted&#8221; aren&#8217;t saying you disagree with a study. They&#8217;re making arguments from authority. They&#8217;re saying &#8220;the authority of the scientific community has come together and said this is a piece of crap that doesn&#8217;t count&#8221;.</p>
<p>And that&#8217;s fine if that&#8217;s actually happened. But you had better make sure that you&#8217;re calling upon an ex cathedra statement by the community itself, and not a single guy with an axe to grind. Or one side of a complicated and interminable debate where both sides have about equal credentials and sway.</p>
<p>If you can&#8217;t do that, you say &#8220;I think that my side of the academic debate is in the right, and here&#8217;s why,&#8221; not &#8220;your side has been debunked&#8221;. </p>
<p>Otherwise you&#8217;re going to end up like the minimum wage debaters, where both sides claim to have debunked the other. Or like that woman on Twitter, who calls a common position backed by leading researchers &#8220;well-refuted&#8221;. Or like the Federalist article that says a study has been &#8220;put to bed&#8221; as &#8220;bogus&#8221; just because another study said something different.</p>
<p>I think this is part of my reply to <A HREF="http://slatestarcodex.com/2014/11/27/why-i-am-not-rene-descartes/">the claim that</A> empiricism is so great that no one needs rationality.</p>
<p>A naive empiricist who swears off critical thinking because they can just &#8220;follow the evidence&#8221; has no contingency plan for when the evidence gets confusing. Their only recourse is to deny that the evidence is confusing, to assert that one side or the other has been &#8220;debunked&#8221;. Since they&#8217;ve already made a principled decision not to study confirmation bias, chances are it&#8217;s going to be whichever side they don&#8217;t like that&#8217;s &#8220;already been debunked&#8221;. And by &#8220;debunked&#8221; they mean &#8220;a scientist on my side said it was wrong, so now I am relieved from the burden of thinking about it.&#8221;</p>
<p>On the original post, I wrote:</p>
<blockquote><p>Life is made up of limited, confusing, contradictory, and maliciously doctored facts. Anyone who says otherwise is either sticking to such incredibly easy solved problems that they never encounter anything outside their comfort level, or so closed-minded that they shut out any evidence that challenges their beliefs.</p></blockquote>
<p>In the absence of any actual debunking more damning than a counterargument, &#8220;that&#8217;s been debunked&#8221; is the way &#8220;shuts out any evidence that challenges their beliefs&#8221; feels from the inside.</p>
<p><b>V.</b></p>
<p>Somebody&#8217;s going to want to know what&#8217;s up with the original rape studies. The answer is that a small part of the discrepancy is response bias on the CSAS, but most of it is that the two surveys encourage respondents to define &#8220;sexual assault&#8221; in very different ways. Vox has <A HREF="http://www.vox.com/2014/12/11/7378271/why-some-studies-make-campus-rape-look-like-an-epidemic-while-others">an excellent article on this</A> which for once I 100% endorse.</p>
<p>In other words, both are valid, both come together to form a more nuanced picture of campus violence, and neither one &#8220;debunks&#8221; the other. How about that?</p>
]]></content:encoded>
			<wfw:commentRss>http://slatestarcodex.com/2014/12/13/debunked-and-well-refuted/feed/</wfw:commentRss>
		<slash:comments>326</slash:comments>
		</item>
		<item>
		<title>Beware The Man Of One Study</title>
		<link>http://slatestarcodex.com/2014/12/12/beware-the-man-of-one-study/</link>
		<comments>http://slatestarcodex.com/2014/12/12/beware-the-man-of-one-study/#comments</comments>
		<pubDate>Fri, 12 Dec 2014 09:04:56 +0000</pubDate>
		<dc:creator><![CDATA[Scott Alexander]]></dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[rationality]]></category>
		<category><![CDATA[science]]></category>

		<guid isPermaLink="false">http://slatestarcodex.com/?p=3423</guid>
		<description><![CDATA[Aquinas famously said: beware the man of one book. I would add: beware the man of one study. For example, take medical research. Suppose a certain drug is weakly effective against a certain disease. After a few years, a bunch &#8230; <a href="http://slatestarcodex.com/2014/12/12/beware-the-man-of-one-study/">Continue reading <span class="pjgm-metanav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<p>Aquinas famously <A HREF="http://en.wikipedia.org/wiki/Homo_unius_libri">said</A>: beware the man of one book. I would add: beware the man of one study.</p>
<p>For example, take medical research. Suppose a certain drug is weakly effective against a certain disease. After a few years, a bunch of different research groups have gotten their hands on it and done all sorts of different studies. In the best case scenario the average study will find the true result &#8211; that it&#8217;s weakly effective.</p>
<p>But there will also be random noise caused by inevitable variation and by some of the experiments being better quality than others. In the end, we might expect something looking kind of like a bell curve. The peak will be at &#8220;weakly effective&#8221;, but there will be a few studies to either side. Something like this:</p>
<p><center><IMG SRC="http://slatestarcodex.com/blog_images/onestudy.png"></center></p>
<p>We see that the peak of the curve is somewhere to the right of neutral &#8211; ie weakly effective &#8211; and that there are about 15 studies that find this correct result.</p>
<p>But there are also about 5 studies that find that the drug is very good, and 5 studies missing the sign entirely and finding that the drug is actively bad. There&#8217;s even 1 study finding that the drug is very bad, maybe seriously dangerous.</p>
<p>This is before we get into fraud or statistical malpractice. I&#8217;m saying this is what&#8217;s going to happen just by normal variation in experimental design. As we increase experimental rigor, the bell curve might get squashed horizontally, but there will still be a bell curve.</p>
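<p>You can watch this happen in a toy simulation &#8211; a minimal sketch, where the true effect, the noise level, and the number of studies are all made-up numbers:</p>
<pre>
import random

# Toy simulation: thirty studies of a weakly effective drug, each adding
# its own noise from sampling and design. All numbers are made up.
random.seed(0)

TRUE_EFFECT = 0.3   # "weakly effective"
NOISE_SD = 0.5      # between-study noise

estimates = [random.gauss(TRUE_EFFECT, NOISE_SD) for _ in range(30)]

very_good = sum(e > 1.0 for e in estimates)   # drug looks great
wrong_sign = sum(e < 0.0 for e in estimates)  # drug looks harmful
print(very_good, "studies say 'very good';", wrong_sign, "get the sign wrong")
</pre>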
<p>In practice it&#8217;s worse than this, because this is assuming everyone is investigating exactly the same question.</p>
<p>Suppose that the graph is titled &#8220;Effectiveness Of This Drug In Treating Bipolar Disorder&#8221;. </p>
<p>But maybe the drug is more effective in bipolar i than in bipolar ii (Depakote, for example).</p>
<p>Or maybe the drug is very effective against bipolar mania, but much less effective against bipolar depression (Depakote again).</p>
<p>Or maybe the drug is a good acute antimanic agent, but very poor at maintenance treatment (let&#8217;s stick with Depakote).</p>
<p>If you have a graph titled &#8220;Effectiveness Of Depakote In Treating Bipolar Disorder&#8221; plotting studies from &#8220;Very Bad&#8221; to &#8220;Very Good&#8221; &#8211; and you stick all the studies &#8211; maintenance, manic, depressive, bipolar i, bipolar ii &#8211; on the graph, then you&#8217;re going to end up running the gamut from &#8220;very bad&#8221; to &#8220;very good&#8221; even before you factor in noise and even before you factor in bias and poor experimental design.</p>
<p>So here&#8217;s why you should beware the man of one study.</p>
<p>If you go to your better class of alternative medicine websites, they don&#8217;t tell you &#8220;Studies are a logocentric phallocentric tool of Western medicine and the Big Pharma conspiracy.&#8221;</p>
<p>They tell you &#8220;medical science has proved that this drug is terrible, but ignorant doctors are pushing it on you anyway. Look, here&#8217;s a study by a reputable institution proving that the drug is not only ineffective, but harmful.&#8221;</p>
<p>And the study will exist, and the authors will be prestigious scientists, and it will probably be about as rigorous and well-done as any other study.</p>
<p>And then a lot of people raised on <A HREF="http://slatestarcodex.com/2014/04/15/the-cowpox-of-doubt/">the idea</A> that some things have Evidence and other things have No Evidence think <i>holy s**t, they&#8217;re right!</i></p>
<p>On the other hand, your doctor isn&#8217;t going to a sketchy alternative medicine website. She&#8217;s examining the entire literature and extracting careful and well-informed conclusions from&#8230;</p>
<p>Haha, just kidding. She&#8217;s going to a luncheon at a really nice restaurant sponsored by a pharmaceutical company, which assures her that they would <i>never</i> take advantage of such an opportunity to shill their drug, they just want to raise awareness of the latest study. And the latest study shows that their drug is great! Super great! And your doctor nods along, because the authors of the study are prestigious scientists, and it&#8217;s about as rigorous and well-done as any other study.</p>
<p>But obviously the pharmaceutical company has selected one of the studies from the &#8220;very good&#8221; end of the bell curve.</p>
<p>And I called this &#8220;Beware The Man of One Study&#8221;, but it&#8217;s easy to see that in the little diagram there are like three or four studies showing that the drug is &#8220;very good&#8221;, so if your doctor is a little skeptical, the pharmaceutical company can say &#8220;You are right to be skeptical, one study doesn&#8217;t prove anything, but look &#8211; here&#8217;s another group that finds the same thing, here&#8217;s yet another group that finds the same thing, and here&#8217;s a replication that confirms both of them.&#8221;</p>
<p>And even though it looks like in our example the sketchy alternative medicine website only has one &#8220;very bad&#8221; study to go off of, they could easily supplement it with a bunch of merely &#8220;bad&#8221; studies. Or they could add all of those studies about slightly different things. Depakote is ineffective at treating bipolar depression. Depakote is ineffective at maintenance bipolar therapy. Depakote is ineffective at bipolar ii. </p>
<p>So just sum it up as &#8220;Smith et al 1987 found the drug ineffective, yet doctors continue to prescribe it anyway&#8221;. Even if you hunt down the original study (which no one does), Smith et al won&#8217;t say specifically &#8220;Do remember that this study is only looking at bipolar maintenance, which is a different topic from bipolar acute antimanic treatment, and we&#8217;re not saying anything about that.&#8221; It will just be titled something like &#8220;Depakote fails to separate from placebo in six month trial of 91 patients&#8221; and trust that the responsible professionals reading it are well aware of the difference between acute and maintenance treatments (hahahahaha).</p>
<p>So it&#8217;s not so much &#8220;beware the man of one study&#8221; as &#8220;beware the man of any number of studies less than a relatively complete and not-cherry-picked survey of the research&#8221;.</p>
<p><b>II.</b></p>
<p>I think medical science is still pretty healthy, and that the consensus of doctors and researchers is more-or-less right on most controversial medical issues. </p>
<p>(it&#8217;s the <i>uncontroversial</i> ones you have to worry about)</p>
<p>Politics doesn&#8217;t have this protection.</p>
<p>Like, take the minimum wage question (please). We all know about the Krueger and Card <A HREF="http://davidcard.berkeley.edu/papers/njmin-aer.pdf">study</A> in New Jersey that found no evidence that high minimum wages hurt the economy. We probably also know the counterclaims that it was <A HREF="http://nypost.com/2013/08/06/minimum-honesty-on-minimum-wage/">completely debunked</A> as despicable dishonest statistical malpractice. Maybe some of us know Card and Krueger wrote a <A HREF="http://www.jstor.org/discover/10.2307/2677856?uid=16785200&#038;uid=3739728&#038;uid=2&#038;uid=3&#038;uid=67&#038;uid=16754504&#038;uid=62&#038;uid=3739256&#038;sid=21104826014421">pretty convincing rebuttal</A> of those claims. Or that a bunch of large and methodologically advanced studies have come out since then, some finding no effect like <A HREF="https://escholarship.org/uc/item/86w5m90m">Dube</A>, others finding strong effects like <A HREF="https://economics.uchicago.edu/workshops/Rubinstein%20Yona%20Using%20Federal%20Minimum%20Wages%20Paper.pdf">Rubinstein</A> and <A HREF="http://econbrowser.com/archives/2014/12/new-estimates-of-the-effects-of-the-minimum-wage">Wither</A>. These are just examples; there are at least dozens and probably hundreds of studies on both sides.</p>
<p>But we can solve this with meta-analyses and systematic reviews, right?</p>
<p>Depends which one you want. Do you go with <A HREF="http://people.hss.caltech.edu/~camerer/SS280/Card-Kruger-AER_Jan95.pdf">this meta-analysis</A> of fourteen studies that shows that any presumed negative effect of high minimum wages is likely publication bias? With <A HREF="http://onlinelibrary.wiley.com/doi/10.1111/j.1467-8543.2009.00723.x/abstract">this meta-analysis</A> of sixty-four studies that finds the same thing and discovers no effect of minimum wage after correcting for the problem? Or how about <A HREF="http://ftp.iza.org/dp4983.pdf">this meta-analysis</A> of fifty-five countries that does find effects in most of them? Maybe you prefer <A HREF="http://www.nber.org/papers/w12663.pdf">this systematic review</A> of a hundred or so studies that finds strong and consistent effects?</p>
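<p>(For reference, the machinery underneath all of these is not mysterious &#8211; the simplest fixed-effect meta-analysis is just an inverse-variance weighted average of the study estimates. A minimal sketch, with invented numbers rather than anything from the minimum wage literature:)</p>
<pre>
# Fixed-effect meta-analysis: pool estimates by inverse-variance weighting.
# The (effect, standard error) pairs below are invented for illustration.
studies = [(-0.10, 0.05), (0.02, 0.03), (-0.30, 0.20)]

weights = [1 / se ** 2 for _, se in studies]
pooled = sum(w * e for (e, _), w in zip(studies, weights)) / sum(weights)
pooled_se = (1 / sum(weights)) ** 0.5
print("pooled effect %.3f +/- %.3f" % (pooled, pooled_se))
</pre>
<p>The fights, of course, are over which studies go into the list and how to correct for the ones that never got published.</p>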
<p>Can we trust news sources, think tanks, econblogs, and other institutions to sum up the state of the evidence?</p>
<p>CNN <A HREF="http://www.cnn.com/2011/09/16/opinion/saltsman-minimum-wage/">claims that</A> 85% of credible studies have shown the minimum wage causes job loss. But raisetheminimumwage.com <A HREF="http://www.raisetheminimumwage.com/pages/job-loss">declares that</A> &#8220;two decades of rigorous economic research have found that raising the minimum wage does not result in job loss&#8230;researchers and businesses alike agree today that the weight of the evidence shows no reduction in employment resulting from minimum wage increases.&#8221; Modeled Behavior <A HREF="http://modeledbehavior.com/2010/10/12/what-the-new-minimum-wage-research-says/">says</A> &#8220;the majority of the new minimum wage research supports the hypothesis that the minimum wage increases unemployment.&#8221; The Center for Budget and Policy Priorities <A HREF="http://www.cbpp.org/cms/?fa=view&#038;id=4075">says</A> &#8220;The common claim that raising the minimum wage reduces employment for low-wage workers is one of the most extensively studied issues in empirical economics.  The weight of the evidence is that such impacts are small to none.&#8221;</p>
<p>Okay, fine. What about economists? They seem like experts. What do they think?</p>
<p>Well, five hundred economists <A HREF="http://economistletter.com/">signed</A> a letter to policy makers saying that the science of economics shows increasing the minimum wage would be a bad idea. That sounds like a promising consensus&#8230;</p>
<p>&#8230;except that six hundred economists <A HREF="http://www.epi.org/minimum-wage-statement/">signed</A> a letter to policy makers saying that the science of economics shows increasing the minimum wage would be a <i>good</i> idea. (h/t <A HREF="http://gregmankiw.blogspot.com/2014/03/economists-divided-on-minimum-wage-hike.html">Greg Mankiw</A>)</p>
<p>Fine then. Let&#8217;s do a formal survey of economists. Now what?</p>
<p><A HREF="http://www.raisetheminimumwage.com/pages/job-loss">raisetheminimumwage.com</A>, an unbiased source if ever there was one, confidently tells us that</A> &#8220;indicative is a 2013 survey by the University of Chicago’s Booth School of Business in which leading economists agreed by a nearly 4 to 1 margin that the benefits of raising and indexing the minimum wage outweigh the costs.&#8221;</p>
<p>But the Employment Policies Institute, which sounds like it&#8217;s trying <i>way</i> too hard to sound like an unbiased source, <A HREF="https://www.epionline.org/release/o185/">tells us that</A> &#8220;Over 73 percent of AEA labor economists believe that a significant increase will lead to employment losses and 68 percent think these employment losses fall disproportionately on the least skilled. Only 6 percent feel that minimum wage hikes are an efficient way to alleviate poverty.&#8221; </p>
<p>So the whole thing is fiendishly complicated. But unless you look very very hard, you will never know that.</p>
<p>If you are a conservative, what you will find on the sites you trust will be something like this:</p>
<blockquote><p>Economic theory has always shown that minimum wage increases decrease employment, but the Left has never been willing to accept this basic fact. In 1992, they trumpeted a single study by Card and Krueger that purported to show no negative effects from a minimum wage increase. This study was immediately debunked and found to be based on statistical malpractice and &#8220;massaging the numbers&#8221;. Since then, dozens of studies have come out confirming what we knew all along &#8211; that a high minimum wage is economic suicide. Systematic reviews and meta-analyses (Neumark 2006, Boockman 2010) consistently show that an overwhelming majority of the research agrees on this fact &#8211; as do 73% of economists. That&#8217;s why five hundred top economists recently signed a letter urging policy makers not to buy into discredited liberal minimum wage theories. Instead of listening to starry-eyed liberal woo, listen to the empirical evidence and an overwhelming majority of economists and oppose a raise in the minimum wage.</p></blockquote>
<p>And if you are a leftist, what you will find on the sites you trust will be something like this:</p>
<blockquote><p>People used to believe that the minimum wage increased unemployment. But Card and Krueger&#8217;s famous 1992 study exploded that conventional wisdom. Since then, the results have been replicated over fifty times, and further meta-analyses (Card and Krueger 1995, Dube 2010) have found no evidence of any effect. Leading economists agree by a 4 to 1 margin that the benefits of raising the minimum wage outweigh the costs, and that&#8217;s why more than 600 of them have signed a petition telling the government to do exactly that. Instead of listening to conservative scare tactics based on long-debunked theories, listen to the empirical evidence and the overwhelming majority of economists and support a raise in the minimum wage.</p></blockquote>
<p>Go ahead. <A HREF="http://webcache.googleusercontent.com/search?hl=en&#038;q=cache:TcOxVD4OoyQJ:http://www.businessinsider.com/krueger-card-fast-food-minimum-wage-study-2013-8%2Bhttp://www.businessinsider.com/krueger-card-fast-food-minimum-wage-study-2013-8&#038;gbv=2&#038;&#038;ct=clnk">Google</A> <A HREF="http://www.rifuture.org/republicans-are-wrong-about-minimum-wage-and-economists-know-it.html">the</A> <A HREF="http://mic.com/articles/61573/the-argument-to-increase-minimum-wage-you-haven-t-heard">issue</A> <A HREF="http://www.washingtonexaminer.com/article/2521472">and</A> <A HREF="http://chicagopolicyreview.org/2014/05/20/do-you-want-a-higher-minimum-wage-with-that/">see</A> <A HREF="http://www.nextnewdeal.net/rediscovering-government/debunking-minimum-wage-myth-higher-wages-will-not-reduce-jobs">what</A> <A HREF="http://www.freedomworks.org/content/yes-minimum-wage-increases-reduce-employment-and-hurt-low-skilled-workers">stuff</A>  <A HREF="http://www.nationalreview.com/corner/275846/krueger-s-faulty-minimum-wage-study-carrie-l-lukas">comes</A> <A HREF="http://www.dailykos.com/story/2014/05/01/1296116/-Minimum-Wage-Maximum-Rage">up</A>. If it doesn&#8217;t quite match what I said above, it&#8217;s usually because they can&#8217;t even muster <i>that</i> level of scholarship. Half the sites just cite Card and Krueger and call it a day!</p>
<p>These sites with their long lists of studies and experts are super convincing. And half of them are wrong.</p>
<p>At some point in their education, most smart people usually learn not to credit arguments from authority. If someone says &#8220;Believe me about the minimum wage because I seem like a trustworthy guy,&#8221; most of them will have at least one neuron in their head that says &#8220;I should ask for some evidence&#8221;. If they&#8217;re <i>really</i> smart, they&#8217;ll use the magic words &#8220;peer-reviewed experimental studies.&#8221;</p>
<p>But I worry that most smart people have <i>not</i> learned that a list of dozens of studies, several meta-analyses, hundreds of experts, and expert surveys showing almost all academics support your thesis &#8211; can <i>still</i> be bullshit. </p>
<p>Which is too bad, because that&#8217;s exactly what people who want to bamboozle an educated audience are going to use.</p>
<p><b>III.</b></p>
<p>I do not want to preach radical skepticism.</p>
<p>For example, on the minimum wage issue, I notice only one side has presented a funnel plot. A funnel plot is usually used to investigate publication bias, but it has another use as well &#8211; it&#8217;s pretty much an exact presentation of the &#8220;bell curve&#8221; we talked about above.</p>
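<p>If you&#8217;ve never seen one built: a funnel plot is just each study&#8217;s effect estimate plotted against its standard error, and with no publication bias the cloud should be symmetric around the pooled effect. A minimal sketch with simulated numbers &#8211; not the actual minimum wage data &#8211; where shelving noisy results on one side produces the telltale asymmetry:</p>
<pre>
import random
import matplotlib.pyplot as plt

# Simulated funnel plot: effect estimate vs. standard error. Censoring
# noisy results on one side produces asymmetry. All numbers invented.
random.seed(1)
TRUE_EFFECT = 0.0

points = []
for _ in range(300):
    se = random.uniform(0.05, 0.5)        # small studies -> big SE
    est = random.gauss(TRUE_EFFECT, se)
    # crude publication bias: most imprecise "wrong-sign" results get shelved
    if est > 0 and se > 0.25 and random.random() < 0.7:
        continue
    points.append((est, se))

xs, ys = zip(*points)
plt.scatter(xs, ys, s=8)
plt.gca().invert_yaxis()                  # most precise studies at the top
plt.xlabel("effect estimate")
plt.ylabel("standard error")
plt.show()
</pre>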
<p><center><IMG SRC="http://upload.wikimedia.org/wikipedia/commons/8/82/Funnel_Graph_of_Estimated_Minimum_Wage_Effects.jpg"></center></p>
<p>This is more of a needle curve than a bell curve, but the point still stands. We see it&#8217;s centered around 0, which means there&#8217;s some evidence that zero is the real signal among all this noise. The bell skews more to the left than to the right, which means more studies have found negative effects of the minimum wage than positive effects. But since the bell curve is asymmetrical, we interpret that as <i>probably</i> publication bias. So all in all, I think there&#8217;s at least some evidence that the liberals are right on this one.</p>
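<p>(A funnel plot is also easy to roll yourself. Here&#8217;s a minimal Python sketch &#8211; the numbers are invented for illustration, not taken from the plot above. Each study contributes an effect estimate and a standard error; precise studies cluster at the top of the funnel, noisy ones fan out at the bottom, and asymmetry in the fan is the publication-bias tell:)</p>
<pre><code>import matplotlib.pyplot as plt

# Illustrative (made-up) studies: (effect estimate, standard error).
studies = [(-0.50, 0.45), (-0.30, 0.40), (-0.10, 0.25), (0.10, 0.30),
           (0.00, 0.10), (0.05, 0.08), (-0.02, 0.05), (0.20, 0.35)]

effects = [e for e, _ in studies]
precisions = [1 / se for _, se in studies]  # precise studies sit near the top

plt.scatter(effects, precisions)
plt.axvline(0, linestyle="--")              # the candidate real signal
plt.xlabel("Estimated effect on employment")
plt.ylabel("Precision (1 / standard error)")
plt.show()
</code></pre>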
<p>Unless, of course, someone has realized that I&#8217;ve wised up to the studies and meta-analyses and expert surveys, and figured out a way to hack <i>funnel plots</i>, which I am totally not ruling out.</p>
<p>(okay, I <i>kind of</i> want to preach radical skepticism)</p>
<p>Also, I should probably mention that it&#8217;s much more complicated than one side being right, and that the minimum wage probably works differently depending on what industry you&#8217;re talking about, whether it&#8217;s state wage or federal wage, whether it&#8217;s a recession or a boom, whether we&#8217;re talking about increasing from $5 to $6 or from $20 to $30, etc, etc, etc. There are eleven studies on that plot showing an effect even worse than -5, and very possibly they are all accurate for whatever subproblem they have chosen to study &#8211; much like the example with Depakote, where it might be an effective antimanic but a terrible antidepressant.</p>
<p>(radical skepticism actually sounds a lot better than figuring this all out).</p>
<p><b>IV.</b></p>
<p>But the question remains: what happens when (like in most cases) you don&#8217;t have a funnel plot?</p>
<p>I don&#8217;t have a good positive answer. I do have several good <i>negative</i> answers.</p>
<p>Decrease your confidence about most things if you&#8217;re not sure that you&#8217;ve investigated every piece of evidence.</p>
<p>Do not trust websites which are obviously biased (eg Free Republic, Daily Kos, Dr. Oz) when they tell you they&#8217;re going to give you &#8220;the state of the evidence&#8221; on a certain issue, even if the evidence seems very stately indeed. This goes double for any site that contains a list of &#8220;myths and facts about X&#8221;, quadruple for any site that uses phrases like &#8220;ingroup member uses actual FACTS to DEMOLISH the outgroup&#8217;s lies about Y&#8221;, and octuple for RationalWiki.</p>
<p>Most important, even if someone gives you what seems like overwhelming evidence in favor of a certain point of view, don&#8217;t trust it until you&#8217;ve done a simple Google search to see if the opposite side has equally overwhelming evidence. </p>
]]></content:encoded>
			<wfw:commentRss>http://slatestarcodex.com/2014/12/12/beware-the-man-of-one-study/feed/</wfw:commentRss>
		<slash:comments>270</slash:comments>
		</item>
		<item>
		<title>Book Review and Highlights: Quantum Computing Since Democritus</title>
		<link>http://slatestarcodex.com/2014/09/01/book-review-and-highlights-quantum-computing-since-democritus/</link>
		<comments>http://slatestarcodex.com/2014/09/01/book-review-and-highlights-quantum-computing-since-democritus/#comments</comments>
		<pubDate>Mon, 01 Sep 2014 05:47:19 +0000</pubDate>
		<dc:creator><![CDATA[Scott Alexander]]></dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[book review]]></category>
		<category><![CDATA[science]]></category>

		<guid isPermaLink="false">http://slatestarcodex.com/?p=2743</guid>
		<description><![CDATA[People sometimes confuse me with Scott Aaronson because of our similar-sounding names. I encourage this, because Scott Aaronson is awesome and it can only improve my reputation to be confused with him. But in the end, I am not Scott &#8230; <a href="http://slatestarcodex.com/2014/09/01/book-review-and-highlights-quantum-computing-since-democritus/">Continue reading <span class="pjgm-metanav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<p>People sometimes confuse me with Scott Aaronson because of our similar-sounding names. I encourage this, because Scott Aaronson is awesome and it can only improve my reputation to be confused with him.</p>
<p>But in the end, I am not Scott Aaronson. I did not write <a href="http://smile.amazon.com/gp/product/0521199565/ref=as_li_tl?ie=UTF8&#038;camp=1789&#038;creative=390957&#038;creativeASIN=0521199565&#038;linkCode=as2&#038;tag=slastacod-20&#038;linkId=XDE4MZI7VGUILGSH"><i>Quantum Computing Since Democritus</i></a>. To be honest, I wasn&#8217;t really even able to understand <i>Quantum Computing Since Democritus</i>. I knew I was in for trouble when it compared itself to <i>The Elegant Universe</i> in the foreword, since I wasn&#8217;t able to get through more than a few chapters of that one. I dutifully tried to do the first couple of math problems <i>Democritus</i> set for me, and I even got a couple of them right. But eventually I realized that if I wanted to read <i>Democritus</i> the way it was supposed to be read, with full or even decent understanding, it would be a multi-year project, a page a day or worse, with my gains fading away a few days after I made them into a cloud of similar-looking formulae and three-letter abbreviations.</p>
<p>It left me depressed. I&#8217;ve <A HREF="http://slatestarcodex.com/2013/06/30/the-lottery-of-fascinations/">said before</A> that my lack of math talent is one of my biggest regrets in life, and here was this book that really made you understand what it must feel like to be on the cutting edge of math, proving new theorems and drawing new connections and adding to the same structure of elegant knowledge begun by Pythagoras and Archimedes and expanded by Gauss, Einstein, Turing, et cetera. All I could do was remember my own <A HREF="http://slatestarcodex.com/2014/08/16/burdens/">post on burdens</A>, remind myself that I was on record as saying that sometimes the IQ waterline in a certain area advances beyond your ability to contribute and that&#8217;s nothing to feel guilty about.</p>
<p>I did finish the book. But &#8211; well, imagine a book of geography. It lists all the countries of the world and their capitals, and is meant to be so comprehensive that a reader could use it to plot the most efficient journey from Timbuktu to Kalamazoo, taking into account tolls, weather, and levels of infrastructure development along the way.</p>
<p>And imagine a very dumb person reading that book, unable to really absorb any of the facts, but at least understanding that the world is a place with land and ocean, and the ocean is very big and blue in color, and most of the countries and cities are on the part with the land.</p>
<p>That is the level at which I understood <i>Quantum Computing Since Democritus</i>. I didn&#8217;t get as much as was in it, but more than nothing.</p>
<p>I think the biggest thing I got was &#8211; I had always thought of the physicists&#8217; God as a basically benevolent guy who fine tunes constants to create a world capable of both astounding complexity and underlying simplicity.</p>
<p>The vision I got from <i>Democritus</i> was of a God who was single-mindedly obsessed with enforcing a couple of rules about certain types of information you are not allowed to have <i>under any circumstances</i>. Some of these rules I&#8217;d already known about. You can&#8217;t have information from outside your light cone. You can&#8217;t have precise information about both the momentum and position of a particle at the same time. Others I hadn&#8217;t thought about as much until reading <i>Democritus</i>. Information about when a Turing machine will halt. Information about whether certain formal systems are consistent. Precise information about the quantum state of a particle. The reason God hasn&#8217;t solved world poverty yet is that He is pacing about feverishly worried that someone, somewhere, is going to be able to measure the quantum state of a particle too precisely, and dreaming up new and increasingly bizarre ways He can prevent that from happening.</p>
<p>Aaronson goes one level deeper than most of the other popular science writers I know and speculates on why the laws of physics are the way they are. Sometimes this is the elegance and complexity route &#8211; in his chapter on quantum physics, he argues that quantum probabilities are the squares of amplitudes because if the laws of physics were any other way &#8211; the fourth power of amplitudes, or whatever &#8211; it would fail to preserve certain useful mathematical properties. But in other cases, it&#8217;s back to Obsessive God &#8211; the laws of physics are carefully designed to preserve the rules about what information you are and aren&#8217;t allowed to have.</p>
<p>Aaronson tries to tie his own specialty, computational complexity theory, into all of this. It&#8217;s hard for me to judge how successful he is. The few times he tries to tie it into areas of philosophy I know something about &#8211; like free will &#8211; I&#8217;m not too impressed. But I could be misunderstanding him.</p>
<p>But once again, you get the feeling that computational complexity is about what information God will and won&#8217;t let you have. It&#8217;s a little less absolute &#8211; more &#8220;you can&#8217;t have this information without doing the full amount of work&#8221; rather than a simple no &#8211; but it seems like the same principle. There are a bunch of situations in the book where Aaronson takes something we don&#8217;t really know that much about and says it <i>has</i> to be a certain way, because if it were any other way, it could be used to solve NP problems in polynomial time, and there&#8217;s no way God&#8217;s going to let us do that.</p>
<p>Aaronson ties it all together in a very interesting way &#8211; with his story of how <A HREF="http://www.scottaaronson.com/blog/?p=277">Australian Actresses Are Plagiarizing My Quantum Mechanics Lectures To Sell Printers</A>. He tells the story of how a printer company wanted to make a pun on &#8220;more intelligent model of printer&#8221;, so they made a commercial with intelligent models in the form of fashion models talking about quantum mechanics. And the particular quantum mechanics statement they made was a plagiarized quote from a Scott Aaronson lecture. And upon thinking about it, Aaronson decided that the quote they had chosen at random was in fact the thesis statement that tied together everything he believed and was working on. The model had said:<br />
<blockquote>But if quantum mechanics isn’t physics in the usual sense — if it’s not about matter, or energy, or waves, or particles — then what is it about? From my perspective, it’s about information and probabilities and observables, and how they relate to each other.</p></blockquote>
<p>That seems like as good a summary as any of <i>Democritus</i>, and a pretty good description of what I got out of it. I may not be as smart as Scott Aaronson, but on my good days I am right up there with Australian fashion models.</p>
<p>A list of passages I highlighted in my copy for being interesting, funny, or enlightening:<br />
<blockquote>Can we prove there&#8217;s no program to solve the halting problem? This is what Turing does. His key idea is not even to try to analyze the internal dynamics of such a program, supposing it existed. Instead he simply says, suppose by way of contradiction that such a program P exists. Then we can modify P to produce a new program P&#8217; that does the following. Given another program Q as its input, P':</p>
<p>1) Runs forever if Q halts given its own code as input, or<br />
2) Halts if Q runs forever given its own code as input</p>
<p>Now we just feed P&#8217; its own code as input. By the conditions above, P&#8217; will run forever if it halts, or halt if it runs forever. Therefore, P&#8217; &#8211; and by implication P &#8211; can&#8217;t have existed in the first place.</p></blockquote>
<p>I&#8230;I suddenly understand what the halting problem is. And there is a short proof of it that makes total sense to me. This is a completely new experience.</p>
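<p>(The proof is compact enough to sketch in code. This is my own illustrative Python, not anything from the book &#8211; <code>halts</code> is the hypothetical oracle Turing proves impossible:)</p>
<pre><code>def halts(program, input_data):
    """Hypothetical oracle: True iff program(input_data) eventually halts.
    Turing's argument shows no such always-correct function can exist."""
    raise NotImplementedError

def p_prime(q):
    """Turing's P': given a program Q, do the opposite of whatever the
    oracle predicts Q does when fed its own code."""
    if halts(q, q):        # oracle says Q halts on its own code...
        while True:        # ...so P' runs forever
            pass
    else:                  # oracle says Q runs forever on its own code...
        return             # ...so P' halts

# Now feed P' its own code. p_prime(p_prime) must halt if it runs forever
# and run forever if it halts -- a contradiction, so halts() never existed.
</code></pre>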
<blockquote>Oracles were apparently first studied by Turing, in his 1938 PhD thesis. Obviously anyone who could write a whole thesis about these fictitious entities would have to be an extremely pure theorist, someone who wouldn&#8217;t be caught dead doing anything relevant. This was certainly true in Turing&#8217;s case &#8211; indeed, he spent the years after his PhD, from 1939 to 1943, studying certain abstruse symmetry transformations in a 26 letter alphabet</p></blockquote>
<p>ಠ_ಠ<br />
<blockquote>You can look at Deep Blue, the Robbins conjecture, Google, most recently Watson &#8211; and say that&#8217;s not <i>really</i> AI. That&#8217;s just massive search, helped along by clever programming. Now this kind of talk drives AI researchers up a wall. They say: if you told someone in the 1960s that in 30 years we&#8217;d be able to beat the world grandmaster at chess, and asked if that would count as AI, they&#8217;d say of course it&#8217;s AI. But now that we know how to do it, it&#8217;s no longer AI &#8211; it&#8217;s just search.</p></blockquote>
<blockquote><p>The third thing that annoys me about the Chinese Room argument is the way it gets so much mileage from a possibly misleading choice of imagery, or, one might say, by trying to sidestep the entire issue of <i>computational complexity</i> purely through clever framing. We&#8217;re invited to imagine someone pushing around slips of paper with zero understanding or insight, much like the doofus freshmen who write (a + b)^2 = a^2 + b^2 on their math tests. But <u>how many slips of paper are we talking about!</u> How big would the rule book have to be, and how quickly would you have to consult it, to carry out an intelligent Chinese conversation in anything resembling real time? If each page of the rule book corresponded to one neuron of a native speaker&#8217;s brain, then probably we&#8217;d be talking about a &#8220;rule book&#8221; at least the size of the Earth, its pages searchable by a swarm of robots traveling at close to the speed of light. When you put it that way, maybe it&#8217;s not so hard to imagine that this enormous Chinese-speaking entity that we&#8217;ve brought into being might have something we&#8217;d be prepared to call understanding or insight.</p></blockquote>
<p>This is a really clever counterargument to Chinese Room I&#8217;d never heard before. Philosophers are so good at pure qualitative distinctions that it&#8217;s easy to slip the difference between &#8220;guy in a room&#8221; and &#8220;planet being processed by lightspeed robots&#8221; under the rug.</p>
<blockquote><p>Many people&#8217;s anti-robot animus is probably a combination of two ingredients &#8211; the directly experienced certainty that they&#8217;re conscious &#8211; that they perceive sounds, colors, etc &#8211; and the belief that if they were just a computation, then they could not be conscious in this way. For people who think this way, granting consciousness to a robot seems strangely equivalent to denying that one is conscious oneself.</p></blockquote>
<p>This is actually a pretty deep way of looking at it.</p>
<blockquote><p>My contention in this chapter is that quantum mechanics is what you would inevitably come up with if you started from probability theory, and then said, let&#8217;s try to generalize it so that the numbers we used to call &#8220;probabilities&#8221; can be negative numbers. As such, the theory could have been invented by mathematicians in the nineteenth century without any input from experiment. It wasn&#8217;t, but it could have been. And yet, with all the structures mathematicians studied, none of them came up with quantum mechanics until experiment forced it on them.</p></blockquote>
<p>Aaronson&#8217;s explanation of quantum mechanics is a lot like Eliezer&#8217;s explanation of quantum mechanics, in that they both start by saying that the famous counterintuitiveness of the subject is partly because people choose to teach it in a backwards way in order to mirror the historical progress of understanding. I&#8217;m sure Eliezer mentioned it many times, but I didn&#8217;t really get the understanding of amplitudes as potentially negative probability-type-things until I read Aaronson.</p>
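<p>(Here&#8217;s the smallest toy version of that idea I could manage &#8211; my sketch, not Aaronson&#8217;s. Apply a &#8220;fair coin&#8221; operation to signed amplitudes twice; a classical coin would stay 50/50, but the signed paths cancel:)</p>
<pre><code>import math

def hadamard(a0, a1):
    """Map amplitudes: |0> -> (|0> + |1>)/sqrt(2), |1> -> (|0> - |1>)/sqrt(2)."""
    s = 1 / math.sqrt(2)
    return s * (a0 + a1), s * (a0 - a1)

a0, a1 = 1.0, 0.0            # start definitely in |0>
a0, a1 = hadamard(a0, a1)    # now a 50/50 superposition
a0, a1 = hadamard(a0, a1)    # "randomize" again...

# Probabilities are squares of amplitudes. Classically, randomizing twice
# is still 50/50; here the |1> paths carry opposite signs and cancel.
print(round(a0 ** 2, 10), round(a1 ** 2, 10))   # 1.0 0.0 -- back to |0>
</code></pre>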
<blockquote>And that&#8217;s a perfect illustration of why experiments are necessary in the first place! More often than not, the only reason we need experiments is that we&#8217;re not smart enough. After the experiment has been done, if we&#8217;ve learned anything worth knowing at all, then we hope we&#8217;ve learned why the experiment wasn&#8217;t necessary to begin with &#8211; why it wouldn&#8217;t have made sense for the universe to be any other way. But we&#8217;re too dumb to figure it out ourselves</p></blockquote>
<p>Compare: <A HREF="http://lesswrong.com/lw/jo/einsteins_arrogance/">Einstein&#8217;s Arrogance</A>, <A HREF="http://slatestarcodex.com/2014/08/05/negative-creativity/">Negative Creativity</A>.<br />
<blockquote>Quantum mechanics does offer a way out [the philosophical puzzle about whether you &#8220;survive&#8221; a teleportation where a machine scans you on an atomic level, radios the data to Mars, another machine on Mars makes an atom-for-atom copy of you, and then the original is destroyed]. Suppose some of the information that made you you was actually quantum information. Then, even if you were a thoroughgoing materialist, you could still have an excellent reason not to use the teleportation machine &#8211; because, as a consequence of the No-Cloning Theorem, <u>no such machine could possibly work as claimed</u></p></blockquote>
<p>This is fighting the hypothetical a little, but maybe in a productive way.<br />
<blockquote>[Bayesianism] is one way to do it, but computational learning theory tells us that it&#8217;s not the only way. You don&#8217;t need to start out with an assumption about a probability distribution over the hypothesis. You can make a worst-case assumption about the hypothesis and then just say that you&#8217;d like to learn any hypothesis in the concept class, for any sample distribution, with high probability over the choice of samples. In other words, you can trade the Bayesians&#8217; probability distribution over hypotheses for a probability distribution over sample data.</p></blockquote>
<p>I hear a bunch of people telling me Bayesianism isn&#8217;t everything, it&#8217;s the only thing &#8211; and another bunch of people telling me it&#8217;s one useful tool in an entire bag of them. I didn&#8217;t understand enough of the book&#8217;s chapter on computational learning to gain too much insight here, but I will tick off one more name as being on the &#8220;one useful tool&#8221; side. Also, it makes me angry that Scott Aaronson knows so much about computational learning theory. He already knows lots of complicated stuff about computers, quantum physics, set theory, and philosophy. Part of me wants to get angry: WHY IS ONE PERSON ALLOWED TO BE SO SMART? But I guess it&#8217;s more like how I know more than average about history, literature, geography, etc. I guess if you have high math ability and some intellectual curiosity, you end up able to plug it into everything pretty effortlessly. Don&#8217;t care though. Still jealous.<br />
<blockquote>Imagine there&#8217;s a very large population of people in the world, and that there&#8217;s a madman. What the madman does is, he kidnaps ten people and puts them in a room. He then throws a pair of dice. If the dice land snake-eyes (two ones) then he murders everyone in the room. If the dice do not land snake-eyes, then he releases everyone, then kidnaps 100 new people. He now does the same thing: he rolls two dice; if they land snake-eyes, he kills everyone, and if they don&#8217;t land snake-eyes, then he releases them and kidnaps 1000 people. He keeps doing this until he gets snake-eyes, at which point he&#8217;s done. So now, imagine that you&#8217;ve been kidnapped. Conditioned on that fact, how likely is it that you&#8217;re going to die? One answer is that the dice have a 1/36 chance of landing snake eyes, so you should only be a &#8220;little bit&#8221; worried (considering). A second reflection you could make is to consider, of people who enter the room, what the fraction is of people who ever get out. About 8/9 of the people who ever go into the room will die.</p></blockquote>
<p>This interested me because it is equivalent to the Anthropic Doomsday conjecture and I&#8217;d never heard this phrasing of it before.</p>
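<p>(If you don&#8217;t trust the arithmetic, it simulates in a few lines &#8211; my own sketch, nothing from the book. With tenfold batches the death fraction my runs produce hovers around nine in ten, the same ballpark as the quoted 8/9: most people who ever enter the room never leave.)</p>
<pre><code>import random
from fractions import Fraction

def death_fraction(trials=10_000):
    """Of everyone the madman ever kidnaps, what share ends up dead?"""
    kidnapped = dead = 0
    for _ in range(trials):
        batch = 10
        while True:
            kidnapped += batch
            if random.randint(1, 6) == 1 and random.randint(1, 6) == 1:
                dead += batch    # snake eyes: the whole room dies
                break
            batch *= 10          # released; he grabs ten times as many

    # Batches grow astronomically, so divide exactly before rounding.
    return float(Fraction(dead, kidnapped))

print(death_fraction())          # hovers around 0.9 with tenfold batches
</code></pre>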
<blockquote>Finally, if we want to combine the anthropic computation idea with the Doomsday Argument, then there&#8217;s the Adam and Eve puzzle. Suppose Adam and Eve are the first two observers, and that they&#8217;d like to solve an instance of an NP-complete problem, say, 3-SAT. To do so, they pick a random assignment, and form a very clear intention beforehand that if the assignment happens to be satisfying, they won&#8217;t have any kids, whereas if the assignment is not satisfying, then they will go forth and multiply. Now let&#8217;s assume SSA. Then, conditioned on having chosen an unsatisfying assignment, how likely is it that they would be an Adam and Eve in the first place, as opposed to one of the vast number of future observers? Therefore, conditioned upon the fact that they are the first two observers, the SSA predicts that, with overwhelming probability, they will pick a satisfying assignment.</p></blockquote>
<p>And the Lord saw Eve and said &#8220;What are you doing?&#8221;. And Eve said &#8220;I am forming an intention not to reproduce if I generate a solution to an NP complete problem, as part of an experiment in anthropic computation&#8221;. And the Lord asked &#8220;Who told you this?&#8221; And Eve said &#8220;It was the serpent who bade me compute, for he told me if I did this I would be as God, knowing subgraph isomorphism and 3SAT.&#8221; Then the Lord cast them forth from the Garden, because He was Information Theoretic God and preventing people from screwing with complexity classes is like His entire shtick.<br />
<blockquote>I like to engage skeptics for several reasons. First of all, because I like arguing. Second, often I find that the best way to come up with new results is to find someone who&#8217;s saying something that seems clearly, manifestly wrong to me, and then try to think of counterarguments. Wrong claims are a fertile source of research ideas.</p></blockquote>
<p>I said something almost exactly the same on Facebook a few days ago when Brienne asked how to generate good ideas.<br />
<blockquote>There&#8217;s a joke about a planet full of people who believe in anti-induction: if the sun has risen every day in the past, then today, we should expect that it won&#8217;t. As a result, these people are all starving and living in poverty. Someone visits the planet and tells them, &#8220;Hey, why are you still using this anti-induction philosophy? You&#8217;re living in horrible poverty!&#8221; They answer, &#8220;Well, it never worked before.&#8221;</p></blockquote>
<p>ಠ_ಠ</p>
]]></content:encoded>
			<wfw:commentRss>http://slatestarcodex.com/2014/09/01/book-review-and-highlights-quantum-computing-since-democritus/feed/</wfw:commentRss>
		<slash:comments>183</slash:comments>
		</item>
		<item>
		<title>How Common Are Science Failures?</title>
		<link>http://slatestarcodex.com/2014/07/02/how-common-are-science-failures/</link>
		<comments>http://slatestarcodex.com/2014/07/02/how-common-are-science-failures/#comments</comments>
		<pubDate>Thu, 03 Jul 2014 01:36:31 +0000</pubDate>
		<dc:creator><![CDATA[Scott Alexander]]></dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[science]]></category>

		<guid isPermaLink="false">http://slatestarcodex.com/?p=2351</guid>
		<description><![CDATA[After a brief spurt of debate over the claim that &#8220;97% of relevant published papers support anthropogenic climate change&#8221;, I think the picture has mostly settled to an agreement that &#8211; although we can contest the methodology of that particular &#8230; <a href="http://slatestarcodex.com/2014/07/02/how-common-are-science-failures/">Continue reading <span class="pjgm-metanav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<p>After a brief spurt of debate over the claim that &#8220;97% of relevant published papers support anthropogenic climate change&#8221;, I think the picture has mostly settled to an agreement that &#8211; although we can contest the methodology of that particular study &#8211; there are multiple lines of evidence that the number is somewhere in the nineties.</p>
<p>So if any doubt at all is to remain about climate change, it has to come from the worry that sometimes entire scientific fields can get things near-unanimously wrong, especially for political or conformity-related reasons.</p>
<p>In fact, I&#8217;d go so far as to say that if we are not climatologists ourselves, our prior on climate change should be <i>based upon</i> how frequently entire scientific fields get things terribly wrong for political or conformity-related reasons.</p>
<p>Skeptics mock the claim that <A HREF="http://skeptico.blogs.com/skeptico/2005/11/science_wrong.html">science was wrong before</A>, but skeptics mock <i>everything</i>. A better plan might be to try to quantify the frequency of scientific failures so we can see how good (or bad) the chances are for any given field.</p>
<p>Before we investigate, we should define our reference class properly. I think a scientific mistake only counts as a reason for doubting climate change (or any other commonly-accepted scientific paradigm) if:</p>
<p>1. It was made sometime in the recent past. Aristotle was wrong about all sorts of things, and so were those doctors who thought everything had to do with black bile, but the scientific community back then was a lot less rigorous than our own. Let&#8217;s say it counts if it&#8217;s after 1900.</p>
<p>2. It was part of a really important theory, one of the fundamental paradigms of an entire field. I&#8217;m sure some tiny group of biologists have been wrong about how many chromosomes a shrew has, but that&#8217;s probably an easier mistake to wander into than all of climatology screwing up simultaneously.</p>
<p>3. It was a stubborn resistance to the truth, rather than just a failure to have come up with the correct theory immediately. People were geocentrists before they were heliocentrists, but this wasn&#8217;t because the field of astronomy became overly politicized and self-assured, it was because (aside from <A HREF="http://en.wikipedia.org/wiki/Aristarchus_of_Samos">one ancient Greek guy</A> nobody really read) heliocentrism wasn&#8217;t invented until the 1500s, and after that it took people a couple of generations to catch on. In the same way, Newton&#8217;s theory of gravity wasn&#8217;t quite as good as Einstein&#8217;s, but this would not shame physicists in the same way climate change being wrong would shame climatologists. Let&#8217;s say that in order to count, the correct theory has to be very well known (the correct theory is allowed to be &#8220;this phenomenon doesn&#8217;t exist at all and you are wasting your time&#8221;) and there is a large group of people mostly outside the mainstream scientific establishment pushing it (for approximately correct reasons) whom scientists just refuse to listen to.</p>
<p>4. We now know that the past scientific establishment was definitely, definitely wrong and everyone agrees about this and it is not seriously in doubt. This criterion isn&#8217;t to be fair to the climatologists, this is to be fair to <i>me</i> when I have to read the comments to this post and get a bunch of &#8220;Nutritionists have yet to sign on to my pet theory of diet, that <i>proves</i> some scientific fields are hopelessly corrupt!&#8221;</p>
<p>Do any such scientific failures exist?</p>
<p>If we want to play this game on Easy Mode, our first target will be <A HREF="http://en.wikipedia.org/wiki/Lysenkoism">Lysenkoism</A>, the completely bonkers theory of agriculture and genetics adopted by the Soviet Union. A low-level agricultural biologist, Lysenko, came up with questionable ways of increasing agricultural output through something kind of like Lamarckian evolution. The Soviet government wanted to inspire people in the middle of a famine, didn&#8217;t really like real scientists because they seemed kind of bourgeois, and wanted to discredit genetics because heritability seemed contrary to the idea of New Soviet Man. So they promoted Lysenko enough times that everyone got the message that Lysenkoism was the road to getting good positions. All the careerists switched over to the new paradigm, and the holdouts who continued to believe in genetics were denounced as fascists. According to Wikipedia, &#8220;in 1948, genetics was officially declared &#8220;a bourgeois pseudoscience&#8221;; all geneticists were fired from their jobs (some were also arrested), and all genetic research was discontinued.&#8221;</p>
<p>About twenty years later the Soviets quietly came to their senses and covered up the whole thing.</p>
<p>I would argue that Stalinist Russia, where the government was very clearly intervening in science and killing the people it didn&#8217;t like, isn&#8217;t a fair test case for a theory today. But climate change opponents would probably respond that the liberal world order is unfairly promoting scientists who support climate change and persecuting those who oppose it. And Lysenkoism at least proves that is the sort of thing which can in theory sometimes happen. So let&#8217;s grumble a little but give it to them.</p>
<p>Now we turn the dial up to Hard Mode. Are there any cases of failure on a similar level within a scientific community <i>in a country not actively being ruled by Stalin</i>?</p>
<p>I can think of two: Freudian psychoanalysis and behaviorist psychology.</p>
<p>Freudian psychoanalysis <A HREF="http://slatestarcodex.com/2013/09/19/scientific-freud/">needs no introduction</A>. It dominated psychiatry &#8211; not at all a small field &#8211; from about 1930 to 1980. As far as anyone can tell, the entire gigantic edifice has no redeeming qualities. I mean, it correctly describes the existence of a subconscious, and it may have some insightful things to say on childhood trauma, but as far as a decent model of the brain or of psychological treatment goes, it was a giant mistake.</p>
<p>I got a little better idea just <i>how</i> big a mistake doing some research for the Anti-Reactionary FAQ. I wanted to see how homosexuals were viewed back in the 1950s and ran across two New York Times articles about them (<A HREF="http://slatestarcodex.com/blog_images/reaction/nythomo1.pdf">1</A>, <A HREF="http://slatestarcodex.com/blog_images/reaction/nythomo3.pdf">2</A>). It&#8217;s really creepy to see them explaining how instead of holding on to folk beliefs about how homosexuals are normal people just like you or me, people need to start listening to the psychoanalytic experts, who know the <i>real</i> story behind why some people are homosexual. The interviews with the experts in the article are a little surreal.</p>
<p>Psychoanalysis wasn&#8217;t an honest mistake. The field already had a perfectly good alternative &#8211; denouncing the whole thing as bunk &#8211; and sensible non-psychoanalysts seemed to do exactly that. On the other hand, the more you got &#8220;educated&#8221; about psychiatry in psychoanalytic institutions, and the more you wanted to become a psychiatrist yourself, the more you got biased into thinking psychoanalysis was obviously correct and into dismissing the doubters as science denialists or whatever it was they said back then.</p>
<p>So this seems like a genuine example of a scientific field failing.</p>
<p>Behaviorism in psychology was&#8230;well, this part will be controversial. A weak version is &#8220;psychologists should not study thoughts or emotions because these are unknowable by scientific methods; instead they should limit themselves to behaviors&#8221;. A strong version is &#8220;thoughts and emotions don&#8217;t exist; they are post hoc explanations invented by people to rationalize their behaviors&#8221;.  People are going to tell me that real psychologists only believed the weak version, but having read more than a little 1950s psychology, I&#8217;m going to tell them they&#8217;re wrong. I think a lot of people believed the strong version and that in fact it was the dominant paradigm in the field.</p>
<p>And of course common people said this was stupid, of course we have thoughts and emotions, and the experts just said that kind of drivel was exactly what common people <i>would</i> think. Then came the cognitive revolution and people realized thoughts and emotions were actually kind of easy to study. And then we got MRI machines and are now a good chunk of the way to <i>seeing</i> them.</p>
<p>So this too I will count as a scientific failure.</p>
<p>But &#8211; and this seems important &#8211; I can&#8217;t think of any others.</p>
<p>Suppose there are about fifty scientific fields approximately as important as genetics or psychiatry or psychology. And suppose within the past century, each of them had room for about five paradigms as important as psychoanalysis or behaviorism or Lysenkoism.</p>
<p>That would mean there are about 250 possibilities for science failure, of which three were actually science failures &#8211; for a failure rate of 1.2%.</p>
<p>This doesn&#8217;t seem much more encouraging for the anti-global-warming cause than the 3% of papers that support them.</p>
<p>I think I&#8217;m being pretty fair here &#8211; after all, Lysenkoism was limited to one extremely-screwed-up country, and people are going to yell that behaviorism wasn&#8217;t as bad as I made it sound. And two of the three failures are in psychology, a social science much fuzzier than climatology where we can expect far more errors. A cynic might say if we include psychology we might as well go all the way and include economics, sociology, and anthropology, raising our error count to over nine thousand.</p>
<p>But if we want to be even fairer, we can admit that there are probably some science failures that haven&#8217;t been detected yet. I can think of three that I very strongly suspect are in that category, although I won&#8217;t tell you what they are so as to not distract from the meta-level debate. That brings us to 2.4%. Admit that maybe I&#8217;ve only caught half of the impending science failures out there, and we get to 3.6%. Still not much of an improvement for the anti-AGW crowd over having 3% of the literature.</p>
<p>Unless of course I am missing a whole load of well-known science failures which you will remind me about in the comments.</p>
<p><b>[Edit: Wow, people are really bad at following criteria 3 and 4, even going so far as to post the exact examples I said not to. Don&#8217;t let that be you.]</b></p>
]]></content:encoded>
			<wfw:commentRss>http://slatestarcodex.com/2014/07/02/how-common-are-science-failures/feed/</wfw:commentRss>
		<slash:comments>362</slash:comments>
		</item>
		<item>
		<title>Utopian Science</title>
		<link>http://slatestarcodex.com/2014/05/01/utopian-science/</link>
		<comments>http://slatestarcodex.com/2014/05/01/utopian-science/#comments</comments>
		<pubDate>Fri, 02 May 2014 01:41:57 +0000</pubDate>
		<dc:creator><![CDATA[Scott Alexander]]></dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[conworlding]]></category>
		<category><![CDATA[science]]></category>

		<guid isPermaLink="false">http://slatestarcodex.com/?p=1947</guid>
		<description><![CDATA[I. Pre-emptive plagiarism is the worst. I was all set to write about how I thought the problems I brought up in The Control Group Is Out Of Control could be addressed. Then Josh Haas wrote A Modest Proposal To &#8230; <a href="http://slatestarcodex.com/2014/05/01/utopian-science/">Continue reading <span class="pjgm-metanav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<p><b>I.</b></p>
<p>Pre-emptive plagiarism is the worst. I was all set to write about how I thought the problems I brought up in <A HREF="http://slatestarcodex.com/2014/04/28/the-control-group-is-out-of-control">The Control Group Is Out Of Control</A> could be addressed. </p>
<p>Then Josh Haas wrote <A HREF="http://blog.joshhaas.com/2014/04/a-modest-proposal-to-fix-science/">A Modest Proposal To Fix Science</A>, which took the words right out of my mouth. Separate out exploratory and confirmatory research, have the latter done by different people with no stake in the matter.</p>
<p>So if I want to wring a blog post out of this I&#8217;m going to have to go way further than that, come up with something <i>really</i> outlandish.</p>
<p>So here is how science works in <A HREF="http://slatestarcodex.com/2013/05/15/index-posts-on-raikoth/">the utopian culture of Raikoth</A>.</p>
<p><b>II.</b></p>
<p>Anyone can do exploratory research. It can be experiments published in special exploratory research journals. Or it can be a collection of anecdotes supporting a theory published in a magazine. Or it can be a list of arguments on a website. The point is to get an idea out there, build interest.</p>
<p>Remember <A HREF="http://slatestarcodex.com/2013/05/06/raikoth-laws-language-and-society/">the Angel of Evidence</A>? The centralized nationwide prediction market? Anyone with a theory can list it there. The goal of exploratory research is to get people interested enough in the idea to bet about it on the Angel.</p>
<p>Suppose you become convinced that eating grapes cures cancer. So you submit a listing to the Angel: &#8220;Eating grapes cures cancer&#8221;. Probably most people doubt this proposition and the odds are around zero. So you do some exploratory research. You conduct a small poorly controlled study of a dozen cancer patients who sign up, and feed them a lot of grapes. You report that all of them seemed to have their cancer go away. You talk about how chemicals in grapes are known tumor inhibitors. Gradually a couple of people start thinking there&#8217;s something to what you&#8217;re saying. They make bets on the prediction market &#8211; maybe saying there&#8217;s only a 10% chance that you&#8217;re right, but it&#8217;s enough. The skeptics, and there are many, gladly bet against them, hoping to part gullible fools from their money. Business on the bet starts to go up.</p>
<p>These research prediction markets are slightly negative-sum. Maybe the loser loses $10, but the winner only gets $9. When enough people have bet on the market, the value of this &#8220;missing money&#8221; becomes considerable. This is the money that funds a confirmatory experiment.</p>
<p>What this means is that it is interest in &#8211; and disagreement about &#8211; the question at hand that makes an experiment possible. When no one believes grapes cure cancer, everyone&#8217;s probability is around zero and so no one bets against anyone else and there is no money available for the experiment. When you do your exploratory research and come up with good arguments why grapes should work, then if you really make your case some people should be willing to bet on it &#8211; not at even odds, maybe, but at least at 10:1 odds favoring them or whatever.</p>
<p>Suppose the experiment returns positive results (what qualifies as &#8220;positive results&#8221; is predefined &#8211; maybe the bet specifies &#8220;effect size > 0.4, p < 0.05&#8221;). Now one of two things happens. Either everyone is entirely convinced that grapes cure cancer, and people stop doing science and start eating more grapes. Or the controversy continues. If the controversy continues, a bet can be placed on the prediction market for the success or failure of a replication. No doubt people will want to take or short this bet at very different odds than they did the last one. Maybe you could get 10:1 odds against grapes curing cancer the first time, when you were going on a tiny exploratory study, but now you can only get even odds. No problem. The pro-grape side bets in favor, the anti-grape side is still willing to bet against, and the replication takes place.</p>
<p>Rinse and repeat. At every step, one of three things is true. First, there is still controversy, in which case the controversy funds more experiments, and the odds at which people will bet on those experiments is the degree of credence we should have in the scientific prediction involved. Second, there isn&#8217;t enough money in favor of the proposition to get a market going, in which case the proposition has been soundly disproven. Third, there isn&#8217;t enough money against the proposition to get a market going, in which case the proposition is universally accepted scientific fact.</p>
<p>In practice things are not this easy. The system is excellent at resolving controversies, and you can easily get as much money as you need to study whether guns decrease crime or whatever. But science includes not just controversies but basic research. Things like particle physics might suffer &#8211; who is going to bet for or against the proposition that the Higgs boson has a mass greater than 140 GeV? Only a couple of physicists even understand the question, and physicists as a group don&#8217;t command large sums of spare capital.</p>
<p>So what happens is that scientific bodies &#8211; the Raikothin equivalent of our National Science Foundation &#8211; subsidize the prediction markets. This is very important. Instead of donating $1 million to CERN to do boson research, they donate $1 million to the Angel of Evidence to make the prediction market more lucrative. Suddenly the market is positive-sum; maybe you lose $10 if you&#8217;re wrong, but gain $11 if you&#8217;re right. The lure of free money is very attractive. Some ordinary people jump in, not really sure what a boson is but knowing that the odds are in their favor. But more important, so do &#8220;science hedge funds&#8221; that hire consultant physicists to maximize their likely return. Just as hedge fundies in the US might do lots of research into copper mining even though they don&#8217;t care about copper at all in order to figure out which mining company is the best buy, so these &#8220;science hedge funds&#8221; would try to figure out what mass the Higgs boson is likely to have, knowing they will win big if they&#8217;re right. Although the National Science Fund type organization funds the experiments <i>indirectly</i>, it is the money of these investors that directly goes to CERN to buy boson-weighing machinery.</p>
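<p>(A toy version of the bookkeeping, since the &#8220;missing money&#8221; is the whole trick. This is my own illustrative Python with invented numbers, not anything canonical about the Angel:)</p>
<pre><code>def settle_market(matched_dollars, vig=0.10, subsidy=0.0):
    """Toy Angel-of-Evidence accounting. Markets are slightly negative-sum:
    lose and you're out $10, win and you only collect $9. The missing
    slice funds the confirmatory experiment; a science-funding body can
    top up the pot to make the market positive-sum for bettors."""
    experiment_fund = matched_dollars * vig
    paid_to_winners = matched_dollars * (1 - vig) + subsidy
    return paid_to_winners, experiment_fund

# An ordinary controversy: $50,000 matched on "grapes cure cancer".
print(settle_market(50_000))                     # (45000.0, 5000.0)

# Basic research: an NSF-equivalent adds $1M, so winners collect more
# than was staked -- the free money that attracts the science hedge funds.
print(settle_market(50_000, subsidy=1_000_000))  # (1045000.0, 5000.0)
</code></pre>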
<p><b>III.</b></p>
<p>So much for funding. How are the actual experiments conducted?</p>
<p>They are conducted by <i>consultant scientists</i>. The number one rule of being a consultant scientist is <i>you do not care about the hypothesis</i>.</p>
<p>The Raikolin would have a lot of reasons to react in horror if someone pointed them to Earth, but one of the bigger ones is that <i>the person who invented a hypothesis is responsible for testing it.</i> Or at least someone in the same field, who has been debating it for years and whose entire career depends upon it. This makes no more sense than asking criminals to judge their own trials, or having a candidate count the votes in their own election.</p>
<p>Having any strong opinion on the issue at hand is <i>immediate disqualification</i> for a consultant scientist to perform a confirmatory experiment.</p>
<p>The consultant scientist is selected by the investors in the prediction market. Corporate governance type laws are used to select a representative from both sides (those who will profit if the theory is debunked, and those who will profit if it is confirmed). Then they will meet together and agree on a consultant. If they cannot agree, sometimes they will each hire their own consultant scientist and perform two independent experiments, with the caveat that a result only counts if the two experiments return the same verdict.</p>
<p>As the consultant plans the experiment, she receives input from both the pro- and the con- investors. Finally, she decides upon an experimental draft and publishes it in a journal.</p>
<p>This publication is a form of pre-registration, but it&#8217;s also more than that. It is the exact published paper that will appear in the journal when the experiment is over, except that all numbers in the results section have been replaced by a question mark, ie &#8220;We compared three different levels of grape-eating and found that the highest level had ? percent less cancer than the lowest, p < ?&#8221;. The only difference between this draft and the real paper is that the real one fills in the numbers and adds a Discussion section. This gives <i>zero</i> degrees of freedom in what tests are done and in how the results are presented.</p>
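<p>(In other words, the paper is a frozen template. A two-line illustration, with the filled-in numbers invented:)</p>
<pre><code># The pre-registered draft is the final paper with holes where results go.
DRAFT = ("We compared three levels of grape-eating and found that the "
         "highest level had {pct} percent less cancer than the lowest, "
         "p < {p}.")

print(DRAFT.format(pct="?", p="?"))   # published before the experiment runs
print(DRAFT.format(pct=23, p=0.04))   # published after -- invented numbers
</code></pre>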
<p>Two things happen after the draft is published.</p>
<p>First, investors get one final chance to sell their bets or bow out of the experiment without losses. Perhaps some investors thought that grapes cured cancer, but now that they see the experimental protocol, they don&#8217;t believe it is good enough to detect this true fact. They bow out. Yes, this decreases the amount of money available for the experiment. That comes out of the consultant scientist&#8217;s salary, giving her an incentive to make as few people bow out as possible.</p>
<p>Second, everyone in the field is asked to give a statement (and make a token bet) on the results. <b>This is the most important part</b>. It means that if you believe grapes cause cancer, and the experiment shows that grapes have no effect, you can&#8217;t come back and say &#8220;Well, OBVIOUSLY this experiment didn&#8217;t detect it, they used overly ripe grapes, that completely negates the anti-tumor effect, this study was totally useless and doesn&#8217;t discredit my theory at all&#8221;. No. When the draft is published, if you think there are flaws in the protocol, you speak then or forever hold your peace. If you are virtuous, you even say something like &#8220;Well, right now I think grapes cure cancer with 90% probability, but if this experiment returns a null result, I guess I&#8217;ll have to lower that to 10%.&#8221;</p>
<p>These statements are made publicly and recorded publicly. If you say an experiment will prove something, and it doesn&#8217;t, and this happens again and again, then <i>people will start noticing you don&#8217;t actually know any science</i>.</p>
<p>(If you&#8217;re always right, you immediately get hired by a science hedge fund at an obscenely high salary.)</p>
<p>Finally, the consultant scientist does her experiment. Some result is obtained. The question marks in the draft are filled in, and it is resubmitted as a published paper. The appropriate people make or lose money. The appropriate scientific experts gain or lose prestige as people who are or aren&#8217;t able to predict natural processes. The appropriate consultant scientists gain or lose prestige as people whose results were or weren&#8217;t replicated. The exploratory scientist who proposed the hypothesis in the first place gains or loses prestige that make people more or less likely to bet money on the next idea she comes up with.</p>
<p><b>IV.</b></p>
<p>There are no homeopaths in Raikoth.</p>
<p>I mean, there <i>were</i>, ages ago. They proposed experiments that could be done to prove homeopathy, and put their money where their mouth was. They lost lots of money when it turned out not to work. They added epicycles, came up with extra conditions that had to be in place before homeopathy would have an effect. Their critics were more than happy to bet the money it took to test those conditions as well, and the critics ended up rich. Eventually, the homeopaths were either broke, or sufficiently mugged by reality that they stopped believing homeopathy worked.</p>
<p>But <A HREF="http://slatestarcodex.com/2014/04/15/the-cowpox-of-doubt/">homeopathy is boring</A>. The real jewel of this system is to be able to go online, access the Angel of Evidence, and see a list of every scientific hypothesis that anyone considers worth testing, along with the probability estimate that each is true. To watch as the ones in the middle gradually, after two or three experiments, end up getting so close to zero or one hundred as makes no difference, and dropping off the &#8220;active&#8221; list. To hear the gnashing of the teeth of people whose predictions have been disconfirmed and who no longer have a leg to stand on.</p>
<p>Also, if you can predict the masses of bosons consistently enough, you get crazy rich. </p>
]]></content:encoded>
			<wfw:commentRss>http://slatestarcodex.com/2014/05/01/utopian-science/feed/</wfw:commentRss>
		<slash:comments>68</slash:comments>
		</item>
		<item>
		<title>The Control Group Is Out Of Control</title>
		<link>http://slatestarcodex.com/2014/04/28/the-control-group-is-out-of-control/</link>
		<comments>http://slatestarcodex.com/2014/04/28/the-control-group-is-out-of-control/#comments</comments>
		<pubDate>Tue, 29 Apr 2014 00:46:27 +0000</pubDate>
		<dc:creator><![CDATA[Scott Alexander]]></dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[long post is long]]></category>
		<category><![CDATA[science]]></category>
		<category><![CDATA[statistics]]></category>
		<category><![CDATA[studies]]></category>

		<guid isPermaLink="false">http://slatestarcodex.com/?p=1921</guid>
		<description><![CDATA[I. Allan Crossman calls parapsychology the control group for science. That is, in let&#8217;s say a drug testing experiment, you give some people the drug and they recover. That doesn&#8217;t tell you much until you give some other people who &#8230; <a href="http://slatestarcodex.com/2014/04/28/the-control-group-is-out-of-control/">Continue reading <span class="pjgm-metanav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<p><b>I.</b></p>
<p>Allan Crossman calls parapsychology <A HREF="http://lesswrong.com/lw/1ib/parapsychology_the_control_group_for_science/">the control group for science</A>.</p>
<p>That is, in let&#8217;s say a drug testing experiment, you give some people the drug and they recover. That doesn&#8217;t tell you much until you give some other people a placebo drug you <i>know</i> doesn&#8217;t work &#8211; but which they themselves believe in &#8211; and see how many of <i>them</i> recover. That number tells you how many people will recover whether the drug works or not. Unless people on your real drug do significantly better than people on the placebo drug, you haven&#8217;t found anything.</p>
<p>On the meta-level, you&#8217;re studying some phenomenon and you get some positive findings. That doesn&#8217;t tell you much until you take some other researchers who are studying a phenomenon you <i>know</i> doesn&#8217;t exist &#8211; but which they themselves believe in &#8211; and see how many of <i>them</i> get positive findings. That number tells you how many studies will discover positive results whether the phenomenon is real or not. Unless studies of the real phenomenon do significantly better than studies of the placebo phenomenon, you haven&#8217;t found anything.</p>
<p>Trying to set up placebo science would be a logistical nightmare. You&#8217;d have to find a phenomenon that definitely doesn&#8217;t exist, somehow convince a whole community of scientists across the world that it does, and fund them to study it for a couple of decades without them figuring out the gig.</p>
<p>Luckily we have a natural experiment in terms of parapsychology &#8211; the study of psychic phenomena &#8211; which most reasonable people don&#8217;t believe exist, but which a community of practicing scientists does believe in and publishes papers on all the time.</p>
<p>The results are pretty dismal. Parapsychologists are able to produce experimental evidence for psychic phenomena about as easily as normal scientists are able to produce such evidence for normal, non-psychic phenomena. This suggests the existence of a very large &#8220;placebo effect&#8221; in science &#8211; ie with enough energy focused on a subject, you can <i>always</i> produce &#8220;experimental evidence&#8221; for it that meets the usual scientific standards. As Eliezer Yudkowsky puts it:<br />
<blockquote>Parapsychologists are constantly protesting that they are playing by all the standard scientific rules, and yet their results are being ignored &#8211; that they are unfairly being held to higher standards than everyone else. I&#8217;m willing to believe that. It just means that the standard statistical methods of science are so weak and flawed as to permit a field of study to sustain itself in the complete absence of any subject matter.</p></blockquote>
<p>These sorts of thoughts have become more common lately in different fields. Psychologists admit to a <A HREF="http://blogs.nature.com/news/2012/11/psychologists-do-some-soul-searching.html">crisis of replication</A> as some of their most interesting findings turn out to be spurious. And in medicine, John Ioannidis and others have been criticizing the research for a decade now and telling everyone they need to up their standards.</p>
<p>&#8220;Up your standards&#8221; has been a complicated demand that cashes out in a lot of technical ways. But there is broad agreement among the most intelligent voices I read (<A HREF="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1182327/">1</A>, <A HREF="http://lesswrong.com/lw/ajj/how_to_fix_science/">2</A>, <A HREF="http://www.newyorker.com/reporting/2010/12/13/101213fa_fact_lehrer">3</A>, <A HREF="http://blogs.plos.org/mindthebrain/author/jcyone/">4</A>, <A HREF="http://www.haas.berkeley.edu/groups/online_marketing/facultyCV/papers/nelson_false-positive.pdf">5</A>) about a couple of promising directions we could go:</p>
<p>1. Demand very large sample size.</p>
<p>2. Demand replication, preferably exact replication, most preferably multiple exact replications.</p>
<p>3. Trust systematic reviews and meta-analyses rather than individual studies. Meta-analyses must prove homogeneity of the studies they analyze.</p>
<p>4. Use Bayesian rather than frequentist analysis, or even combine both techniques.</p>
<p>5. Stricter p-value criteria. It is far too easy to massage p-values to get less than 0.05. Also, make meta-analyses look for &#8220;p-hacking&#8221; by examining the distribution of p-values in the included studies (a toy version of this check is sketched just after this list).</p>
<p>6. Require pre-registration of trials.</p>
<p>7. Address publication bias by searching for unpublished trials, displaying funnel plots, and using statistics like &#8220;fail-safe N&#8221; to investigate the possibility of suppressed research.</p>
<p>8. Do heterogeneity analyses or at least observe and account for differences in the studies you analyze.</p>
<p>9. Demand randomized controlled trials. None of this &#8220;correlated even after we adjust for confounders&#8221; BS.</p>
<p>10. Stricter effect size criteria. It&#8217;s easy to get small effect sizes in <i>anything</i>.</p>
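<p>(Here is that toy p-hacking check from commandment [5] &#8211; my own crude sketch, not the real p-curve methodology. Tally the significant p-values into bins and eyeball the shape: a real effect piles up near zero, while a bump just under 0.05 is the classic fingerprint of massaging.)</p>
<pre><code>def p_curve(p_values, bins=(0.01, 0.02, 0.03, 0.04, 0.05)):
    """Count significant p-values into bins: mass near zero suggests a
    real effect; a pile-up just under .05 suggests p-hacking."""
    counts = {b: 0 for b in bins}
    for p in p_values:
        for b in bins:
            if p <= b:
                counts[b] += 1
                break
    return counts

# Illustrative numbers only:
print(p_curve([0.003, 0.008, 0.012, 0.041, 0.044, 0.046, 0.049]))
# {0.01: 2, 0.02: 1, 0.03: 0, 0.04: 0, 0.05: 4} -- suspicious bump at .05
</code></pre>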
<p>If we follow these ten commandments, then we avoid the problems that allowed parapsychology &#8211; and probably a whole host of other errors we don&#8217;t know about &#8211; to sneak past the scientific gatekeepers.</p>
<p>Well, <A HREF="http://papers.ssrn.com/sol3/Delivery.cfm/SSRN_ID2427865_code1602198.pdf?abstractid=2423692&#038;mirid=1">what now, motherfuckers?</A></p>
<p><b>II.</b></p>
<p>Bem, Tressoldi, Rabeyron, and Duggan (2014), full text available for download at the top bar of the link above, is parapsychology&#8217;s way of saying &#8220;thanks but no thanks&#8221; to the idea of a more rigorous scientific paradigm making them quietly wither away.</p>
<p>You might remember Bem as the prestigious establishment psychologist who decided to try his hand at parapsychology and to his and everyone else&#8217;s surprise got positive results. Everyone had a lot of criticisms, some of which were <A HREF="http://www.talyarkoni.org/blog/2011/01/10/the-psychology-of-parapsychology-or-why-good-researchers-publishing-good-articles-in-good-journals-can-still-get-it-totally-wrong/">very very good</A>, and the study <A HREF="http://news.discovery.com/human/psychology/controversial-esp-study-fails-yet-again-120912.htm">failed replication several times</A>. Case closed, right?</p>
<p>Earlier this month Bem came back with a meta-analysis of ninety replications from tens of thousands of participants in thirty-three laboratories in fourteen countries confirming his original finding, p < 1.2 * 10<sup>-10</sup>, Bayes factor 7.4 * 10<sup>9</sup>, funnel plot beautifully symmetrical, p-hacking curve nice and right-skewed, Orwin fail-safe n of 559, et cetera, et cetera, et cetera.</p>
<p>By my count, Bem follows all of the commandments except [6] and [10]. He apologizes for not using pre-registration, but says it&#8217;s okay because the studies were exact replications of a previous study, which makes it impossible for an unsavory researcher to change the parameters halfway through and so does pretty much the same thing. And he apologizes for the small effect size but points out that some effect sizes are legitimately very small, that this one is no smaller than a lot of other commonly-accepted results, and that a low enough p-value ought to make up for a small effect size.</p>
<p>This is <i>far</i> better than the average meta-analysis. Bem has always been pretty careful and this is no exception.</p>
<p>So &#8211; once again &#8211; what now, motherfuckers?</p>
<p><b>III.</b></p>
<p>In retrospect, that list of ways to fix science above was a little optimistic.</p>
<p>The first eight items (large sample sizes, replications, low p-values, Bayesian statistics, meta-analysis, pre-registration, publication bias, heterogeneity) all try to solve the same problem: accidentally mistaking noise in the data for a signal.</p>
<p>We&#8217;ve placed so much emphasis on not mistaking noise for signal that when someone like Bem hands us a beautiful, perfectly clear signal on a silver platter, it briefly stuns us. &#8220;Wow, of the three hundred different terrible ways to mistake noise for signal, Bem has proven beyond a shadow of a doubt he hasn&#8217;t done any of them.&#8221; And we get so stunned we&#8217;re likely to forget that this is only part of the battle.</p>
<p>Bem definitely picked up a signal. The only question is whether it&#8217;s a signal of psi, or a signal of poor experimental technique.</p>
<p><i>None</i> of these techniques even <i>touch</i> poor experimental technique &#8211; or confounding, or whatever you want to call it. If an experiment is confounded, if it produces a strong signal even when its experimental hypothesis is false, then using a larger sample size will just make that signal even stronger. </p>
<p>Replicating it will just reproduce the confounded results again. </p>
<p>Low p-values will be easy to get if you perform the confounded experiment on a large enough scale.</p>
<p>Meta-analyses of confounded studies will obey the immortal law of &#8220;garbage in, garbage out&#8221;.</p>
<p>Pre-registration only assures that your study will not get any worse than it was the first time you thought of it, which may be very bad indeed.</p>
<p>Searching for publication bias only means you will get <i>all</i> of the confounded studies, instead of just some of them.</p>
<p>Heterogeneity just tells you whether all of the studies were confounded about the same amount. </p>
<p>Bayesian statistics, alone among these first eight, ought to be able to help with this problem. After all, a good Bayesian should be able to say &#8220;Well, I got some impressive results, but my prior for psi is very low, so this raises my belief in psi slightly, but raises my belief that the experiments were confounded <i>a lot</i>.&#8221;</p>
<p>Unfortunately, good Bayesians are hard to come by. People like to mock Less Wrong, saying we&#8217;re amateurs getting all starry-eyed about Bayesian statistics even while real hard-headed researchers who have been experts in them for years understand both their uses and their limitations. Well, maybe that&#8217;s true of some researchers. But the particular ones I see talking about Bayes <i>here</i> could do with reading the Sequences. Here&#8217;s Bem:<br />
<blockquote>An opportunity to calculate an approximate answer to this question emerges from a Bayesian critique of Bem’s (2011) experiments by Wagenmakers, Wetzels, Borsboom, &#038; van der Maas (2011). Although Wagenmakers et al. did not explicitly claim psi to be impossible, they came very close by setting their prior odds at 10^20 against the psi hypothesis. The Bayes Factor for our full database is approximately 10^9 in favor of the psi hypothesis (Table 1), which implies that our meta-analysis should lower their posterior odds against the psi hypothesis to 10^11</p></blockquote>
<p>Let me shame both participants in this debate.</p>
<p>Bem, you are abusing Bayes factor. If Wagenmakers uses your 10^9 Bayes factor to adjust from his prior of 10^-20 to 10^-11, then what happens the next time you come up with another database of studies supporting your hypothesis? We all know you will, because you&#8217;ve amply proven these results weren&#8217;t due to chance, so whatever factor produced these results &#8211; whether real psi or poor experimental technique &#8211; will no doubt keep producing them for the next hundred replication attempts. When those come in, does Wagenmakers have to adjust his probability from 10^-11 to 10^-2? When you get another hundred studies, does he have to go from 10^-2 to 10^7? If so, then by <A HREF="http://lesswrong.com/lw/ii/conservation_of_expected_evidence/">conservation of expected evidence</A> he should just update to 10^+7 right now &#8211; or really to infinity, since you can keep coming up with more studies till the cows come home. But in fact he shouldn&#8217;t do that, because at some point his thought process becomes &#8220;Okay, I already know that studies of this quality can consistently produce positive findings, so either psi is real or studies of this quality aren&#8217;t good enough to disprove it&#8221;. This point should probably happen well before he increases his probability by a factor of 10^9. See <A HREF="http://lesswrong.com/lw/3be/confidence_levels_inside_and_outside_an_argument/">Confidence Levels Inside And Outside An Argument</A> for this argument made in greater detail.</p>
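<p>(Here&#8217;s a toy version of the arithmetic above, in Python. The prior odds and the Bayes factor are the figures quoted; the identical future databases are my hypothetical.)</p>
<pre>
# Posterior odds = prior odds * Bayes factor, so in log10 terms they add.
log10_odds = -20.0        # Wagenmakers et al.'s prior odds on psi
log10_bayes_factor = 9.0  # roughly Bem's reported 10^9 per database

for batch in range(1, 4):  # hypothetical future databases of equal strength
    log10_odds += log10_bayes_factor
    print("after database %d: odds of psi = 10^%+d" % (batch, log10_odds))

# after database 1: odds of psi = 10^-11
# after database 2: odds of psi = 10^-2
# after database 3: odds of psi = 10^+7
# If you already expect every future database to land the same way, you
# should be at the endpoint now -- which is the tell that this Bayes factor
# ignores the hypothesis "the experiments are confounded".
</pre>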
<p>Wagenmakers, you are overconfident. Suppose God came down from Heaven and said in a booming voice &#8220;EVERY SINGLE STUDY IN THIS META-ANALYSIS WAS CONDUCTED PERFECTLY WITHOUT FLAWS OR BIAS, AS WAS THE META-ANALYSIS ITSELF.&#8221; You would see a p-value of less than 1.2 * 10^-10 and think &#8220;I bet that was just coincidence&#8221;? And then they could do another study of the same size, also God-certified, returning exactly the same results, and you would say &#8220;I bet that was just coincidence too&#8221;? YOU ARE NOT THAT CERTAIN OF ANYTHING. Seriously, <i>read the @#!$ing Sequences</i>.</p>
<p>Bayesian statistics, at least the way they are done here, aren&#8217;t going to be of much use to anybody.</p>
<p>That leaves randomized controlled trials and effect sizes.</p>
<p>Randomized controlled trials are great. They eliminate most possible confounders in one fell swoop, and are excellent at keeping experimenters honest. Unfortunately, most of the studies in the Bem meta-analysis were already randomized controlled trials.</p>
<p>High effect sizes are really the only thing the Bem study lacks. And it is very hard for experimental technique to be so bad that it consistently produces a result with a high effect size.</p>
<p>But as Bem points out, demanding high effect size limits our ability to detect real but low-effect phenomena. Just to give an example, many physics experiments &#8211; like the ones that detected the Higgs boson or neutrinos &#8211; rely on detecting extremely small perturbations in the natural order, over millions of different trials. Less esoterically, Bem mentions the example of aspirin decreasing heart attack risk, which it definitely does and which is very important, but which has an effect size lower than that of his psi results. If humans have some kind of <i>very weak</i> psionic faculty that under regular conditions operates poorly and inconsistently, but does indeed exist, then excluding it by definition from the realm of things science can discover would be a bad idea.</p>
<p>All of these techniques are about reducing the chance of confusing noise for signal. But when we think of them as the be-all and end-all of scientific legitimacy, we end up in awkward situations where they come out super-confident in a study&#8217;s accuracy simply because its problem was one they weren&#8217;t geared up to detect. Because a lot of the time the problem is something more than just noise.</p>
<p><b>IV.</b></p>
<p>Wiseman &#038; Schlitz&#8217;s <A HREF="http://www.richardwiseman.com/resources/staring1.pdf">Experimenter Effects And The Remote Detection Of Staring</A> is my favorite parapsychology paper ever and sends me into fits of nervous laughter every time I read it.</p>
<p>The backstory: there is a classic parapsychological experiment where a subject is placed in a room alone, hooked up to a video link. At random times, an experimenter stares at them menacingly through the video link. The hypothesis is that this causes their galvanic skin response (a physiological measure of subconscious anxiety) to increase, even though there is no non-psychic way the subject could know whether the experimenter was staring or not. </p>
<p>Schlitz is a psi believer whose staring experiments had consistently supported the presence of a psychic phenomenon. Wiseman, in accordance with <A HREF="http://en.wikipedia.org/wiki/Nominative_determinism">nominative determinism</A>, is a psi skeptic whose staring experiments kept showing nothing and disproving psi. Since they were apparently the only two people in all of parapsychology with a smidgen of curiosity or rationalist virtue, they decided to team up and figure out why they kept getting such different results.</p>
<p>The idea was to plan an experiment together, with both of them agreeing on every single tiny detail. They would then go to a laboratory and set it up, again both keeping close eyes on one another. Finally, they would conduct the experiment in a series of different batches. Half the batches (randomly assigned) would be conducted by Dr. Schlitz, the other half by Dr. Wiseman. Because the two authors had very carefully standardized the setting, apparatus and procedure beforehand, &#8220;conducted by&#8221; pretty much just meant greeting the participants, giving the experimental instructions, and doing the staring.</p>
<p>The results? Schlitz&#8217;s trials found strong evidence of psychic powers, Wiseman&#8217;s trials found no evidence whatsoever.</p>
<p>Take a second to reflect on how this <i>makes no sense</i>. Two experimenters in the same laboratory, using the same apparatus, having no contact with the subjects except to introduce themselves and flip a few switches &#8211; and whether one or the other was there that day completely altered the result. For a good time, watch the gymnastics they have to do in the paper to make this sound sufficiently sensical to even get published. This is the only journal article I&#8217;ve ever read where, in the part of the Discussion section where you&#8217;re supposed to propose possible reasons for your findings, both authors suggest maybe their co-author hacked into the computer and altered the results.</p>
<p>While it&#8217;s nice to see people exploring Bem&#8217;s findings further, <i>this</i> is the experiment people should be replicating ninety times. I expect <i>something</i> would turn up. </p>
<p>As it is, Kennedy and Taddonio <A HREF="http://jeksite.org/psi/jp76.pdf">list ten similar studies</A> with similar results. One cannot help wondering about publication bias (if the skeptic and the believer got similar results, who cares?). But the phenomenon is sufficiently well known in parapsychology that it has led to its own host of theories about how skeptics emit negative auras, or the enthusiasm of a proponent is a necessary kindling for psychic powers.</p>
<p>Other fields don&#8217;t have this excuse. In psychotherapy, for example, practically the only consistent finding is that whatever kind of psychotherapy the person running the study likes is most effective. Thirty different meta-analyses on the subject have confirmed this with strong effect size (d = 0.54) and good significance (p = .001).</p>
<p>Then there&#8217;s <A HREF="http://criticalscience.com/researcher-allegiance-psychotherapy-research-bias.html">Munder (2013)</A>, which is a meta-meta-analysis on whether meta-analyses of confounding by researcher allegiance effect were themselves meta-confounded by meta-researcher allegiance effect. He found that indeed, meta-researchers who believed in researcher allegiance effect were more likely to turn up positive results in their studies of researcher allegiance effect (p < .002).</p>
<p>It gets worse. There&#8217;s <A HREF="http://www.npr.org/blogs/health/2012/09/18/161159263/teachers-expectations-can-influence-how-students-perform">a famous story</A> about an experiment where a scientist told teachers that his advanced psychometric methods had predicted a couple of kids in their class were about to become geniuses (the students were actually chosen at random). He followed the students for the year and found that their intelligence actually increased. This was supposed to be a Cautionary Tale About How Teachers&#8217; Preconceptions Can Affect Children.</p>
<p>Less famous is that the same guy did the same thing with rats. He sent one laboratory a box of rats saying they were specially bred to be ultra-intelligent, and another lab a box of (identical) rats saying they were specially bred to be slow and dumb. Then he had them do standard rat learning tasks, and sure enough the first lab found very impressive results, the second lab very disappointing ones.</p>
<p>This scientist &#8211; let&#8217;s give his name, Robert Rosenthal &#8211; <A HREF="http://www.lscp.net/persons/dupoux/teaching/JOURNEE_AUTOMNE_CogMaster_2011-12/readings_deontology/Rosenthal_1994_interpersonal_expectancy_effects_a_review.pdf">then investigated three hundred forty five different studies</A> for evidence of the same phenomenon. He found effect sizes of anywhere from 0.15 to 1.7, depending on the type of experiment involved. Note that this could also be phrased as &#8220;between twice as strong and twenty times as strong as Bem&#8217;s psi effect&#8221;. Mysteriously, animal learning experiments displayed the highest effect size, supporting the folk belief that animals are hypersensitive to subtle emotional cues.</p>
<p>Okay, fine. Subtle emotional cues. That&#8217;s way more scientific than saying &#8220;negative auras&#8221;. But the question remains &#8211; what went wrong for Schlitz and Wiseman? Even if Schlitz had done everything short of saying &#8220;The hypothesis of this experiment is for your skin response to increase when you are being stared at, please increase your skin response at that time,&#8221; and subjects had tried to comply, the whole point was that they didn&#8217;t <i>know</i> when they were being stared at, because to find that out you&#8217;d have to be psychic. And how are these rats figuring out what the experimenters&#8217; subtle emotional cues mean anyway? <i>I</i> can&#8217;t figure out people&#8217;s subtle emotional cues half the time!</p>
<p>I know that standard practice here is to tell <A HREF="http://en.wikipedia.org/wiki/Clever_Hans">the story of Clever Hans</A> and then say That Is Why We Do Double-Blind Studies. But first of all, I&#8217;m pretty sure no one does double-blind studies with rats. Second of all, I think most social psych studies aren&#8217;t double blind &#8211; I just checked the first one I thought of, Steele and Aronson on stereotype threat, and it certainly wasn&#8217;t. Third of all, this effect seems to be just as common in cases where it&#8217;s hard to imagine how the researchers&#8217; subtle emotional cues could make a difference. Like Schlitz and Wiseman. Or like the psychotherapy experiments, where most of the subjects were doing therapy with individual psychologists and never even saw whatever prestigious professor was running the study behind the scenes.</p>
<p>I think it&#8217;s a combination of subconscious emotional cues, subconscious statistical trickery, perfectly conscious fraud which for all we know happens much more often than it is detected, and things we haven&#8217;t discovered yet which are at least as weird as subconscious emotional cues. But rather than speculate, I prefer to take it as a brute fact. Studies are going to be confounded by the allegiance of the researcher. When researchers who don&#8217;t believe something discover it, that&#8217;s when it&#8217;s worth looking into.</p>
<p><b>V.</b></p>
<p>So what exactly happened to Bem?</p>
<p>Although Bem looked hard to find unpublished material, I don&#8217;t know if he succeeded. Unpublished material, in this context, has to mean &#8220;material published enough for Bem to find it&#8221;, which in this case was mostly things presented at conferences. What about results so boring that they were never even mentioned?</p>
<p>And I predict people who believe in parapsychology are more likely to conduct parapsychology experiments than skeptics. Suppose this is true. And further suppose that for some reason, experimenter effect is real and powerful. That means most of the experiments conducted will support Bem&#8217;s result. But this is still a weird form of &#8220;publication bias&#8221; insofar as it ignores the contrary results of hypothetical experiments that were never conducted.</p>
<p>And worst of all, maybe Bem really did do an excellent job of finding every little two-bit experiment that no journal would take. How much can we trust these non-peer-reviewed procedures?</p>
<p>I looked through his list of ninety studies for all the ones that were both exact replications and had been peer-reviewed (with one caveat to be mentioned later). I found only seven:</p>
<p>Batthyany, Kranz, and Erber: 0.268<br />
Ritchie 1: 0.015<br />
Ritchie 2: -0.219<br />
Ritchie 3: -0.040<br />
Subbotsky 1: 0.279<br />
Subbotsky 2: 0.292<br />
Subbotsky 3: -0.399</p>
<p>Three find large positive effects, two find approximately zero effects, and two find large negative effects. Without doing any calculatin&#8217;, this seems pretty darned close to chance to me.</p>
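<p>(For anyone who does want the calculatin&#8217;: an unweighted average of those seven numbers &#8211; a simplification, since I&#8217;m ignoring the studies&#8217; sample sizes &#8211; comes out at almost exactly zero.)</p>
<pre>
from statistics import mean, stdev
from math import sqrt

# The seven peer-reviewed exact-replication effect sizes listed above.
effects = [0.268, 0.015, -0.219, -0.040, 0.279, 0.292, -0.399]

m, sd, n = mean(effects), stdev(effects), len(effects)
t = m / (sd / sqrt(n))  # crude unweighted one-sample t against zero

print("mean effect = %.3f" % m)     # 0.028
print("t(%d) = %.2f" % (n - 1, t))  # 0.27 -- nowhere near significance
</pre>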
<p>Okay, back to that caveat about replications. One of Bem&#8217;s strongest points was how many of the studies included were exact replications of his work. This is important because if you do your own novel experiment, it leaves a lot of wiggle room to keep changing the parameters and statistics a bunch of times until you get the effect you want. This is why lots of people want experiments to be preregistered with specific commitments about what you&#8217;re going to test and how you&#8217;re going to do it. These experiments weren&#8217;t preregistered, but conforming to a previously done experiment is a pretty good alternative.</p>
<p>Except that I think the criteria for &#8220;replication&#8221; here were exceptionally loose. For example, Savva et al was listed as an &#8220;exact replication&#8221; of Bem, but it was performed in 2004 &#8211; seven years before Bem&#8217;s original study took place. I know Bem believes in precognition, but that&#8217;s going <i>too far</i>. As far as I can tell &#8220;exact replication&#8221; here means &#8220;kinda similar psionic-y thing&#8221;. Also, Bem classily lists his own experiments as exact replications of themselves, which gives a big boost to the &#8220;exact replications return the same results as Bem&#8217;s original studies&#8221; line. I would want to see much stricter criteria for replication before I relax the &#8220;preregister your trials&#8221; requirement.</p>
<p>(Richard Wiseman &#8211; the same guy who provided the negative aura for the Wiseman and Schlitz experiment &#8211; has started <A HREF="http://www.richardwiseman.com/BemReplications.shtml">a pre-register site for Bem replications</A>. He says he has received five of them. This is very promising. There is also <A HREF="http://www.koestler-parapsychology.psy.ed.ac.uk/TrialRegistry.html">a separate pre-register for parapsychology trials in general</A>. I am both extremely pleased at this victory for good science, and ashamed that my own field is apparently behind parapsychology in the &#8220;scientific rigor&#8221; department.)</p>
<p>That is my best guess at what happened here &#8211; a bunch of poor-quality, peer-unreviewed studies that weren&#8217;t as exact replications as we would like to believe, all subject to mysterious experimenter effects.</p>
<p>This is not a criticism of Bem or a criticism of parapsychology. It&#8217;s something that is inherent to the practice of meta-analysis, and even more, inherent to the practice of science. Other than a few very exceptional large medical trials, there is not a study in the world that would survive the level of criticism I am throwing at Bem right now.</p>
<p>I think Bem is wrong. But the level of criticism it would take to prove a wrong study wrong is higher than almost any existing study can withstand. That is not encouraging for existing studies.</p>
<p><b>VI.</b></p>
<p>The motto of the Royal Society &#8211; Hooke, Boyle, Newton, some of the people who arguably invented modern science &#8211; was <i>nullius in verba</i>, &#8220;take no one&#8217;s word&#8221;.</p>
<p>This was a proper battle cry for seventeenth century scientists. Think about the (admittedly kind of mythologized) history of Science. The scholastics saying that matter was this, or that, and justifying themselves by long treatises about how based on A, B, C, the word of the Bible, Aristotle, self-evident first principles, and the Great Chain of Being all clearly proved their point. Then other scholastics would write different long treatises on how D, E, and F, Plato, St. Augustine, and the proper ordering of angels all indicated that clearly matter was something different. Both groups were pretty sure that the other had made a subtle error of reasoning somewhere, and both groups were perfectly happy to spend centuries debating exactly which one of them it was.</p>
<p>And then Galileo said &#8220;Wait a second, instead of debating exactly how objects fall, let&#8217;s just drop objects off of something really tall and see what happens&#8221;, and after that, Science.</p>
<p>Yes, it&#8217;s kind of mythologized. But like all myths, it contains a core of truth. People are terrible. If you let people debate things, they will do it forever, come up with horrible ideas, get them entrenched, play politics with them, and finally reach the point where they&#8217;re coming up with theories why people who disagree with them are probably secretly in the pay of the Devil. </p>
<p>Imagine having to conduct the global warming debate, except that you couldn&#8217;t appeal to scientific consensus and statistics because scientific consensus and statistics hadn&#8217;t been invented yet. In a world without science, <i>everything</i> would be like that.</p>
<p>Heck, just look at <i>philosophy</i>.</p>
<p>This is the principle behind the Pyramid of Scientific Evidence. The lowest level is your personal opinions, no matter how ironclad you think the logic behind them is. Just above that is expert opinion, because no matter how expert someone is they&#8217;re still only human. Above that is anecdotal evidence and case studies, because even though you&#8217;re finally getting out of people&#8217;s heads, it&#8217;s still possible for the content of people&#8217;s heads to influence which cases they pay attention to. At each level, we distill away more and more of the human element, until presumably at the top the dross of humanity has been purged away entirely and we end up with pure unadulterated reality.</p>
<p><center><IMG SRC="http://slatestarcodex.com/blog_images/se_pyramid.png"></p>
<p><i>The Pyramid of Scientific Evidence</i></center></p>
<p>And for a while this went <i>well</i>. People would drop things off towers, or see how quickly gases expanded, or observe chimpanzees, or whatever.</p>
<p>Then things started getting more complicated. People started investigating more subtle effects, or effects that shifted with the observer. The scientific community became bigger, everyone didn&#8217;t know everyone anymore, you needed more journals to find out what other people had done. Statistics became more complicated, allowing the study of noisier data but also bringing more peril. And a lot of science done by smart and honest people ended up being wrong, and we needed to figure out exactly which science that was.</p>
<p>And the result is a lot of essays like this one, where people who think they&#8217;re smart take one side of a scientific &#8220;controversy&#8221; and say which studies you should believe. And then other people take the other side and tell you why you should believe different studies than the first person thought you should believe. And there is much argument and many insults and citing of authorities and interminable debate for, if not centuries, at least a pretty long time.</p>
<p>The highest level of the Pyramid of Scientific Evidence is meta-analysis. But a lot of meta-analyses are crap. This meta-analysis got p < 1.2 * 10^-10 for a conclusion I'm pretty sure is false, and <i>it isn&#8217;t even one of the crap ones</i>. Crap meta-analyses look <A HREF="http://www.psychologytoday.com/blog/the-skeptical-sleuth/201112/editor-should-have-caught-bias-and-flaws-in-review-mental-health-ef">more like this</A>, or even worse. </p>
<p>How do I know it&#8217;s crap? Well, I use my personal judgment. How do I know my personal judgment is right? Well, a smart well-credentialed person like James Coyne agrees with me. How do I know James Coyne is smart? I can think of lots of cases where he&#8217;s been right before. How do I know those count? Well, John Ioannidis has published a lot of studies analyzing the problems with science, and confirmed that cases like the ones Coyne talks about are pretty common. Why can I believe Ioannidis&#8217; studies? Well, there have been good meta-analyses of them. But how do I know if those meta-analyses are crap or not? Well&#8230;</p>
<p><center><IMG SRC="http://slatestarcodex.com/blog_images/se_ouroboros.png"></p>
<p><i>The Ouroboros of Scientific Evidence</i></center></p>
<p>Science! YOU WERE THE CHOSEN ONE! It was said that you would destroy reliance on biased experts, not join them! Bring balance to epistemology, not leave it in darkness! </p>
<p><center><IMG SRC="http://slatestarcodex.com/blog_images/se_obiwan.png"></p>
<p><i>I LOVED YOU!!!!</i></center></p>
<p><b>Edit:</b> <A HREF="http://andrewgelman.com/2013/08/25/a-new-bem-theory/">Conspiracy theory</A> by Andrew Gelman</p>
]]></content:encoded>
			<wfw:commentRss>http://slatestarcodex.com/2014/04/28/the-control-group-is-out-of-control/feed/</wfw:commentRss>
		<slash:comments>191</slash:comments>
		</item>
		<item>
		<title>Based on your findings, which theory about alien thickness seems most valid or most accurate?</title>
		<link>http://slatestarcodex.com/2014/02/02/based-on-your-findings-which-theory-about-alien-thickness-seems-most-valid-or-most-accurate/</link>
		<comments>http://slatestarcodex.com/2014/02/02/based-on-your-findings-which-theory-about-alien-thickness-seems-most-valid-or-most-accurate/#comments</comments>
		<pubDate>Sun, 02 Feb 2014 20:35:01 +0000</pubDate>
		<dc:creator><![CDATA[Scott Alexander]]></dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[education]]></category>
		<category><![CDATA[science]]></category>

		<guid isPermaLink="false">http://slatestarcodex.com/?p=1420</guid>
		<description><![CDATA[Seventh-grade science students with flexible ethics: you&#8217;ve come to the right place! Every so often I look at the search terms that led people to this blog. Most of them are what you would expect, but one of the top &#8230; <a href="http://slatestarcodex.com/2014/02/02/based-on-your-findings-which-theory-about-alien-thickness-seems-most-valid-or-most-accurate/">Continue reading <span class="pjgm-metanav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<p>Seventh-grade science students with flexible ethics: you&#8217;ve come to the right place!</p>
<p>Every so often I look at the search terms that led people to this blog. Most of them are what you would expect, but one of the top search terms, one that keeps showing up again and again and again, is &#8220;based on your findings, which theory about alien thickness seems most valid or most accurate?&#8221;</p>
<p>I feel like at some point I must have mentioned the words &#8220;aliens&#8221;, &#8220;theory&#8221;, and &#8220;thickness&#8221; together by accident, and that started it off. Then at one point I commented on it, just wrote a paragraph on how weird it was that I kept getting these alien thickness people, and since that caused the entire phrase to be on my blog in one piece, it opened the floodgates and now I can&#8217;t stop getting curious alien thickness theorists.</p>
<p>So today I finally decided to figure out what was going on, and Google led me to a 7th grade science class at Madisonville Junior High School, Louisiana. As best I can tell, this class&#8217;s teacher gives her students a <A HREF="http://teacherweb.com/LA/MadisonvilleJuniorHighSchool/Blankenship/application-questions.docx">homework assignment</A> that includes various questions on genetics and ecology, most of which make sense.</p>
<p><b>[EDIT: or it may be a national/statewide curriculum, which only that teacher has put online. That would explain the large number of search terms better than a single class would]</b></p>
<p>But on question 25, it suddenly jumps to a question about alien thickness which is completely inexplicable by the terms of everything that has come before:<br />
<blockquote>25. Based on your findings, which theory about alien thickness seems most valid or most accurate?<br />
A. Alien thickness is mainly affected by sunlight. The thickness may be used as a way to shield out the sunlight.<br />
B. Alien thickness is mainly affected by temperature. This may be because aliens become dehydrated (lose water) at higher temperatures and become thinner.<br />
C. Alien thickness is mainly affected by temperature. The greater thickness at lower temperatures may be used as a way to stay warmer in colder weather.<br />
D. Alien thickness seems to change at random. It does not seem to be affected</p></blockquote>
<p>Although there are a few other questions that deal with a changing population of thick and thin aliens, none of them give any information on sunlight, temperature, or hydration status. So I&#8217;m not surprised that what, by the numbers, has to be <i>every single student in her class</i> decides to Google the question to try to find the answer online.</p>
<p>I looked around myself and eventually found <A HREF="http://www.explorelearning.com/index.cfm?method=cResource.dspView&#038;ResourceID=397">this web gizmo</A>, where you adjust water, temperature, and sunlight to a group of little aliens and it tells you about their changing phenotypes. This has <i>got</i> to be the source of the test question, but it&#8217;s not mentioned on the test and judging by student confusion they don&#8217;t know about it. Maybe it&#8217;s mentioned in class one day and then it ends up on a worksheet a month later and nobody remembers it anymore? In any case, dozens of students keep entering it into Google and ending up at my blog.</p>
<p>And this is bad because the blog entries where I mention how weird it is that I&#8217;m getting all these alien thickness people <i>also</i> mention lots of things that are really really inappropriate for seventh graders, yet likely seventh-graders are finding them through alien-thickness-motivated confusion. So my new plan is to direct them all here, to answer their question, and let them go on their way and maybe finish their homework a little quicker. I don&#8217;t feel bad about this since any question that gets an entire class trying to cheat <i>en masse</i> must be pretty flawed and since <A HREF="http://www.livescience.com/19379-homework-bad-kids.html">homework is</A> <A HREF="http://www.alfiekohn.org/books/hm.htm">mostly bad</A> <A HREF="http://www.thecaseagainsthomework.com/">for kids anyway</A>. </p>
<p>So, seventh-graders! Still with me? Let&#8217;s talk alien thickness!</p>
<p>Start by going to <A HREF="http://www.explorelearning.com/index.cfm?method=cResource.dspView&#038;ResourceID=397">the gizmo</A> and adjust the different sliders from lowest to highest one-at-a-time, while watching the bar graph measuring alien thickness. You will notice that adjusting the water slider from highest to lowest doesn&#8217;t change thickness. Likewise, adjusting the sunlight slider from highest to lowest doesn&#8217;t change thickness. But adjusting the temperature slider while holding the other two constant <i>does</i> change thickness. So we conclude that thickness probably depends on temperature.</p>
<p>So now we can eliminate all the answers except B and C, the ones that say that alien thickness is affected by temperature. How do we distinguish between these two?</p>
<p>Well, B says that temperature only affects aliens indirectly, through its effect on dehydration. But if that were true, we would expect preventing the aliens from getting dehydrated to remove the effect of temperature. But this doesn&#8217;t happen &#8211; no matter how high the water slider is, moving the temperature slider still causes the aliens to shift from thick to thin. So the effect of temperature doesn&#8217;t depend on hydration.</p>
<p>Armed with this knowledge it should be pretty simple to pick the correct answer through process of elimination.</p>
<p>Let&#8217;s move on to question 26:<br />
<blockquote>26. Based on the data you found, about how many of the 100 aliens would become thin if the temperature were 35°C?<br />
A. fewer than 10<br />
B. about 50<br />
C. about 80<br />
D. more than 90  </p></blockquote>
<p>You notice that at temperature 20 degrees, about fifty aliens are thin. At 25, about seventy aliens are thin. And at 30, about eighty-eight aliens are thin. The take-home point is that the higher the temperature, the more aliens we expect to be thin. So at 35 degrees, we would expect more aliens to be thin than the eighty-eight who are thin at 30 degrees. Which option best reflects that expectation?</p>
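<p>(Or, if you&#8217;d rather make the computer do the extrapolating, here&#8217;s a sketch that fits a straight line to those three readings. The assumption that the trend is linear is mine, not the gizmo&#8217;s.)</p>
<pre>
# Fit a line to the (temperature, thin aliens) readings quoted above,
# then extrapolate to 35 degrees. Linearity is an assumption.
temps, thin = [20, 25, 30], [50, 70, 88]

n = len(temps)
mx, my = sum(temps) / n, sum(thin) / n
slope = sum((x - mx) * (y - my) for x, y in zip(temps, thin)) \
        / sum((x - mx) ** 2 for x in temps)
intercept = my - slope * mx

predicted = slope * 35 + intercept  # about 107, i.e. capped at all 100
print(min(round(predicted), 100))   # 100 aliens thin: "more than 90"
</pre>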
<p>So there&#8217;s your answer. But there&#8217;s a more important meta-point here. Your teacher wouldn&#8217;t include a nonsensical question on the worksheet, so clearly in 26 she expects you to be able to calculate alien thickness based on temperature. So just by reading 26, you know the answer to 25 is one that says thickness is based on temperature. So you can eliminate A and D and be left with a 50% chance of getting it right. And without looking at the original data, you can conclude that it&#8217;s probably not B, since 26 implies temperature alone is enough to predict thickness, while B&#8217;s explanation implies that water and hydration status matter as well. So really, even if your teacher forgot to link you to the gizmo thing, you should be able to guess the right answer based on test-taking skills alone.</p>
<p>A story from my own life &#8211; my first month of medical residency, my schedule was extremely disorganized and I ended up starting a class the day they were having their final exam. This exam happened to be on the treatment of radioactivity-related injuries, a field of medicine I was unaware existed until that moment. Because of inconsistent answers, clues in other questions, and basic common sense, I was able to guess well and ended up getting a B- (the class average was a C). </p>
<p>My point is, test-taking skills matter.</p>
<p>I haven&#8217;t gotten any Google search queries asking about any of the other questions, so I&#8217;m going to assume you&#8217;ve got all of those down. Good job, seventh-grade science students!</p>
]]></content:encoded>
			<wfw:commentRss>http://slatestarcodex.com/2014/02/02/based-on-your-findings-which-theory-about-alien-thickness-seems-most-valid-or-most-accurate/feed/</wfw:commentRss>
		<slash:comments>56</slash:comments>
		</item>
		<item>
		<title>Science &amp; Medicine Links for August</title>
		<link>http://slatestarcodex.com/2013/08/10/science-medicine-links-for-august/</link>
		<comments>http://slatestarcodex.com/2013/08/10/science-medicine-links-for-august/#comments</comments>
		<pubDate>Sat, 10 Aug 2013 12:07:37 +0000</pubDate>
		<dc:creator><![CDATA[Scott Alexander]]></dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[medicine]]></category>
		<category><![CDATA[science]]></category>
		<category><![CDATA[studies]]></category>

		<guid isPermaLink="false">http://slatestarcodex.com/?p=902</guid>
		<description><![CDATA[Case report from the BMJ that would also make a good Twilight Zone episode: Woman hallucinates ghost children. Husband takes pictures of scene to try to prove that there&#8217;s nobody there. Woman sees exact same hallucinations in the photographs. Woman &#8230; <a href="http://slatestarcodex.com/2013/08/10/science-medicine-links-for-august/">Continue reading <span class="pjgm-metanav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<p>Case report from the BMJ that would also make a good Twilight Zone episode: Woman hallucinates ghost children. Husband takes pictures of scene to try to prove that there&#8217;s nobody there. Woman <A HREF="http://mindhacks.com/2013/05/29/photographing-hallucinations/">sees exact same hallucinations in the photographs</A>. Woman takes some psychiatric drugs, mostly stops having hallucinations, but still sees the hallucinatory ghost children in the (empty to everyone else) old photos. Psychiatry is <i>weird</i>, and/or possibly haunted.</p>
<p>A very strange but creative methodology by which to study the notoriously complicated field of diet: scientists find that <A HREF="http://lesswrong.com/r/discussion/lw/hoh/weak_evidence_that_eating_vegetables_makes_you/">a gene that makes vegetables taste better also makes you live longer</A>. Weak evidence suggesting that eating more vegetables makes you live longer? Maybe!</p>
<p><A HREF="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3222234/">A Critical Review of the First Ten Years of Candidate Gene by Environment Interaction Research in Psychiatry</A>. Key phrase: &#8220;Ninety-six percent of novel cG×E studies were significant compared with 27% of replication attempts.&#8221; Note that gene x environment interaction studies are a very particular kind of study that is especially easy to do bad work on and this might not generalize to other types of genetics research &#8211; but that at least to some degree it probably does.</p>
<p>A while ago there was great excitement at the discovery that the drug rapamycin extended lifespan in mice. Although this finding has since been replicated and seems broadly correct, the bad news is that it <A HREF="http://www.jci.org/articles/view/67674">is now clear</A> that the drug <A HREF="http://www.sciencedaily.com/releases/2013/07/130725141715.htm">just treats some very specific deadly pathologies</A> (like cancer) and does not fight or slow aging. Although if the bad news is that a drug cures cancer, we&#8217;re still doing pretty well.</p>
<p>But if you absolutely must have some miracle substance that might cure aging in lab animals to be excited about, you&#8217;ll be happy to know that <A HREF="http://www.sci-news.com/biology/article01170-rhodiola-extract-lifespan-drosophila.html">rhodiola extends the lifespans of fruit flies 24% and delays age-related loss in physical performance</A>. Also it <A HREF="http://en.wikipedia.org/wiki/Rhodiola_rosea#Scientific_evidence">might be a nootropic or antidepressant or something</A>.</p>
<p>Speaking of miracles, <A HREF="http://psychcentral.com/news/2013/06/19/skin-abnormality-may-prove-biological-basis-for-fibromyalgia/56233.html">Skin Abnormality May Prove Biological Basis For Fibromyalgia</A>. I predict this will probably turn out to be nothing, the same way everyone was super excited a few years ago that we&#8217;d discovered that the <i>real</i> cause of multiple sclerosis was venous outflow obstruction and then it didn&#8217;t replicate, but until then at least fibromyalgia sufferers will get a few good years in of &#8220;SEE! I TOLD YOU IT WAS BIOLOGICAL AND YOU DIDN&#8217;T BELIEVE ME!&#8221;</p>
<p>Not technically a study but a good thing to include here: <A HREF="http://pipeline.corante.com/archives/2013/06/21/eight_toxic_foods_a_little_chemical_education.php">Eight Toxic Foods and a Little Chemical Education</A>. Describes some of the scare claims the media sometimes makes about chemicals and health risks and dissects them carefully and rigorously.</p>
<p>And if that was too basic for you, here&#8217;s the Epic-level version of the same thing: <A HREF="http://thelastpsychiatrist.com/2010/04/deconstructing_a_promotional_s.html">the Last Psychiatrist dissects claims made in a presentation on the drug Geodon</A>. This is old, but I just found it and it terrifies me, in that I thought I knew what to look for and yet this study would have completely passed all the filters I usually have to protect myself from this sort of thing. A good example of how a drug company can run a seemingly rigorous study that stays far away from anything even resembling data falsification or cover-ups &#8211; and yet still get exactly the results they want.</p>
<p>Here&#8217;s Scientific American giving a good exposition of <A HREF="http://www.scribd.com/doc/155870078/Tononi-New-Hypothesis-Explains-Why-We-Sleep-Scientific-American">one of the best current theories about why we sleep</A>. Also, it apparently has evidence behind it now, which it didn&#8217;t the last time I heard about it. Still doesn&#8217;t really explain <A HREF="http://www.overcomingbias.com/2012/10/sleep-is-to-save-energy.html">why some people can go without sleep completely</A>, but maybe that&#8217;s why they brought in the &#8220;local sleep&#8221; points.</p>
<p>Back when people realized it was easy to get positive results from a drug for spurious reasons, they started adding a control group to the experiment. Now that people have realized it&#8217;s easy to get positive results from a controlled trial for spurious reasons, is it time to go one meta-level up and add a control experiment on to the study? One group takes an experiment used to &#8220;prove&#8221; that SSRIs cause gastric bleeding, compares it to dozens of &#8220;control experiments&#8221; run with drugs that don&#8217;t cause gastric bleeding, and finds that, although the real experiment reported positive results, it <A HREF="http://onlinelibrary.wiley.com/doi/10.1002/sim.5925/full">in fact did no better than the placebo experiments</A>. This is <i>really</i> clever although probably impractical in most cases.</p>
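<p>(A minimal sketch of the idea, with entirely made-up numbers: run the same analysis pipeline on &#8220;control&#8221; drugs known not to cause gastric bleeding, then ask where the real drug&#8217;s estimate falls in that distribution.)</p>
<pre>
# Toy "control experiments" check. All numbers are hypothetical.
control_estimates = [0.8, 1.1, 1.3, 0.9, 1.2, 1.4, 1.0, 1.5, 1.1, 1.3]
real_estimate = 1.25  # effect estimate from the actual SSRI analysis

# Empirical p-value: how often the pipeline produces something at least
# this extreme for drugs with no real effect.
as_extreme = sum(c >= real_estimate for c in control_estimates)
p_empirical = (as_extreme + 1) / (len(control_estimates) + 1)
print(p_empirical)  # ~0.45: the "positive" result looks like pipeline noise
</pre>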
<p>Psychotherapy over the Internet works at least as well and probably better than face-to-face psychotherapy, <A HREF="http://www.mediadesk.uzh.ch/articles/2013/psychotherapie-via-internet-wirkt-gleich-gut-oder-besser-wie-im-sprechzimmer_en.html">says a study this month</A>, adding to the small mountain of evidence saying the same. A friend of mine uses online psychotherapy and says it&#8217;s easier and more productive because the therapist is less of a Terrifying Authority Figure. Also good for people who want a psychologist who will have severe difficulty calling the cops on them and having them committed. Also good for social phobics who <i>are currently required to leave the house and hang out at a busy medical office if they want to get treatment for their social phobia who the heck came up with this system?</i></p>
<p>&#8220;Adoption study of human obesity&#8221; sounds like something you would get from a Things Scott Is Interested In Mad Libs, along with &#8220;utilitarian behavioral genetics&#8221; or &#8220;double-blind placebo-controlled cuddling of cute girls&#8221;. But it turns out this is a real field that various people have looked into, and the conclusion of <A HREF="http://books.google.com/books?id=Z9eBvuQccfkC&#038;pg=PA50&#038;lpg=PA50&#038;dq=adoption+study+obesity&#038;source=bl&#038;ots=X6OFYd6VKO&#038;sig=7qSVFW304KYwRn38vzFFkcvLsIk&#038;hl=en&#038;sa=X&#038;ei=P8v2Ufn_O8fuyQH8kYGABg&#038;ved=0CDAQ6AEwATgK#v=onepage&#038;q=adoption%20study%20obesity&#038;f=false">most of the studies</A>, including a <A HREF="http://www.ncbi.nlm.nih.gov/m/pubmed/3941707/">very rigorous one in Denmark published in NEJM</A> and a <A HREF="http://ajcn.nutrition.org/content/87/2/398.long">huge UK one by Robert Plomin</A> agree that whether the parents who raise you are obese has zero impact on whether you will become obese, but whether your biological parents whom you may never meet are obese has massive impact on whether you will become obese. This doesn&#8217;t completely disprove the idea that the childhood environment affects obesity &#8211; it could still be that whether or not parents are good at teaching their children not to be obese just has zero correlation with whether the parents themselves are obese &#8211; but it sure casts a lot of doubt on environmental hypotheses and confirms that genetics plays a very big role.</p>
<p>On the other hand, <A HREF="http://www.apa.org/pubs/journals/releases/psp-101-3-579.pdf">here&#8217;s a study from 2011</A> which shows that people with lower Conscientiousness and higher Impulsivity are much more likely to be obese &#8211; &#8220;Participants who scored in the top 10% of impulsivity weighed, on average, 11 kg more than those in the bottom 10%&#8221;. LWers pointed out that this is not itself incompatible with genetics, since most personality traits are themselves somewhat heritable.</p>
<p>On the mutant third hand, if it&#8217;s all just impulsive people who have been poorly trained by their parents, <A HREF="http://www.thedailybeast.com/newsweek/2010/12/10/what-fat-animals-tell-us-about-human-obesity.html">why are wild animals getting fatter</A>?</p>
<p>JAMA Psychiatry: <A HREF="http://jornal.fmrp.usp.br/wp-content/uploads/2013/05/NOSchizophrenia-JAMA-Jaime-1.pdf">Rapid Improvement of Acute Schizophrenia Symptoms After Intravenous Sodium Nitroprusside</A>. Anything that improves schizophrenia symptoms is good news, but this is especially interesting for two reasons. First, the rapid and dramatic effect is easier to replicate and less corruptible than the usual &#8220;take this pill for a month and maybe you will feel better&#8221;, and is reminiscent of the very similar effect of ketamine on depression. Second, sodium nitroprusside is a drug used for high blood pressure without any previously known relevance to psychiatry, opening up a whole new direction in research. The small but interesting field of <A HREF="http://www.ncbi.nlm.nih.gov/pubmed/16005189">nitric oxide in schizophrenia</A> is about to get a lot more scrutiny.</p>
]]></content:encoded>
			<wfw:commentRss>http://slatestarcodex.com/2013/08/10/science-medicine-links-for-august/feed/</wfw:commentRss>
		<slash:comments>11</slash:comments>
		</item>
		<item>
		<title>Holocaust Good For You, Research Finds, But Frequent Taunting Causes Cancer In Rats</title>
		<link>http://slatestarcodex.com/2013/07/31/holocaust-good-for-you-research-finds-but-frequent-taunting-causes-cancer-in-rats/</link>
		<comments>http://slatestarcodex.com/2013/07/31/holocaust-good-for-you-research-finds-but-frequent-taunting-causes-cancer-in-rats/#comments</comments>
		<pubDate>Thu, 01 Aug 2013 06:56:26 +0000</pubDate>
		<dc:creator><![CDATA[Scott Alexander]]></dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[science]]></category>

		<guid isPermaLink="false">http://slatestarcodex.com/?p=892</guid>
		<description><![CDATA[A study published this month in PLoS One finds that victims of weight discrimination (&#8220;fat-shaming&#8221;, in case you only speak Tumblrese) are more likely to subsequently gain weight. It&#8217;s hard for me to like a study that so obviously got &#8230; <a href="http://slatestarcodex.com/2013/07/31/holocaust-good-for-you-research-finds-but-frequent-taunting-causes-cancer-in-rats/">Continue reading <span class="pjgm-metanav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<p>A study published this month in PLoS One finds that victims of weight discrimination (&#8220;fat-shaming&#8221;, in case you only speak Tumblrese) are <A HREF="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0070048">more likely to subsequently gain weight</A>. </p>
<p>It&#8217;s hard for me to like a study that so obviously got exactly the result its organizers wanted it to get. And obvious confounders are obvious &#8211; level of discrimination faced was based on self-report, and the sorts of people who hang around the sorts of people who fat-shame may differ systematically (in class? education?) from those who avoid that kind of abuse &#8211; but the study&#8217;s endpoint of <i>change</i> in weight over time rather than just weight itself goes some of the way toward addressing those concerns. And I&#8217;ve got to give them credit for studying an important issue and getting a highly significant result. So let&#8217;s let them have their soapbox:<br />
<blockquote>There are both behavioral and physiological mechanisms that may contribute to the relation between discrimination and obesity. Weight discrimination is associated with behaviors that increase risk of weight gain, including excessive food intake and physical inactivity. There is robust evidence that internalizing weight-based stereotypes, teasing, and stigmatizing experiences are associated with more frequent binge eating. Overeating is a common emotion-regulation strategy, and those who feel the stress of stigmatization report that they cope with it by eating more. Individuals who endure stigmatizing experiences also perceive themselves as less competent to engage in physical activities and are thus less willing to exercise and tend to avoid it. Finally, heightened attention to body weight is associated with increased negative emotions and decreased cognitive control. Increased motivation to regulate negative emotions coupled with decreased ability to regulate behavior may further contribute to unhealthy eating and behavioral patterns among those who are discriminated against.</p></blockquote>
<p>New study! This one published &#8211; oh, look, isn&#8217;t that interesting &#8211; this month in PLoS One, finds that survivors of the Holocaust <A HREF="http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0069179">have greater life expectancy</A> than control Jews who did not experience the Holocaust.</p>
<p>Here the authors definitely got a result they were <i>not</i> looking for and did <i>not</i> want. And here, too, we have all sorts of confounders: they tried hard to construct a matched control group of Jews who emigrated from Poland to Israel just before the Holocaust, but we have no idea what sort of differences there might have been in those populations (just to make up one story, maybe poor people who had less to lose were more likely to emigrate). And here too, there is no shortage of soapboxes. From <A HREF="http://www.universityherald.com/articles/4078/20130731/male-holocaust-survivors-longer-life-expectancy.htm">here</A>:<br />
<blockquote>One possible explanation for these findings might be the &#8220;Posttraumatic Growth&#8221; phenomenon, according to which the traumatic, life-threatening experiences Holocaust survivors had to face, which engendered high levels of psychological distress, could have also served as potential stimuli for developing personal and inter-personal skills, gaining new insights and a deeper meaning to life. All of these could have eventually contributed to the survivors&#8217; longevity. &#8220;The results of this research give us hope and teach us quite a bit about the resilience of the human spirit when faced with brutal and traumatic events&#8221;, concluded Prof. Sagi-Schwartz.</p></blockquote>
<p>So, let me sum up what we&#8217;ve learned here today.</p>
<p>Having someone call you fat is a profoundly disturbing form of stigmatization that breaks your normal cognitive coping mechanisms and subjects you to levels of stress that the human body and psyche were never designed to withstand.</p>
<p>But being rounded up like cattle, having your entire family killed in front of you, and then being starved nearly to death in a concentration camp for several years is a useful opportunity to grow as a person, and will leave you stronger and better-adjusted.</p>
<p>I shouldn&#8217;t be too sarcastic. Stranger things have ended up being true. Maybe constant low-grade minor stress has a deleterious effect but a single extremely stressful event can be salutary. Maybe stress is good for you only after you&#8217;ve achieved a safe distance from the stress and can reflect on it from a position where you&#8217;re absolutely sure it will never happen again. Maybe stress makes you obese in the short term, but also makes you live longer in the long-term. Maybe the cultural differences between elderly Polish Jews and middle-aged Americans mediate the effect stress has on their bodies. </p>
<p>Or maybe these effects are mediated by unexpected processes. Maybe the Holocaust survivors live longer not because of personal growth, but because they got a sort of involuntary <A HREF="http://en.wikipedia.org/wiki/Caloric_restriction">caloric restriction</A> that permanently altered their metabolism. Maybe (as the researchers point out in their paper) only people who were exceptionally healthy survived the Holocaust, and these people continued being exceptionally healthy into their old age. Maybe obese people who aren&#8217;t shamed stick to a careful diet to avoid shaming, but once the shaming starts they figure it can&#8217;t get any worse and go wild.</p>
<p>Or maybe one or both of these studies is totally and fundamentally flawed and we&#8217;re wasting our time here. I give 50% probability that the fat result is legitimate, and 90% probability the Holocaust result is due to something other than personal growth, probably survivor effect or caloric restriction &#8211; but I bet others will disagree.</p>
<p>Yet I think what struck me most about this combination was how &#8220;stress makes you miserable and unhealthy&#8221; sounds reasonable, but &#8220;stress is a salutary process that allows you to grow&#8221; also sounds reasonable. No matter what happens to stressed people, psychology can go &#8220;Oh yeah, according to our theories, stress causes that&#8221; and I will nod my head and agree.</p>
<p>Or maybe another way to put it is that I&#8217;m impressed with the ease with which we switch narratives. <i>All the time</i> I hear &#8220;Well, a little bit of adversity will be good for him/her&#8221;. Or else &#8220;What you&#8217;re doing is going to destroy his/her self-esteem and scar him/her for life.&#8221; Most people selectively use either one, depending on whether they want to excuse something or condemn something at that particular moment, and they have the <A HREF="http://slatestarcodex.com/2013/06/22/social-psychology-is-a-flamethrower/">science available</A> to support either. Not only do we operate on <A HREF="http://lesswrong.com/lw/k5/cached_thoughts/">cached thoughts</A>, but we have a store of contradictory cached thoughts sufficient to support any proposition <i>or</i> its opposite.</p>
<p>This is why the Ethics Committee needs to hurry up and approve my replication experiment to <A HREF="http://www.smbc-comics.com/?id=1202">commit genocide against a randomly selected sample of the population</A>.</p>
]]></content:encoded>
			<wfw:commentRss>http://slatestarcodex.com/2013/07/31/holocaust-good-for-you-research-finds-but-frequent-taunting-causes-cancer-in-rats/feed/</wfw:commentRss>
		<slash:comments>40</slash:comments>
		</item>
	</channel>
</rss>
