Imagine a black box which, when you pressed a button, would generate a scientific hypothesis. 50% of its hypotheses are false; 50% are true hypotheses as game-changing and elegant as relativity. Even despite the error rate, it’s easy to see this box would quickly surpass space capsules, da Vinci paintings, and printer ink cartridges to become the most valuable object in the world. Scientific progress on demand, and all you have to do is test some stuff to see if it’s true? I don’t want to devalue experimentalists. They do great work. But it’s appropriate that Einstein is more famous than Eddington. If you took away Eddington, someone else would have tested relativity; the bottleneck is in Einsteins. Einstein-in-a-box at the cost of requiring two Eddingtons per insight is a heck of a deal.
What if the box had only a 10% success rate? A 1% success rate? My guess is: still most valuable object in the world. Even an 0.1% success rate seems pretty good, considering (what if we ask the box for cancer cures, then test them all on lab rats and volunteers?) You have to go pretty low before the box stops being great.
I thought about this after reading this list of geniuses with terrible ideas. Linus Pauling thought Vitamin C cured everything. Isaac Newton spent half his time working on weird Bible codes. Nikola Tesla pursued mad energy beams that couldn’t work. Lynn Margulis revolutionized cell biology by discovering mitochondrial endosymbiosis, but was also a 9-11 truther and doubted HIV caused AIDS. Et cetera. Obviously this should happen. Genius often involves coming up with an outrageous idea contrary to conventional wisdom and pursuing it obsessively despite naysayers. But nobody can have a 100% success rate. People who do this successfully sometimes should also fail at it sometimes, just because they’re the kind of person who attempts it at all. Not everyone fails. Einstein seems to have batted a perfect 1000 (unless you count his support for socialism). But failure shouldn’t surprise us.
Yet aren’t some of these examples unforgivably bad? Like, seriously Isaac – Bible codes? Well, granted, Newton’s chemical experiments may have exposed him to a little more mercury than can be entirely healthy. But remember: gravity was considered creepy occult pseudoscience by its early enemies. It subjected the earth and the heavens to the same law, which shocked 17th century sensibilities the same way trying to link consciousness and matter would today. It postulated that objects could act on each other through invisible forces at a distance, which was equally outside the contemporaneous Overton Window. Newton’s exceptional genius, his exceptional ability to think outside all relevant boxes, and his exceptionally egregious mistakes are all the same phenomenon (plus or minus a little mercury).
Or think of it a different way. Newton stared at problems that had vexed generations before him, and noticed a subtle pattern everyone else had missed. He must have amazing hypersensitive pattern-matching going on. But people with such hypersensitivity should be most likely to see patterns where they don’t exist. Hence, Bible codes.
These geniuses are like our black boxes: generators of brilliant ideas, plus a certain failure rate. The failures can be easily discarded: physicists were able to take up Newton’s gravity without wasting time on his Bible codes. So we’re right to treat geniuses as valuable in the same way we would treat those boxes as valuable.
This goes not just for geniuses, but for anybody in the idea industry. Coming up with a genuinely original idea is a rare skill, much harder than judging ideas is. Somebody who comes up with one good original idea (plus ninety-nine really stupid cringeworthy takes) is a better use of your reading time than somebody who reliably never gets anything too wrong, but never says anything you find new or surprising. Alyssa Vance calls this positive selection – a single good call rules you in – as opposed to negative selection, where a single bad call rules you out. You should practice positive selection for geniuses and other intellectuals.
I think about this every time I hear someone say something like “I lost all respect for Steven Pinker after he said all that stupid stuff about AI”. Your problem was thinking of “respect” as a relevant predicate to apply to Steven Pinker in the first place. Is he your father? Your youth pastor? No? Then why are you worrying about whether or not to “respect” him? Steven Pinker is a black box who occasionally spits out ideas, opinions, and arguments for you to evaluate. If some of them are arguments you wouldn’t have come up with on your own, then he’s doing you a service. If 50% of them are false, then the best-case scenario is that they’re moronically, obviously false, so that you can reject them quickly and get on with your life.
I don’t want to take this too far. If someone has 99 stupid ideas and then 1 seemingly good one, obviously this should increase your probability that the seemingly good one is actually flawed in a way you haven’t noticed. If someone has 99 stupid ideas, obviously this should make you less willing to waste time reading their other ideas to see if they are really good. If you want to learn the basics of a field you know nothing about, obviously read a textbook. If you don’t trust your ability to figure out when people are wrong, obviously read someone with a track record of always representing the conventional wisdom correctly. And if you’re a social engineer trying to recommend what other people who are less intelligent than you should read, obviously steer them away from anyone who’s wrong too often. I just worry too many people wear their social engineer hat so often that they forget how to take it off, forget that “intellectual exploration” is a different job than “promote the right opinions about things” and requires different strategies.
But consider the debate over “outrage culture”. Most of this focuses on moral outrage. Some smart person says something we consider evil, and so we stop listening to her or giving her a platform. I don’t want to argue this one right now – at the very least it disincentivizes evil-seeming statements.
But I think there’s a similar phenomenon that gets less attention and is even less defensible – a sort of intellectual outrage culture. “How can you possibly read that guy when he’s said [stupid thing]?” I don’t want to get into defending every weird belief or conspiracy theory that’s ever been [stupid thing]. I just want to say it probably wasn’t as stupid as Bible codes. And yet, Newton.
Some of the people who have most inspired me have been inexcusably wrong on basic issues. But you only need one world-changing revelation to be worth reading.
http://www.paulgraham.com/disc.html ?
Darn. Well, only scooped on about 20% of this, could be worse. I’ve added the link into the post.
Yeah, I agree you added quite a bit, and I’m glad to see further elaboration! Graham’s post is tragically short. But the connection is there.
Paul Graham is relevant to your post in another way. He posted over 100 essays dating back to 1993 that were almost always interesting and frequently thought provoking. In Jan 2016 he wrote one about Income Inequality that argued we should be more concerned with addressing poverty than worrying about inequality for which he was roundly criticized. He wrote 6 more and hasn’t published one since Nov 2017. I don’t know if he qualifies as a black box, but his essays were well thought out, sincere and respectful. It would be a shame if he was a casualty of the outrage culture.
He still goes around pissing people off on Twitter, so I doubt that was the issue.
🎶 ’cause one out of three ain’t bad… 🎵
All I can think of when I hear that is https://www.youtube.com/watch?v=VakU20APPdw (context: Congress has just been massacred)
I’m not sure I agree. As far as I can tell, all your examples also either 1) tested their prediction or 2) explained how to test their prediction.
If a magical machine spit out hypotheses that were correct with 0.1% chance, but gave no way of testing them, then I don’t think it’s that valuable. Experimentalists have plenty of things to test already.
I think you missed the “as game-changing and elegant as relativity” bit. My impression is not that we have a surfeit of novel hypotheses as potentially important as relativity.
“Einstein seems to have batted a perfect 1000 (unless you count his support for socialism).”
I disagree. Einstein believed in a static universe that would’ve been dynamically unstable (though he did quickly reverse himself once he saw Hubble’s data), gave terrible arguments against the existence of black holes, repeatedly retracted and then reinstated his prediction of gravitational waves, doubted the fundamental nature of quantum indeterminacy and entanglement (though in his defense, his opponents like Bohr weren’t giving strong arguments—those would only come later), and mostly wasted his last decades pursuing a unified field theory that ignored not only QM but even the nuclear forces. He also pursued false leads (even “stupid” ones) along the way to general relativity. Still, nothing as silly as Newton’s Bible codes, let alone anything that would challenge his well-earned fame as one of the two or three greatest scientists in history. Just like your (Scott’s) being wrong about Einstein just now doesn’t change my assessment of you as the world’s best blogger, a magic box that spits out >>50% original and important truths. 🙂
Einstein also apparently rejected continental drift, which seems more traditionally crazy to me as a nonphysicist.
Copy of the foreword he wrote for Charles Hapgood’s crackpot theory.
https://en.wikipedia.org/wiki/Charles_Hapgood
https://archive.org/stream/eathsshiftingcru033562mbp/eathsshiftingcru033562mbp_djvu.txt
To be fair, it’s a pretty tepid endorsement.
Continental drift hadn’t been out of crackpot territory for long when Einstein must have come into contact with it.
It seems weird right now, but continental drift is a relatively recent entrant into the scientific canon.
See the history section in https://en.m.wikipedia.org/wiki/Continental_drift
Also, Einstein’s entire work, as is very well known among historians of science and very little known among the lay public, was built on the equations of Scotland’s James Clerk Maxwell, the basis of electrodynamics, which show that electricity and magnetism are two aspects of a single force. We should add the work of Jules Henri Poincaré to that mix, as he worked on predecessors to many of Einstein’s theories. Following an 1887 experiment performed by Albert Michelson and Edward Morley that failed to obtain the results anyone at the time expected, Poincaré remarked on the apparent “conspiracy of dynamical effects” which caused apparent time and distance to alter according to the speed of an object. Under conventional Newtonian physics, light travelling in the direction of the Earth’s motion around the Sun should have appeared to have a different speed from that of light travelling at right angles. Still, it remained resolutely constant, as Poincaré observed. Distances compress and time slows enough to make the velocity of light stay constant, he pointed out. Einstein being treated as a unique fountain of wisdom and truth is just wrong: https://aussiesta.wordpress.com/2019/02/27/albert-einstein-dark-saint-of-progress/
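For reference, the “conspiracy” Poincaré described is quantified by the standard Lorentz factor: moving rods and clocks are rescaled by exactly the amount needed to keep the measured speed of light at c. (These are the standard textbook formulas, nothing specific to this discussion.)

```latex
\gamma = \frac{1}{\sqrt{1 - v^{2}/c^{2}}}, \qquad
L = \frac{L_{0}}{\gamma} \quad \text{(length contraction)}, \qquad
\Delta t = \gamma \, \Delta t_{0} \quad \text{(time dilation)}
```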
The conventional wisdom that I’ve heard (and which seems right to me, for what it’s worth) is that if Einstein had not existed, someone would have figured out special relativity within a few years anyway, with people like Poincare, Lorentz, and Fitzgerald being nearly there, and some of the experiments, such as Michelson-Morley, having already been done. But if Einstein had not existed, no one would have come up with general relativity for decades. The experimental evidence (unless you count the precession of Mercury) wouldn’t have come up for years if no one was looking for it. While all science is necessarily indebted to previous science, Einstein did make a pretty big contribution (GR), and came up with a lot more of the theories whose “time had come” (SR, the photoelectric effect, Stokes-Einstein) than one would expect from any given scientist.
I don’t think that’s right. My impression is that Hilbert would have figured out a complete version of GR if Einstein hadn’t.
Hilbert would certainly have come up with the field equations if Einstein hadn’t (possibly he found them independently, even). But without Einstein, no one would have even thought to look for field equations, or even gravity as a type of curvature, for a long time. The mathematical tools for GR, while more complicated than those for special relativity, were well established by the 20th century. The primary insight, about the nature of inertia, was purely physical and purely Einstein’s.
Aren’t you being unfair? His dogmatizing about QM still led to EPR.
gwern: I was just trying to show that Einstein wasn’t infallible, not that he wasn’t Einstein! 🙂 He seems to have wanted what we’d now call a local hidden-variable theory. His opponents at the time were totally right that that couldn’t be had (assuming you want to reproduce QM’s empirical predictions), and totally wrong that they’d proven its impossibility. Both sides were missing Bell’s theorem, which had to wait till 1964, after Einstein had died.
Yeah, I had opened the comments basically just to say that.
(Though most of those things were later in his life — had he died at the age of 27, he would have been way closer to a perfect 1000.)
Right, but had he died at 27, he wouldn’t have done GR! So maybe give him until 37? 🙂
What, that Einstein had a steadily increasing ratio of crackpottery-to-genius as he got older?
While I won’t say this is true of every legendary genius with revolutionary good ideas, my impression is that it’s true of a lot of such geniuses.
We often hear about people who made brilliant discoveries in their twenties and thirties but delved deep into pseudoscientific nonsense later in life. The reverse is relatively rare: not many people spend a lifetime indulging in pure pseudoscience and then have revolutionary visionary ideas that actually pay off in their fifties and sixties.
This could be a survivorship effect, I suppose. Twenty or thirty years of a reputation as a crackpot may cause people to self-select out of the relevant field from being laughed at, or may destroy their ability to get a brilliant idea listened to when they have one at the age of fifty.
On the other hand, it could also be that the processes of diving into pseudoscience and having genius ideas aren’t quite identical: that a genius falls into pseudoscience for reasons rather different from those of the common run of pseudoscience advocates.
Maybe if you only have that bold iconoclastic streak, but not the deep underlying comprehension of what categories of things can and cannot be true, you become a conspiracy theorist, but if you do have that comprehension you become a scientific visionary?
…
Another thing we should do is differentiate between cases where scientists have ideas that are merely incorrect, but within the Overton window of accepted scientific theory at the time, and ideas that are outright bonkers.
Being a continental drift skeptic in the 1920s was rational in context- there was nothing like a proposed mechanism. Being one in the 1950s was normative. Being a continental drift skeptic in the 1980s was a sign of crackpottery. Believing that megadoses of vitamins could cure diseases wasn’t a stupid idea by the standards of, say, the 1920s or 1950s, but would have been borderline by the 1980s and is sheer blithering foolishness now. Believing in hidden variable theories was at least understandable in the 1950s, but would be foolish now in the light cast by Bell’s Theorem.
There’s a difference between a scientist being wrong about a theory that has not been fully proven, or believing a theory that has not been fully disproven… Versus a scientist indulging in things that almost any competent mind in the field ought to be ruling out, given what is known at the time.
The former is entirely unremarkable and very common at all levels of scientific ability. The latter shows that interesting bimodal distribution where geniuses turn into crackpots.
This phenomenon also makes sense in terms of regression to the mean. Observed success is likely to be a combination of luck and ability. When you come to my attention due to your recent extraordinary success, there’s a very good chance that you won’t manage to be that successful again, because quite probably, your success depended on a lot of luck as well as ability. OTOH, someone who’s produced a lot of extraordinary success is very likely extremely capable. (The guy who wins Wimbledon once probably won’t win it again–he had to be very good *and* have a couple really good weeks. But the guy who’s won it five times is probably a lot more likely to win it again–he must be really, really good.)
ETA: Hamming wrote a great talk about doing research, in which he speculated that the pattern of scientists having one big hit and not doing much afterward was due to people who’d had one big hit thinking they shouldn’t work on anything but the very biggest and hardest problems, as befits their status. Since those problems are very hard to get any traction on, it’s easy enough to see how they’d spend the remaining years of their career spinning their wheels.
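A minimal simulation of the “observed success = ability + luck” point above (toy numbers only; treating both components as standard normal is purely an assumption for illustration):

```python
import random

def simulate(n_people=100_000, seed=0):
    rng = random.Random(seed)
    # Each person's observed result is ability plus independent luck.
    people = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n_people)]
    first = [(ability + luck, ability) for ability, luck in people]
    # Select everyone whose first outing was extraordinary (top ~1%).
    cutoff = sorted(score for score, _ in first)[int(0.99 * n_people)]
    stars = [ability for score, ability in first if score >= cutoff]
    # Their second outing: same ability, fresh luck.
    second = [ability + rng.gauss(0, 1) for ability in stars]
    print(f"first-outing cutoff:       {cutoff:.2f}")
    print(f"stars' mean second outing: {sum(second) / len(second):.2f}")

simulate()
```

The one-hit stars stay well above average on the second outing (the ability is real) but fall far short of the first result they were selected on, which is the asymmetry between the one-time and the five-time Wimbledon winner.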
“What, that Einstein had a steadily increasing ratio of crackpottery-to-genius as he got older?
While I won’t say this is true of every legendary genius with revolutionary good ideas, my impression is that it’s true of a lot of such geniuses.”
Ray Kurzweil. QED.
Scott, I think it’s sort of a myth that Einstein was resistant or confused about quantum indeterminacy and entanglement. I think Tim Maudlin does a good job of entertainingly discussing how philosophers of science view this in this book review.
orin: I know Tim Maudlin well. Tim has a specific ax to grind here: namely, “realism” (as exemplified by nonlocal hidden-variable theories like deBroglie-Bohm). Like other Bohmians, Tim celebrates Einstein for having held down the fort of realism until Bohm could come along. However, one complication for this story is that Einstein knew about the deBroglie-Bohm theory and rejected it (calling it “too cheap” in a letter to Max Born in 1952). Einstein certainly couldn’t have liked the nonlocality that’s a core feature of dBB.
My personal view is that *everyone* was at least somewhat confused about entanglement until the 1960s. And my argument is that, if they weren’t, then they would’ve done the few simple lines of arithmetic to derive Bell’s theorem.
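For reference, here is one standard form of those few lines (the CHSH version rather than Bell’s original 1964 presentation); a sketch, not a rigorous treatment:

```latex
% Local hidden variables: outcomes A(a,\lambda), A(a',\lambda), B(b,\lambda), B(b',\lambda) \in \{-1,+1\},
% all fixed by a shared variable \lambda with distribution \rho(\lambda).
S(\lambda) = A(a,\lambda)\,[\,B(b,\lambda) - B(b',\lambda)\,] + A(a',\lambda)\,[\,B(b,\lambda) + B(b',\lambda)\,]
% One bracket is 0 and the other is \pm 2, so S(\lambda) = \pm 2 for every \lambda, hence
\Bigl|\, \textstyle\int \rho(\lambda)\, S(\lambda)\, d\lambda \,\Bigr| \le 2 .
% Quantum mechanics, measuring a singlet pair at suitably chosen angles, predicts 2\sqrt{2} > 2.
```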
I’m pretty confused regarding what Von Neumann thought about QM, but at least some of the time he seems to have believed in the relative state interpretation.
I still don’t think that’s quite fair to Einstein. There are plenty of people in the present day who are not confused or resistant to quantum mechanics, and yet fully embrace realism while rejecting dBB for similar reasons (among others) as Einstein. I don’t think it’s fair to say that being resistant to anti-realism is synonymous with being confused about QM.
No, he wasn’t confused because he wanted realism—only because he wanted local realism.
Scott, again this is too strong. Everettian interpretations are local and real, and don’t reflect a confusion or resistance about QM. (Note to experts: MWI doesn’t violate Bell’s theorem because it is not counterfactually definite).
Agreed. Also, I just want to say that I love the fact that Scott Alexander and Scott Aaronson are fans of each other. 🙂
Eh. Granted that I’m a crackpot, but I’m pretty sure he was broadly right about his approach to a unified theory, and his rejection of QM, or at least the version of quantum mechanics that existed when he was around. If Rydberg had finished his work first, we probably never would have seen quantized energy, and a surprising bit of uncertainty vanishes without it.
I suspect this phenomenon pairs with the desire to find some reason to stop listening to some thinker when they say something we don’t want to hear. We’re *always* looking for an excuse to throw out upsetting ideas/claims without having to actually give them a hearing.
Right.
E.g., Noam Chomsky was wrong about Cambodia so I don’t have to pay any attention to him on anything else.
In contrast, I agree with Scott. For example, Chomsky was sounding the alarm way back in the mid-1970s when few were paying attention that letting Indonesia take over East Timor from the Portuguese was bad. That strikes me as impressive and therefore means I should pay more attention to Chomsky on other topics.
I’m a glass is half full kind of guy.
I think the argument on Chomsky wrt Cambodia is not that he was wrong, although he was, but that he was dishonest, that the chapter on Cambodia he coauthored deliberately twisted the evidence in order to suggest that the Khmer Rouge were the good guys.
If that’s true, the implication is not that he will be wrong about other things but that what he writes about other things cannot be trusted. Reaching the wrong conclusion once isn’t evidence that your other conclusions are wrong. Deliberately lying once about something important is evidence, not that other things you say are wrong, but that the fact you say something is at most weak evidence that it is true.
I’d say “was (and still is) wilfully blind because of his priors” rather than “lied”. The difference is a subtle one, but the correct conclusion to draw matters.
I’ve read a lot of Chomsky, and I think his linguistics is pretty much entirely wrong, I’m not really qualified to speculate about his economics, and his geopolitical analyses (insofar as they are separate from economics) are largely correct. He’s been wrong about very little that I’m aware of other than Cambodia when it comes to critiques of US foreign policy, and in the case of Cambodia it was due to his (largely justified) inherent distrust of the US propaganda machine.
What struck me about the Cambodia chapter was that he cited, as if it was a reliable source of information, a book that was based almost entirely on what the KR told the authors, the authors of which described Pol Pot as a saintly figure. It was easy by reading it to see that the book was straight KR propaganda but nothing in the chapter signaled that.
Chomsky is not stupid, so he had to know the nature of the source he was using.
Is this Manufacturing Consent that is being discussed?
Pretty much anyone who makes a career of arguing controversial things has made this kind of error. But I haven’t seen that in most of the rest of his work (except his linguistics, where he’s wilfully blind to an astounding degree). I haven’t fully vetted every one of his sources, to be fair, and he’s definitely preaching to the choir in my case, so I’m sure my biases are impacting that judgment. But almost everything I’ve checked other than that Cambodia chapter seems to be accurate.
@Smacky: not in specific, no. More the various publications that have been made, usually of speeches or articles he’s written, about US foreign policy.
@Smacky: oh, did you mean the Cambodia thing? That’s not from Manufacturing Consent, no, it’s a chapter from some book he co-wrote (edited? introduced?) in the 70’s I think, and not about media per se. It’s pretty much the go-to Chomsky controversy, easily google-able.
@DavidFriedman
> Reaching the wrong conclusion once isn’t evidence that your other conclusions are wrong. Deliberately lying once about something important is evidence, not that other things you say are wrong, but that the fact you say something is at most weak evidence that it is true.
I personally agree with this, but I’m not sure that it logically follows. Why do we presume that the flaw in ethics is more likely to recur than all other flaws? It makes sense that the ethical flaw is more likely to be recurrent than a random error, but I’m not sure it makes sense to assume that all other errors are likely to be random.
Consider two researchers, one with impeccable ethics and a typically flawed understanding of statistics, and another who will in extreme circumstances publish lies to benefit a favored cause but who has truly expert statistical acumen. We find that the ethical one makes fairly constant errors of interpretation regardless of subject matter, while the shady researcher is always reliable on certain topics and always untrustworthy on others.
Logically, it seems we might prefer the research of the unethical researcher on the unbiased topics. But instead, it seems like we usually treat all their results as tainted, and instead put greater faith in the genuine but incompetent researcher. If our goal is figuring out which research to trust (rather than personally punishing transgressors) is this really the right approach?
I don’t think that you can neatly separate the taint from the good bits.
If they are committed to being proven right, rather than to being right, it seems unlikely that they have a commitment to doing science properly in the first place. So they will presumably be very prone to taking shortcuts. So this researcher who is very good but sometimes deceptive may be more of a unicorn or thought experiment than someone you can commonly find.
People with a strong agenda also typically have a tendency to shoehorn it into other topics. So it may seem like their very strong agenda against smoking shouldn’t have an impact on their research into the efficacy of preschooling, except that they may then manage to ‘find’ that preschooling doesn’t work well for some kids because their parents smoke (or whatever).
The way their deception works may also be non-obvious if you don’t share their world view. You may not think that a certain finding is or should be in conflict with their agenda, but they may disagree.
Finally, there is the basic issue that people tend to desire to punish those with bad intent much more than those who are incompetent. Shunning is such a punishment.
Historically speaking, innovation and science correlate highly with superstition. My guess is that a lot of this stuff is really just the social presentation of basic personality traits. I suspect that a huge portion of this behavior could be “explained” as a correlation with openness.
If this is the tradeoff you’re facing, you’re probably reading too many people – neither of them sounds worth paying attention to. Why bother hacking through the jungle of bad takes for a rare good one, or boring yourself with stuff you already know well? There are so many thinkers who are reasonably good on both metrics that you can’t possibly keep up with their output.
I agree with the post title, because everyone starts out in the “ruled out” bin and it takes considerable selection (both positive and negative) to rule them in. Maybe you should lean more on positive selection if you’re specifically looking for paradigm-shifting ideas in cleanly empirical fields. But if you’re trying to evaluate whether someone is worth listening to, in general, negative selection is underrated if anything.
This seems right to me. My time allocated to reading-stuff-in-an-attempt-to-understand-the-world is much too small to get through everything that’s out there.
That being the case, I select heavily towards (1) people with good track record of novel insight or at least framing that leads me to new and better understanding (Hi Scott) and (2) people who seem good at finding and summarizing interesting thinking that’s going on in other people’s heads.
The thing that gets people ruled out fast, given that I don’t realistically have time to check their work, is any sign of dishonesty. I don’t have time to waste reading people who are gonna play games with data or hide the ball.
This might seem self-evident, but I also think you’re describing an attitude about genius, and about people, that is probably one cause of the “black box” pattern.
You’re saying it’s a problem/fallacy that we mythologize people and expect them to either be very smart, and therefore right about everything, or else be wrong about everything. This is because we’re ignoring potentially important ideas from the “black boxes” once they spit out a stupid idea.
However, what I’m proposing is that the reason we have this phenomenon is partly because of the attitude you’re describing. Sure, Newton might have had “amazing hypersensitive pattern-matching going on” but he also had an ego shaped by his culture; possibly he was aware of the attitude people had about geniuses and intellectuals and had simply decided that he was one, hence the bible codes. I don’t know if this accurately reflects 17th century opinion though.
I guess my question is: Does our attitude about geniuses enable them, or hold them back? My guess is that it enables them, because cultural evolution should select for this trait (producing geniuses). So does this mean we should continue to mythologize the successful ones, and discount those that produce one stupid idea? After all, it doesn’t seem to stop them. (This is somewhat unrelated to your point, which was more about evaluating geniuses for our own benefit)
But it does seem to produce some amusing overconfidence. Ignoring the elephant in the room, I’m going to name the two people that came to my mind immediately: Eric Raymond and Elon Musk.
Agreed. When we treat “genius” as an inherent and mysterious trait, we abdicate judgment and critical analysis to our faith in the natural abilities of the “genius.” Scientific thought becomes mystical instead of methodical. Policy makers begin to focus on cultivating and nurturing geniuses rather than devising methods for producing novel thought among the general populace.
I actually think you missed my point: “cultivating and nurturing geniuses” is our method for producing novel thought among the general populace. I wasn’t saying that it’s a bad thing.
I’ve argued this for years. Cultural and scientific advancement is basically just evolution.
A useful analogy, but I think it goes against Scott’s point.
Think of selectively breeding animals: we’ve made huge strides just by selecting on mutations already existing in the population. It’s true that crossing breeds plays some role, but it’s mostly selection. Not novel mutations.
In general, increasing the mutation rate tends to mess up organisms, whereas changing selection can make a big positive difference. Humans would definitely be better off with a lower mutation rate (less genetic load).
Admittedly it’s not clear that this applies to society. Though when I look around, I see a lot of crazy (or at least inconsistent) ideas running around. I’m far from convinced that we need to prioritize idea generation rather than improved selection.
This is not true, however, with plants, where mutation breeding is a common thing.
Interesting! That makes sense given the numbers involved with plants.
Of course, the desirability of mutation farming depends on the probability that a given mutation will be beneficial, and that this will be detected and captured, compared to the cost of raising the organism (including time costs).
As we get better at genetic engineering we will get better at mutation farming — instead of introducing a random mutation into a cow’s genome, we introduce a mutation that has a 10% chance of increasing milk production. As we understand more and more, our precision gets better.
Probably will need pretty high precision to ethically introduce novel mutations in humans. At least under our current view on ethics — easy to imagine alternate worlds in which large-scale mutation-farming of fetuses is a thing.
For all of the outcry about the recent use of CRISPR to modify a couple of fetuses in China, I seriously doubt that the powers that be over there were anywhere near as outraged as claimed.
Likely true. But there’s a world of difference between (1) fixing an “error” in a fetus’s genome (i.e. replacing a rare deleterious mutation with the common form), (2) replacing a common variant with another common variant because we think it has a beneficial effect (e.g. giving someone common +IQ genes), and (3) introducing novel mutations to see if they have good effects.
My sense is that most people are okay with (1) in theory, most people are squeamish about (2), and (3) is right out.
I’m guessing (1) will be the gateway. We won’t do (3) until the science has advanced significantly (and by then there may well be much better options).
A sort of in-between case is introducing novel mutations in humans that have (roughly) known effects in other mammals. We could possibly predict their effects fairly well.
This is where NNT’s use of the term “skin in the game” makes sense. He’s not just talking about incentives here, he’s talking about the ability to fail so thoroughly that you get taken out of the market entirely. That imposes a kind of systemwide discipline–people/strategies who have failed that thoroughly aren’t around anymore.
This is debatable. If you are selecting for existing mutations you are (probably) selecting for lines that have more mutations, and so lines that will be introducing new mutations.
Your problem was thinking of “respect” as a relevant predicate to apply to Steven Pinker in the first place. Is he your father? Your youth pastor? No? Then why are you worrying about whether or not to “respect” him?
This is a good insight, I think. Whether or not we respect someone as a person, or even whether we respect their ideas as a whole, isn’t relevant to how right or not right any given idea is. It’s a rough heuristic people use because evaluating every single claim on its own merits is exhausting and time-consuming (and a lot of people just plain aren’t smart enough to do that–I’m including myself in that judgment), and we have to narrow down the field of “stuff we pay attention to” in some way, so we use the filtering mechanism of “does this person seem generally smart and have a good track record of saying things that other smart-seeming people agree are right?”
Maybe that’s inevitable, but it’s a pretty bad way to judge ideas, one that is inevitably going to end up being more about status than about correctness.
I think it makes sense to hold people to a different standard when they’re speaking within vs. outside their area of expertise. With Pinker specifically, I tossed Better Angels aside a third of the way through for the sheer volume of laughably silly arguments–but Pinker is a neuroscientist, not an historian. I’d expect him to be more on-target when talking about brain structure. Likewise Richard Dawkins is, I hear, a very good biologist, and Pauling’s daft ideas about medicine don’t detract from his expertise at straightforward chemistry.
Derp! He’s a psychologist, not a neuroscientist. I got myself mixed up because he wrote The Blank Slate. Sorry. Broader point stands.
I haven’t read Better Angels. Is your criticism that he has his historical facts wrong or that you find his explanations of the facts unpersuasive?
I asked the same question of red sheep elsewhere.
I think you would love Better Angels and his more recent book Enlightenment Now. The statistics on the decrease in violence and improvements in human flourishing are thoroughly convincing for the most part (he has thousands of facts and charts so is bound to slip up a few times, but overall is extremely convincing to anyone with an open mind on the possibility of human progress over the past 200 years). Sheep may of course disagree.
BTW, David, I did not see any comments from you on yesterday’s thread on wage growth (I was looking for them because yours are usually among the best on economic topics)
I thought about commenting on that, but it isn’t the sort of economics that I have expertise on, and it’s clearly a complicated tangle of data and interpretation.
The one thought that did occur to me, along theoretical rather than empirical grounds, was that an increase in wage inequality might be a result of an increasingly meritocratic society.
Suppose you start with a society where the children of the elite go to Harvard and get high status, high pay jobs, the children of the poor don’t make it through high school and get low status, low pay jobs.
The society changes, in ways that make it more likely that a smart child of the poor will get a scholarship to Harvard and end up with a high status job, and that a stupid child of the elite will end up living on his parents’ money and doing volunteer work because no elite law firm will hire him. You now have more very smart doctors and lawyers, fewer very smart fast food cooks and construction workers. So the difference between the top and the bottom of the salary range increases.
Add to that the effect of increased assortative mating.
My guess is that these processes, although plausible, are too slow and continuous to explain the observed pattern of recent decades, however. Rereading my parents’ autobiography, there are lots of people they knew who started out poor and ended up as prominent academics and the like. And those would be people born near the beginning of the 20th century. Indeed, I would be surprised if the process slowed in recent decades.
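A toy sketch of the sorting story above. All numbers are made up for illustration, and the wage function (a fixed elite-job premium plus a return to individual ability in any job) is an assumption of the toy model, not a claim about real labor markets:

```python
import random
import statistics

def wage_gap(meritocratic, n=100_000, seed=0):
    rng = random.Random(seed)
    ability = [rng.gauss(100, 15) for _ in range(n)]
    if meritocratic:
        # Elite jobs go to the ablest fifth of the population.
        order = sorted(range(n), key=lambda i: ability[i], reverse=True)
    else:
        # Elite jobs go to a hereditary class uncorrelated with ability.
        order = list(range(n))
        rng.shuffle(order)
    elite, rest = order[: n // 5], order[n // 5 :]
    # Toy wage: an elite-job premium plus a return to individual ability.
    pay = lambda i, premium: premium + 400 * ability[i]
    elite_mean = statistics.mean(pay(i, 30_000) for i in elite)
    rest_mean = statistics.mean(pay(i, 0) for i in rest)
    return elite_mean - rest_mean

print("class-based assignment :", round(wage_gap(meritocratic=False)))
print("meritocratic assignment:", round(wage_gap(meritocratic=True)))
```

When assignment is unrelated to ability, the measured gap between job tiers is just the premium; once the ablest people sort into the elite jobs, the same premium and the same return to ability produce a visibly wider gap, even though no one has become more or less productive.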
Well if that is going to be our standard then I have a lot of posts to go and delete.
Sorry, I should have confined my griping about Pinker to one post. I can accept that violence has declined, depending on how one defines the period–I suspected it anyway–but found his analysis deeply suspect, indeed bad enough that I perversely started to wonder if perhaps it hasn’t been a straight decline after all.
That should have been “wouldn’t be surprised”.
Hm… Dr. Friedman, this sounds like an excellent example of a negative feedback loop.
More meritocratic access to high-income, high-status positions means ‘brain drain’ among the lower class. Which drives down wages in the low-status jobs as you describe, and makes the high-status jobs more valuable since they’re more consistently held by meritorious people.
But then, as income inequality rises, the sheer scale of the disparity is unlikely to do meritocracy a lot of favors. Very poor people with limited prospects will have a harder time providing the educations that launch their children into the meritocratic elite. Very wealthy people have more spare resources for signalling games, which inflates the price of entry into the top levels of society.
Society shifts more towards a Victorian model where there’s a ‘glass ceiling’ above members of the ‘lower orders,’ a glass floor under all but the most dissolute and incompetent members of the upper class, and where the distribution of merit shifts accordingly.
I suppose the question is whether this is then enough to trigger a reverse swing of the pendulum: does having more meritorious people frustrated and pushing against a glass ceiling act to increase meritocracy and reduce income inequality? Or does the system tend to “stick” in an unequal state, requiring special outside circumstances to shake it loose and restore a (temporarily?) more equitable state with rising meritocracy?
Here’s my in-depth review of Steven Pinker’s “The Better Angels of Our Nature” in “The American Conservative.” I tried to give a fair accounting of Pinker’s strengths and weaknesses:
https://www.theamericanconservative.com/articles/steven-pinkers-peace-studies/
Interesting review, but I think you have a math error:
1/1024=2^-10, not 2^-100.
But what’s twenty-seven orders of magnitude (plus a bit) between friends.
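For anyone checking that figure: the gap between the claimed 2^-100 and the correct 2^-10 is a factor of 2^90, and

```latex
\frac{2^{-10}}{2^{-100}} = 2^{90},
\qquad
\log_{10} 2^{90} = 90 \log_{10} 2 \approx 27.1
```

hence “twenty-seven orders of magnitude (plus a bit)”.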
@DavidFriedman:
Unless he assumed that without any graphs the book would have sold 2^90 copies. 😉
Fair point. My error.
22-23 orders of magnitude.
I think Newton was likely bipolar. There is some ok but not great discussion here: https://futurism.com/the-madness-of-sir-isaac-newton
This seems to be a classic case of rationalising in modern terms, and since in that frame of reference Newton’s actions don’t make sense, attributing a modern definition of mental illness. Which allows one to dismiss the bits one doesn’t like.
As a seventeenth-century gentleman, Sir Isaac would expect to dabble in a number of fields, as modern academic specialisation is a nineteenth/twentieth-century idea. Note he was also important in reforming the English currency, which is nothing to do with Biblical codes or alchemy or gravity. In fact, looking for patterns and codes in the Bible is a long-standing tradition greatly helped by the fact that the writers and translators of the Bible often wrote in a specific fashion (‘biblical style’) based on repeating patterns and apparent numerological structures, so it was a respected field of study in Sir Isaac’s day, and is not regarded as totally stupid by biblical scholars nowadays.
Basically Newton looking for patterns in biblical text is probably no different from Einstein espousing socialism: with hindsight it looks a bit odd, but it was a rational and normal thing to do at the time.
Great post. In the RIP culture war thread SSC says something like there are people smarter than you who believe heinous things. I think that’s all. Miscalibrated heuristics, underestimating the sheer grind required to arrive at an accurate belief.
If there’s a cognitive style underlying this, I lean towards the Newton-bipolar-dopamine-receptor creativity hypothesis being mostly wrong in this context. I suspect it is more that there are types of beliefs that snowball into knowledge. Newton should be looked at in the context of his time (e.g. peers, the Great Plague). Maybe also something like the Hermetic theory of correspondences or behind-the-veil, conspiratorial-type thinking. I also suspect he gets too much and disproportionate flak here, although this is a handy example. He doesn’t seem that irrational in the context of his time. Like Lucretius thinking people can’t live in the antipodes, or something like that maybe (or Francis Bacon being a geocentrist type thing). A better example might be 20th century quantum physicists and Vedantic mysticism, or Freud, *especially* Wolfgang Pauli and Jung.
https://facebook.com/story.php?story_fbid=493452407852243&id=100015624640913&refid=17
We’re only pretty sure that things can be verified more easily than generated. 😛 https://en.wikipedia.org/wiki/P_versus_NP_problem
Haha, nice one. 🙂
I don’t think we’re even sure – rather the converse.
Gödel’s incompleteness theorem shows that things can be true but unprovable (and equivalently, false but unfalsifiable).
What if the black box spits out a 50:50 mixture of these? It would basically be useless.
Yeah, more seriously/concretely, this argument is dependent on easy verifiability, which is only true in some contexts. If your ideas aren’t scientific theories, but predictions about the future (e.g., stock prices, election results), being right 50% of the time is literally the worst possible performance.
Though (slightly less seriously now) consistently being right only 10% of the time is potentially good, because the opposite of what you say will be right 90% of the time. How useful this is depends on how binary the things you’re trying to predict are (e.g., being right exactly 10% of the time on if a stock will go up or down is really great, being right exactly 10% of the time about where an attack will take place is less useful: 90% of the time you’ll be able to rule out one of potentially many attack locations)
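A throwaway sketch of the flip-the-prediction point for the binary case (simulated coin-flip outcomes, nothing domain-specific; the 10% figure is just the one from the comment above):

```python
import random

rng = random.Random(0)
truth = [rng.random() < 0.5 for _ in range(100_000)]            # e.g. stock up or down
# A predictor that is right only 10% of the time on a binary question...
pred = [t if rng.random() < 0.10 else (not t) for t in truth]
accuracy = sum(p == t for p, t in zip(pred, truth)) / len(truth)
flipped = sum((not p) == t for p, t in zip(pred, truth)) / len(truth)
print(f"predictor accuracy: {accuracy:.3f}")   # ~0.10
print(f"flipped accuracy:   {flipped:.3f}")    # ~0.90
```

With more than two possible outcomes, flipping only rules out the predicted option, which is the attack-location caveat above.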
Scott Aaronson gives the above argument as one of his reasons for believing that P != NP.
For background, the question of whether or not P = NP is one of the biggest open problems in science, with a $1 million prize attached to it. There’s a bunch of technical underpinnings, but roughly speaking P is the set of all problems that are easy to solve, and NP is the set of all problems whose solutions are easy to verify. If P = NP, then every problem whose solution is easy to verify is also easy to solve outright. Or equivalently, there are no problems that are hard to solve but easy to verify. Or equivalently, your bank payments aren’t secure and maybe we don’t need mathematicians anymore.
Scott Aaronson (along with most experts) believes that P != NP for many reasons, one of which is that there are a lot more readers of SSC than writers of SSC.
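To make the verify-vs-solve asymmetry concrete, a toy sketch (brute force only; real SAT solvers are far more clever): checking a proposed assignment against a Boolean formula takes one pass over the formula, while the obvious way to find an assignment tries up to 2^n candidates.

```python
from itertools import product

# A formula in CNF: a list of clauses; literal +i means "variable i is true",
# literal -i means "variable i is false".
formula = [[1, -2], [-1, 3], [2, 3], [-3, -2]]
n_vars = 3

def verify(assignment, cnf):
    """Checking a candidate solution: one pass over the formula (easy)."""
    return all(any(assignment[abs(lit)] == (lit > 0) for lit in clause)
               for clause in cnf)

def solve(cnf, n):
    """Finding a solution by brute force: up to 2**n candidates (hard in general)."""
    for bits in product([False, True], repeat=n):
        assignment = dict(zip(range(1, n + 1), bits))
        if verify(assignment, cnf):
            return assignment
    return None

print(solve(formula, n_vars))   # {1: False, 2: False, 3: True}
```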
On the other hand, “easy” is in a technical sense which does not necessarily mean anything in practice. (Something with complexity n^1000 is in P, whereas something with complexity exp(1e-30*n) isn’t.) As I’ve heard someone put it, “there are two kind of numbers, those smaller or equal to BusyBeaver(Graham’s number) and those larger than it. Only the former are ever relevant to real-world problems, but traditional complexity theory only deals with the latter.”
Sure, technically something that’s O(n^1000) is in P, but those kinds of problems basically never arise naturally. If they did, it would be a bad characterization.
“but those kinds of problems basically never arise naturally.”
I know very little complexity theory, but I’ve sometimes wondered whether this isn’t just a case of the drunk looking for his lost keys under the lamppost since it’s hopeless to find them in the unlit bushes where he actually dropped them. Are there any theoretical or even heuristic arguments for why we should expect interesting, useful problems in P to be in O(n^10) rather than O(n^1000)? And even when we know a priori that some problem is in O(n^3), in many interesting cases n is so damned large the exponent is basically irrelevant.
Linear programming for example is even better than this theoretically, and you can do many useful things with it, but you still can’t use it to plan an economy, so far as I can tell. And moreover, it was many decades (or millennia, depending on your point of view) before linear programming was shown to be in P.
So I can certainly imagine that there are plenty of other hard problems that will eventually be shown to be in P and they’ll still be effectively hard. Extremely intelligent experts tell us that P = NP is very likely false and that it would be a super big deal if it were true. Since they are extremely intelligent and know what they are talking about, they are probably right. But I still don’t get what all the fuss is about.
Is there even a single “real”* algorithm that’s O(n^1000)? What’s the largest k such that there’s a “real” algorithm in O(n^k)? I’d be surprised if it’s >10 (though to be fair I haven’t really investigated this specific issue much).
I’m not sure what this means exactly? If anything, the exponent matters more the larger n gets, no?
Exponential functions grow really fast. n^1000 breaks even with 2^n at n = ~14k. At n=15k, 2^n eclipses n^1000 (n^1000 / 2^n = ~4.37e-340 at n = 15k).
Or, is your point that all n^k are basically equivalent for any k? (I.e., that the difference between n, n^2, and n^3 doesn’t really matter). If so, then that’s consistent with CS theory’s treatment of polynomial algorithms as all being “fast” and exponential algorithms as being “slow”.
* “real” meaning something like “thought to be the best possible at some time at solving some real problem, not constructed to prove a point”
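A quick sketch double-checking the crossover arithmetic above with logarithms, so nothing has to overflow:

```python
from math import log2, log10

# Solve n^1000 = 2^n, i.e. 1000 * log2(n) = n, by fixed-point iteration.
n = 2.0
for _ in range(100):
    n = 1000 * log2(n)
print(f"crossover near n = {n:,.0f}")                    # ~13,747, i.e. ~14k

# Compare the two at n = 15,000 without computing either huge number.
n = 15_000
ratio_log10 = (1000 * log2(n) - n) * log10(2)            # log10 of n^1000 / 2^n
print(f"n^1000 / 2^n at n = 15,000 is about 10^{ratio_log10:.1f}")   # ~10^-339.4
```

which matches the ~4.37e-340 figure quoted above.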
The most accurate ab initio quantum chemistry algorithms scale like O(N^7) with the number of particles being simulated. Off the top of my head I don’t recall any algorithms with a larger exponent that weren’t either exponential or contrived, but that’s probably a limitation of my head rather than of the space of useful algorithms.
The original version of the AKS primality test was O(n^12). However, it was quickly reduced to O(n^6). Anyway, that’s the highest non-arbitrary exponent in a natural algorithm that I am aware of.
I would be extremely surprised if you could find any non-contrived problem where the optimal complexity is O(n^1000).
“Is there even a single “real”* algorithm that’s O(n^1000)?”
Probably not, but maybe simply because *real algorithms* as we know them are made by puny human minds.
I am asking rather for a heuristic or theoretical argument for why there shouldn’t be *real problems* whose optimal solution involved an algorithm running in O(n^1000).
“Or, is your point that all n^k are basically equivalent for any k?”
I’m saying something rather mundane: for large enough n, all algorithms become useless, whether in P or not. An example is linear programming, which is certainly very useful, but quickly runs into practical limits.
A heuristic argument against natural polynomial problems with high complexity is that when a polynomial algorithm with high complexity comes out, it usually quickly gets optimized. For example the original version of AKS had O(n^12) complexity, but that was soon reduced to O(n^6). If you see a polynomial algorithm with high complexity, that usually means that the authors didn’t bother to optimize it (because the real breakthrough is showing that a polynomial algorithm exists at all), not that the problem is inherently hard.
If someone came up with a O(n^1000) algorithm for SAT, I wouldn’t think “oh good, we’re safe forever”. I would think “it’s going to be O(n^10) tomorrow”.
On a side note, I just came across Reingold’s algorithm for undirected connectivity, which, according to Wikipedia, has complexity of O(n^64^32). But Wikipedia also says that nobody has bothered to optimize that algorithm, since in the real world, log space doesn’t matter and we just use the O(n) polynomial space algorithms for that problem.
As for a theoretical argument – most natural numbers are small. As the saying goes, nature counts 0, 1, infinity. It would be very surprising if a non-contrived problem had optimal complexity of O(n^1000) because that is so arbitrary.
Thanks, but I still don’t get it. Maybe this is a cultural divide between people who mostly study algorithms (not me) and people who sometimes try to apply algorithms to solve particular problems and usually find the algorithms wanting (me).
“As for a theoretical argument – most natural numbers are small. As the saying goes, nature counts 0, 1, infinity.”
I guess that’s a joke, but my experience is the opposite of this. I don’t go looking for combinatorial explosion. It comes looking for me. Just about every problem that interests me has a very thin edge between what I can do by hand and what is beyond practical limits.
Concerning the heuristic argument above, that polynomial time algorithms tend to get optimized, a cautionary tale:
From Newton until Poincare, people expected the universe ran like a simple, elegant clock, and that given sufficiently precise knowledge of current conditions, we should be able to predict the behavior of dynamical systems far into the future. The discovery of chaotic dynamical systems showed that expectation to be unrealistic. Chaotic dynamical systems are actually ubiquitous and not hard to construct, and there is no reason in principle that they couldn’t have been discovered hundreds of years earlier. People just didn’t notice them because they didn’t expect them to exist.
So what? The universe doesn’t care about our egos. And it’s not like Mozart, Gauss or Buffett had NP oracles in their brains.
Is that actually a good argument? That’s equivalent to saying: “there can’t be a polynomial solution to the traveling salesman problem because if there was we would have found it.” Maybe we haven’t been clever enough yet (to solve TSP in polynomial time or to figure out how to write SSC as easily as we can read it).
Not finding an X is Bayesian evidence that no X exists, though obviously far short of a mathematical proof.
Very often in research, empirical observations are a useful guide to what sorts of statements might be provably true. For example, if you observe that all X are Y, either there is a theorem that this is necessarily true, or in pursuing this theorem you find some conditions under which X are Y (X+Z -> Y), and some counterexamples (X+~Z -> ~Y).
It’s an observation that NP-complete problems tend to be “harder to solve” than problems known to be in P. Maybe there is some reason for this observation besides P!=NP, but that’s not the way to bet.
Well…
If we had listened to Einstein about war and geopolitics, chances are the weapon his breakthroughs and fame helped to create would have been used to conquer or destroy the Earth in his predicted WWIII scenario.
So… turns out he was terrible when it came to geopolitics, warfare, economics… and really anything except physics. But his physics work was indeed supreme, so I think this only adds to your point.
I have to wonder if the genius-whisperers – the ones who understand the geniuses and figure out which ideas are worth listening to and which ones… aren’t – would perhaps be more valuable than the geniuses themselves?
Probably not, simply because I suspect there are a lot more of them and they’re significantly more fungible.
He was good at philosophy of science, so there is at least one area outside physics itself he was good at (and it is very far from being the case that scientists generally are good at that).
Einstein was right to worry that intensive preparations for war by the world’s military-industrial complexes would increase the risk of war. Think about the Cuban Missile Crisis- a crisis created entirely by the superpowers’ effort to prepare for war against one another, in which we were potentially only a single misunderstanding away from the end of civilization. And at least on the US side, there were a lot of generals actively urging that we have that misunderstanding.
This is a case where history needs to be read backwards. We now take for granted that nuclear war would destroy everyone and is a horrible idea, and if anything fear too much pacifism.
But that’s because living in the Atomic Age has forced a certain degree of anti-war sentiment, reluctance to go to war, and willingness to voluntarily disarm some of our most dangerous weapons. These traits are necessary for the continued survival of our culture or even our species in a world that’s accustomed to living in the shadow of the Bomb.
However, in 1950 we had no such inoculation against such ideas. World wars seemed a natural and inevitable state of affairs to people who had come of age in the early to mid-20th century. War was viewed as a bad thing, but not an avoidable thing, and there was an understandable post-WWII conviction that you really, really couldn’t ever have too many weapons, as long as you could pay for them.
The thing is, as I mentioned in the paragraph before last… We had to get rid of those ideas if we were going to survive. We had to be capable of limiting the scale of nuclear arsenals, choosing to not use them even when there were risks, and making it official policy to simply never use a nuclear arsenal at all except in the most dire imaginable situations.
The military officer corps of the 1950s, and the society of recent World War Two veterans who largely trusted them, were endangering the world by being too gung-ho about the prospect of future warfare. That didn’t mean the optimal solution was unilateral disarmament, but intellectuals living in that time had a positive duty to warn society at large that a third world war would be horribly, horribly worse than the second or first ones. Even the work of nuclear war theorists like Kahn (who was hardly an advocate of disarmament or quietism in the face of Soviet aggression) emphasized this. I don’t know if you’ve ever read *On Thermonuclear War*, but it’s a good work.
Even closer, on the Soviet side – but I doubt anyone here needs reminding of Vasily Arkhipov.
A slight aside: openness is highly correlated with creativity, but also with general intelligence (.30) and with schizotypy (.29). So you quite often get a package deal of brilliance + madness. Or as Kanye put it, “name one genius that ain’t crazy”.
I read about this in a book by Geoffrey Miller. If I remember correctly, he said there was no correlation between schizotypy and creativity, once you control for openness. In other words, it’s just something that happens to come along with that personality trait, and doesn’t actively contribute to creative breakthroughs.
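For reference, “controlling for openness” here means computing the partial correlation; by the standard first-order formula, the schizotypy-creativity correlation vanishing just means the raw correlation was roughly the product of each trait’s correlation with openness. (S, C, and O are merely labels for schizotypy, creativity, and openness.)

```latex
r_{SC \cdot O} \;=\; \frac{r_{SC} - r_{SO}\, r_{CO}}{\sqrt{(1 - r_{SO}^{2})(1 - r_{CO}^{2})}}
\;\approx\; 0
\quad\Longleftrightarrow\quad
r_{SC} \;\approx\; r_{SO}\, r_{CO}
```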
> If you took away Eddington, someone else would have tested relativity; the bottleneck is in Einsteins.
I am pretty sure that if Einstein hadn’t existed, someone else would have come up with relativity no more than a decade later. Einstein based his theories on existing work, and there were other physicists who could connect the dots, though probably more slowly.
And I don’t think a 1% predictor would be very useful, unless it generates particularly testable predictions. Most scientific theories do not give specific recipes for, say, a cancer cure. They give deep explanations. How many interpretations of quantum mechanics are there? 15? 20? Would another one with a 1% chance of being true help?
A different way to apply the same principle: take low cost risks more often. Ideally restrict this to bets where you can win big, but it’s a good idea to make a habit out of it just to breed some iconoclasm and independence into your personality.
A different Scott A(dams) would disagree with you:
He’s not quite talking about the same thing, but his post may be more relevant here than is readily apparent.
I put to everyone here that we already have “a black box that generates scientific hypotheses” – it’s called the internet. Go on a trawl and you’ll have ideas to test for days. Better roll up your sleeves and get cracking…
… or not, because a whole lot of those ideas are, at first glance, well… crazy.
There’s a bunch of stuff worth remembering about Einstein, say. First, he didn’t just propose the idea of relativity (special or general) – he worked out the implications behind those ideas and tied the whole thing in with the existing body of knowledge. That’s why people were willing to take him seriously enough to actually test his hypothesis.
Second, it’s not like we even needed Einstein. Henri Poincaré had worked out a fair amount of the same stuff independently. Even if you don’t agree that Poincaré would have developed the same theory in the absence of Einstein, someone else surely would have, because all the work done previously was pointing in that direction.
Kinda like Newton standing on the shoulders of giants.
Speaking of Newton, there was that row he had with Leibniz over who invented calculus. It seems that great minds do, in fact, think alike.
Darwin and Wallace is such an obvious example it scarce deserves mention.
The broader point is that “original” ideas seldom are, and that a high-error hypothesis generator isn’t the blessing it appears to be at first glance. Filtering mechanisms that allow us to differentiate between the breakthroughs and the cranks are where it’s at.
I admit that this doesn’t really touch the point of the latter part of your post, but – truth be told – neither does the first part of it. The proposition that speech should be evaluated separately from the speaker is neither novel nor particularly controversial. There’s even a named fallacy for it.
“I put to everyone here that we already have “a black box that generates scientific hypotheses” – it’s called the internet. Go on a trawl and you’ll have ideas to test for days. Better roll up your sleeves and get cracking…”
The percentage of the black box’s hypotheses that are utter bunk is not a feature; it is a cost. Having a box that generates only bunk is different from one that is guaranteed to generate some percentage of true hypotheses. The part of the Internet that you are referencing is composed of depleted hypotheses: ‘crazy ideas that might just work’ after all of the ideas that actually do work have been removed.
I’m referencing the whole of the internet that can be said to “generate hypotheses” (for simplicity’s sake, we can discount all the ideas that have already been shown to work). This includes everything from pre-print servers to people’s blogs and random 4chan posts.
To characterise this as a “box that generates only bunk” would be wrong: a lot of the stuff on arXiv and such is proper scientific work, that may simply not have gone through the whole publishing song-and-dance, yet. Moreover, it is by no means clear that valuable insight is never to be found in less obvious places.
At the same time, much of the stuff on arXiv just isn’t particularly valuable, epistemically, and the rest of the internet is worse.
Why wouldn’t you think that this fits Scott’s definition of a “low probability valuable hypothesis generating black box”? I’d say with a high degree of certainty that it generates some percentage of true, hitherto not commonly known, hypotheses.
We seem to be in agreement that such a box isn’t nearly as valuable as Scott would suggest.
The key difference is that the internet has unknown odds, and probably far worse than 1-in-1,000 odds that any given idea is as significant as relativity.
I would say we only get something as significant as relativity, across the globe, maybe once in a decade (but feel free to disagree). If we assume even a million scientists in the world, that means it takes 10,000,000 scientist-years to produce one relativity.
If it takes 10 scientists a year to test a theory, and your box is one-in-a-million, it just barely breaks even with our current system. I would currently rate the odds of “there is an undiscovered genius on 4chan and they have published the next theory of relativity” as significantly lower than that, so it’s probably not a productive use of time.
Conversely, the example of a one-in-a-thousand box is a thousand times more effective than our current system – imagine the next millennium of progress occurring in a year.
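For what it’s worth, here is that break-even arithmetic spelled out – a minimal sketch using only the guesses above (one relativity-grade discovery per decade, a million working scientists, ten scientist-years to test each hypothesis):

```python
# Back-of-the-envelope comparison of a hypothesis box vs. the status quo.
# All numbers are the rough guesses from the comments above, not data.

SCIENTIST_YEARS_PER_RELATIVITY = 1_000_000 * 10  # ~1M scientists, ~1 big discovery per decade
TESTING_COST = 10  # assumed scientist-years needed to test one hypothesis from the box

def scientist_years_per_discovery(success_rate: float) -> float:
    """Expected scientist-years of testing per true, relativity-grade hypothesis."""
    return TESTING_COST / success_rate

for rate in (1e-3, 1e-6):
    cost = scientist_years_per_discovery(rate)
    speedup = SCIENTIST_YEARS_PER_RELATIVITY / cost
    print(f"box success rate {rate:g}: {cost:,.0f} scientist-years per discovery "
          f"(~{speedup:,.0f}x the current system)")
```

On these assumptions the one-in-a-million box just breaks even and the one-in-a-thousand box is a thousandfold improvement, exactly as claimed above.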
If you want to put it like that, it becomes a fairly straightforward ROI computation: assuming we have a magical fixed-odds box, where we know the odds beforehand, we can simply estimate the anticipated value of a true hypothesis, calculate the expected value given the odds and thus know what kind of testing expenditure is justified.
Unfortunately, this introduces a new layer of magic: to do this, we would have to know the unknown. It’s fine and dandy when we’re considering requests like “Box, give me a cure for cancer”, but how do you estimate the value of something like the theory of relativity if you don’t know it and haven’t had the opportunity to explore its consequences?
Fortunately, it’s beside my point, which is: the real value is in the filtering mechanisms that allow us to separate the genuinely valuable ideas from the just plain nutty. The faster, more reliable, and cheaper this mechanism, the more ideas we can process.
The best part is: we don’t need magic boxes.
There is no dearth of ideas that may or may not be true. Ideas are a dime a dozen.
Afterword:
Your key premise appears to be that the box is guaranteed to produce very valuable, true hypotheses – and that makes it significant, despite the fact that the only way to find out which of the generated hypotheses are actually true and valuable is to test all of them.
I posit that this description is true of humanity in general. It’s not like we got gravity or relativity out of a magic box.
After-afterword:
If wishes were fishes, we’d all cast nets. It would be luverly, but since we don’t have this particular flavour of magic box, nor can we realistically expect to obtain one, we can simply ignore all possibilities associated therewith.
There’s been at least one example of 4chan making a scientifically relevant discovery:
https://www.theverge.com/2018/10/24/18019464/4chan-anon-anime-haruhi-math-mystery
There is a difference between ideas and scientific hypotheses, or to be more precise, scientific hypotheses are a very special type of ideas.
If Einstein had just said “Gravity might be some sort of non-Euclidean distortion of space-time”, it would have been one of these Internet crackpot-tier ideas. If he had just said “Light might be made of particles” – well, duh, Newton said it before, and the Greek atomists probably did too. The key point is that Einstein actually did the math and condensed these ideas into scientifically testable hypotheses. That’s the “execution” part that the Third Scott A was talking about.
That only narrows down which particular ideas you’d be looking at. There’s still a lot of them out there.
Edited for clarity:
Consider a scale between a statement like “the government is spraying mind-controlling chemicals from airliners” on the one end and Einstein’s papers on relativity on the other. Somewhere between the two lies Rooter.
You can use numerous heuristics to eliminate Chemtrails Guy – and his ilk – without much time or effort. Rooter doesn’t appear obviously bad at first glance, but reveals its joke character on closer reading. Once you get past those simple examples, however, you are faced with a sea of seemingly serious papers that could potentially contain something valuable.
That’s funny, because I think of Scott Adams as one of the top “many crazy ideas, but many valuable and original ones” thinkers in his area of interest.
If the internet were an object it would undoubtedly be the most valuable object in the world.
“Ladies and gentlemen, I present to you: the Internet!”
This seems to line up with my experience of being a grad student: idea after idea that sounds great, then I try to test it, and it’s either impossible to implement or the results table might as well be replaced with “¯\_(ツ)_/¯”.
Of course that could just mean that my “black box factor” was pretty low, and a few of the researchers I most admired probably could spit out a game-changer once out of every few dozen ideas they tried. But there was another big group of highly successful students who were clearly churning out reams of stupid ideas, but also had the ability to test and reject their own dumb ideas at insane speeds, so from the outside it looked like their genius was just as concentrated as the first group—r-strategy researchers, if you will. Did Einstein/Newton/Darwin achieve success by having an unusual fraction of good ideas, or were they just faster than others at culling out the bad ones?
To go along with your point is the example of Thomas Edison, who said something along the lines of “I have not failed. I’ve just found 10,000 ways that won’t work.” and “Nearly every man who develops an idea works it up to the point where it looks impossible, and then he gets discouraged. That’s not the place to become discouraged.”
Of course, he only managed to create entire industries based on his patents and inventions, so what did he know, right? 😉
Different fields also have different levels of difficulty in formulating, understanding, and testing their interesting ideas. At one extreme, you’ve got some bit of theoretical physics where it takes a very smart person a decade of concentrated work to get to the point where they can really understand the current set of theories, and testing your novel theory might require some just-barely-possible experimental setup costing a billion dollars. At another extreme, a smart person can get to an “edge” and start doing new work with a couple years of study, and the field is still rich with relatively easily tested ideas. (I think a lot of computer science is like this.)
I think those of you pointing out Einstein’s mistakes are getting Scott’s example wrong: it’s not about being wrong (as Einstein sometimes was), it’s more about pursuing silly – in hindsight – stuff or not.
We don’t have enough hindsight yet to tell what stuff Einstein pursued is objectively silly in hindsight.
High openness and high discernment are not necessarily correlated.
This is definitely somewhat of a tangent, so apologies for that, but nevertheless…
This gets at why I don’t much care about the replication crisis. Or, more accurately, I’ve seen the numbers people quote, and they strike me as firm evidence that psychology (or whatever scientific field you choose) is doing magnificently well. If we can trust 50% of published results in a field, then we should be shouting this from the rooftops as one of humanity’s greatest triumphs. It is one of these boxes.
I suppose one might (correctly) argue that I’m overstating the analogy, as the replication crisis is mostly about hypothesis-testing rather than hypothesis generation. But that (admittedly large) issue aside, one of the things a scientific field is about is a way of determining what to believe about a facet of reality. If it has a 50% success rate, holy shit, that is orders of magnitude better than literally any method of knowledge generation humans had ever implemented until, being extremely generous here, a hundred years ago.
I firmly agree that there are certain specific problems in the field that recent discussions about the replication crisis have identified. Various ways of misusing statistics, the rarity of pre-registered studies, perhaps suppression of controversial ideas, etc. My personal favourite, which I don’t think gets discussed nearly enough, is the fact that openly exploratory work is extremely hard to get credit for, either in terms of publications, funding, or grants. This means that much of what is actually exploratory gets treated as if it were hypothesis-testing (which I think lies at the root of several of the other problems).
To the extent that solutions to these problems can be identified, they should be adopted by the field, because of course 75% is better than 50%, and hell, 51% is better than 50%. But 50% (or 30%, or whatever) is still incredible. And our understanding of the fields should reflect this.
I’ve been thinking about this for years now, probably written about it here once or twice, and I’m working on a longer piece arguing the ideas in more detail. But that’s it in a nutshell, I think.
EDIT: Re-reading this comment it strikes me that a lot of it sounds familiar, so maybe I’m parroting stuff that has already been said? Possibly even here? If so I apologize, it was not intentional, and I genuinely don’t remember any specific argument to this effect.
The replication crisis is a big deal because
1. Publication isn’t idea generation, it’s idea generation + some level of testing. Having a replication crisis means that this level of testing isn’t dis-confirming bad ideas at a high rate. It also might imply that some replicated publications are also false.
2. People are getting jobs based on publishing papers that don’t hold up, and are going to be teaching the next generation. Rather than weeding out bad ideas the process is propagating the teaching of them and rewarding people who come up with them.
Also, these published ideas are then used to drive policy decisions, or do other expensive real-world things. If stereotype threat and implicit bias and priming are real things, that matters a lot for how we should run schools and do policing. If your adult intelligence is largely determined by how many words you hear before the age of 5, that has strong implications for parenting, and also for stuff like universal pre-K or Headstart. If schizophrenia is caused by parenting style, we want to know that to know what kind of parents we should be. And so on.
50% (which is the figure I’ve seen thrown around a lot) is a vastly higher rate of correct rejection than has ever existed in human history, prior to very recent times.
I think we get hung up on the fact that there are two options (true or false) and 50% implies that the output of psychology (or whatever) is no better than chance levels. But this is only the case if each of the options is equally likely in some sort of random idea-generation scenario. And I think this is clearly not the case: the chance of a random idea about human psychology (or even a random idea that is commonly articulated about human psychology) being true is probably well under 1%. So increasing that by 50 or more times is worth almost any downside, if you’re interested in improving human knowledge.
I do agree the incentive structure in science is messed up, and there should be a lot more emphasis on clearly delineating exploratory and observational work from true experiments. But this will require a concurrent move of many sciences to prioritize exploratory and observational work much more than they currently do. Fortunately this is less of a shift than you might think, since a huge proportion of published work in virtually any field I’m aware of (possibly other than physics) is exploratory and/or observational, and masks itself as true hypothesis-testing because that’s what the incentive structure of science rewards.
50% isn’t a correct rejection rate, it’s a false confirmation rate. Publishing in a peer-reviewed journal is supposed to mean that your analysis has been vetted by experts.
The crisis is not that “we went through all the published studies and these are the good ones, these are the bad”, it’s “we went through a tiny subset of all studies and found out that if you cite a study you are as likely citing a false study as citing a correct one.” This means all studies, minus a handful that have been replicated multiple times across multiple decades, are suspect in the field.
Apologies, you’re right that it’s a false positive rate, not a correct rejection.
I stand by everything else I said. I’m comfortable with 50% false positives in an immature science, so long as we’re explicit about it (which we admittedly aren’t, most of the time).
50% false positives means 50% true positives, which is, once again, the best rate that existed in any domain of human enterprise until a century ago.
Yes, all studies, minus the handful that have been replicated multiple times (preferably by different labs) are suspect. This is the correct attitude to take.
Quoting myself…
“Suspect” in that they have as good a chance of being wrong as of being right. But they have a vastly higher chance of being right than random ideas about human psychology that are not part of the field, which is the appropriate comparison point.
the chance of a random idea about human psychology (or even a random idea that is commonly articulated about human psychology) being true is probably well under 1%.
I would think that any non-random “idea that is commonly articulated about human psychology” is likely to be true. On the other hand, things that get into psych journals are generally ideas that ordinary people don’t come up with, ideas that are “original” and “interesting”. Things that allow experts to “pull back the curtain” and tell you “what’s really going on”.
If they are true, it’s new knowledge. If only 50% are true but they are all presented as true and proven, there’s probably a net subtraction from human knowledge.
Or to put it a different way, the appropriate comparison point isn’t all random conjectures. It’s what is commonly believed.
Saying the psych literature is better than idiocy is not saying anything important. Saying that it increases our true knowledge more than it increases our wrong knowledge is important. The “replication crisis” says the delta is shockingly small.
This seems like it would make sense if we had a whole system devoted to testing the published results of the field, and were treating every published report as a hypothesis. Maybe replication sort of accomplishes that, but my impression is that once someone publishes a study it’s being treated as far more likely than a hypothesis.
I think if we actually were treating published results as hypotheses, we’d need to make plans to test 100% of those published results for replication (or immediately discard them as not useful enough to bother testing, and not treat them as true). Do we do that?
That’s pretty close to what I’m saying. “Hypothesis” isn’t quite the right word here, because it’s already reserved for the thing that happens before true experiments. What we should treat papers as, I’d argue, is something more along the lines of “results that I believe generalize outside my lab”.
And, effectively speaking, this is what most people inside a field do for any results that haven’t been convincingly replicated. However (a) this is very poorly communicated outside the field, and (b) even inside the fields, the incentives (publications, funding, jobs, etc) are structured so that you have to pretend you’re saying “this is proven” when what you really should be saying is “here is some good evidence that I honestly believe supports this interpretation”.
So I do believe there needs to be a re-focussing of immature sciences to focus more on accurate observation and exploration, prior to hypothesis-testing. But the ideal system I’m envisioning would actually change the current publication base very little (save that it would add more replications and pre-registrations). It’s more that you’d just slap a “not completely verified” sticker on any non-replicated results. And no, we don’t do this nearly well enough, at least not in my field.
I think you are being far too charitable.
First, as other people have pointed out, this is not merely “idea generation”; it is supposed to be idea vetting as well. If there was a 50/50 chance “implicit bias” was real, then thinking of that idea is indeed a pretty interesting thing. But then you do the testing and you say this idea is indeed 95% likely to be true. And then people look at your testing and concur. This is Einstein + Eddington being wrong 50% of the time.
Second, most psych studies do not produce things that are all that valuable. Learning that implicit bias is true would produce little value to society. Indeed, I would argue that most of the debunked studies were actively harmful before they were debunked. This demonstrates an inherent flaw in the field that is compounded by the publishing of incorrect results.
Third, as others have pointed out, careers are decided by this sort of thing. People get into powerful positions based on publication.
I’m not fully sure I understand your first point. As for the second, my response would likely be too culture war-y to make it a useful discussion.
Clutzy:
I’m pretty sure we’d be better off knowing the truth about whether implicit bias is meaningful (say, whether those tests actually predict anything about real-world bias in behavior), regardless of which answer to that question we think would be socially beneficial.
The first point is that, yes, it would indeed be very impressive if people came up with ideas in psychology and 50% of them were true. But that is not the case. 50% of the “tested and vetted” ideas are true (or untrue, if you care to state it that way). This is not a good percentage, and this is not some recency bias. Vetted ideas and works have always been better than that traditionally. It’s akin to taking your busted jacket to a trained tailor and 50% of the time he fixes the rip, and the other 50% he lights your coat on fire.
The second point is not culture war-y in essence, although you can take it that way. It is simply commentary on how things that are “not intuitive” get published more and get more attention in psychology. However, things that are not intuitive are disruptive to act on, and acting on a non-intuitive claim that turns out to be wrong is extremely costly.
I think that’s literally the best percentage of accurate new ideas in any field of knowledge production in all of human history, with the exception of mature sciences in the past 100 years. Since psychology is not a mature science, it’s foolish to expect better.
I’m not sure what you mean by “vetted”. If you mean “have had appropriate replication experiments done that confirmed them” then, yes, by definition that will have a vanishingly small false positive rate (assuming we live in a lawful universe, which we appear to). However there is no traditional example of that that I’m aware of. The only fields of knowledge production where appropriate replication experiments have been performed are mature sciences in the past century or so. If you could provide an explicit counterexample it might help.
It’s much, much worse than this. If something doesn’t replicate then there is a wide range of possible reasons, which includes a failure on the part of the replicators. We also don’t have a good grasp on how easy it is to replicate a false outcome, so we have this blind rabbit hole of what a replication means and what a failure to replicate means.
If something has not been replicated, its probability of being true is, let’s assume, 50%.
If something has been replicated 1000 times by 1000 different research groups, I think we can agree that its probability of being true is as close to 100% as anything.
If something has failed to be replicated 1000 times by 1000 different research groups, I think we can agree that its probability of being true is as close to 0% as anything.
If something has been replicated 1 time by the same research group that published the original finding, I think we can agree that we should be more willing to trust it than before it was replicated, but probably not that much more.
Other cases lie somewhere in between. Such is life. Note that this is true of all sciences (although the original base rate of truth might be higher or lower, depending on the field and its maturity).
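A minimal sketch of that updating logic, with likelihoods I have made up purely for illustration (here a real effect is assumed to replicate 80% of the time and a spurious one 20% of the time; nothing in the comment fixes those numbers):

```python
def posterior_true(prior: float, successes: int, failures: int,
                   p_rep_if_true: float = 0.8, p_rep_if_false: float = 0.2) -> float:
    """Bayesian update on 'this finding is true' after independent replication attempts."""
    like_true = (p_rep_if_true ** successes) * ((1 - p_rep_if_true) ** failures)
    like_false = (p_rep_if_false ** successes) * ((1 - p_rep_if_false) ** failures)
    return prior * like_true / (prior * like_true + (1 - prior) * like_false)

print(posterior_true(0.5, 0, 0))    # 0.5   -- unreplicated
print(posterior_true(0.5, 1, 0))    # 0.8   -- one success helps, but not that much
print(posterior_true(0.5, 10, 0))   # ~1.0  -- many independent successes
print(posterior_true(0.5, 0, 10))   # ~0.0  -- many independent failures
```

(A replication by the same lab that produced the original finding deserves a much weaker likelihood ratio than the 4:1 assumed here, so it would move the posterior even less than the 0.8 shown.)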
It’s worth remembering that some of the effects that fell to the replication crisis (particularly priming) were considered to be really solid by a lot of the researchers in the field. This wasn’t “someone came up with a new theory and it turned out to be false,” it was “someone came up with a new theory, it became the basis for a big branch of social psychology research, lots of papers and experiments seemed to verify its basic correctness, it was taught to generations of students as solid, but then once people began to dig into it and really try to replicate the results, they completely fell apart.” (Though I believe there was also some serious scientific fraud in the beginning of priming research.)
I’m really not sure what bee people here have in their bonnets about “priming”. Whatever it is, there are clearly some issues with their interpretation, to say the least.
Priming refers to systematic changes in response times in response to certain categories of stimuli. Probably the earliest demonstration of priming was by Helmholtz (yes, that one), and it is one of the most reliable tools in the research psychologist’s arsenal. Conservatively, I’d say there have been thousands of different kinds of priming reported, many of which have been replicated literally hundreds of thousands of times, as many of them form a fairly standard part of an intro psych course precisely because they are so reliable. Look up things like the Stroop Task for a primer (ha ha) on this.
I think, reading between the lines, that what you (and others) actually mean when you say “priming” is the Harvard Implicit Association Test. Which was first described in 1998, so I have no idea where you’re getting “generations” from. In this, people have to make a response of some kind to paired stimuli, and usually one stimulus is a word and another is a picture. For example, dark and light skinned faces might be paired with words with positive and negative connotations. And the finding is that a pairing of dark faces and negative words leads to faster response times than a pairing of dark faces and positive words. There have been hundreds of variants of this, including various different stimuli, different subject groups, and different manipulations of the subjects prior to the test (like having them watch MLK speeches or whatever).
The basic IAT effect is one of the best replicated ones out there. Again, this has been shown thousands of times, if not hundreds of thousands.
What you are actually objecting to (or should be), I think, is the claim that one’s score on particular IAT variants is directly linked to important real-world traits such as hiring practices or overt acts of racism. Which has been claimed, but is currently a matter of extreme controversy with important failures to replicate some of the claimed effects.
I have no idea where you got the multi-part story you’re telling about priming, however. The IAT is only 20 years old, and I think the first results that tried to link it directly to real-world outcomes are less than a decade old. I have no idea why you think this linkage is clearly false; as we’ve been discussing, a failure to replicate can have many causes, and the debate is still live.
With all due respect, unless I’m very much mistaken about what you think you’re talking about, your claims of fraud are just ridiculous and you should be ashamed of yourself. Who committed fraud? Of what nature? Do you have any reason to suspect this is true other than that you don’t like the implications of the results that you believe were reported?
If you’re making the more reasonable claim that maybe people got over-excited about the IAT results and used them to make some unjustified claims that will almost certainly turn out to be false, then you’re on more solid ground. I’d stick with that, if I were you.
You keep saying things like this, but it doesn’t make any sense to me at all.
Guy has an idea for a new bridge, engineers vet it and find it structurally sound, and it doesn’t collapse 50% of the time.
Guy invents a new drug, tests it out in a trial, and it’s safe. It doesn’t start killing people 50% of the time.
The only time 50% is a good ratio is when you are competing, like if 50% of your draft picks become really good players, or 50% of your tech startups become unicorns.
This is a nonsensical statement. If the base rate is 0.1%, then 50% is an incredible improvement. That’s what I’ve been suggesting are appropriate kinds of numbers for, respectively, the chance of a random statement about human psychology being true, vs the chance of a random statement about human psychology from a published paper in a decent journal being true.
What proportion of published claims in 19th-century physics or chemistry or biology would you say were true (or even coherent)? Something changed as the fields matured, and we began to be able to rule out entire domains which had previously been hugely important research areas. Phlogiston, caloric, the vital essence, etc.
There’s no way to produce knowledge without going on massive wild goose chases. Because sometimes we can’t know we’re wrong until generations have passed. That’s just the way life is.
Are you claiming that all the physicists who believed in the ether were frauds, charlatans, or incompetents? Because that’s not a good line to take.
There are undoubtedly a lot of ether-equivalents in modern psychology. Tools that can root them out more quickly are desirable. But the idea that it’s somehow shameful that they exist, or that we should pass some kind of moral judgement on the people who believe them, is fundamentally silly.
You are conflating musings and speculation about what a person thinks could be true with assertions that a thing has been proven significantly more likely to be true than not true.
You’ll have to be more specific about what “things” have been asserted to be true for me to understand your point. The only thing you’ve mentioned so far is “implicit bias”, and I’m not sure precisely what you mean by it, or what you think actual scientists have claimed about it in research papers. But I suspect it’s something along the lines of what albatross11 calls “priming”, in which case see my lengthy reply on that.
I’m talking about priming research in general. I should be clear here: I’m an outsider who tries to read some papers (my favorite non-major class in college was a social psych class where we read a bunch of papers, and I remember reading about priming back then). So I could be misunderstanding a lot of nuances here.
This is a nice article describing some of the issues that have been raised, and it also includes links to articles in Nature about a fraudulent researcher in the field who was unmasked.
This is a webpage that talks about these issues in some depth.
This is another article, talking about a large set of studies in psychology that have failed to replicate.
These issues are also discussed in some depth on Andrew Gelman’s blog, and were discussed in Gelman’s appearance on the EconTalk podcast, and also in Kahneman’s appearance on Conversations with Tyler. Gelman also wrote a wonderful unpublished paper called The Garden of Forking Paths discussing ways that researchers often unintentionally helped themselves to huge numbers of degrees of freedom that they couldn’t adjust their p-values for, because they didn’t even know they’d done it.
This blog post by a major researcher in social psychology is heart-rending and honest and makes me think that the future of the field is pretty good, even if the past is full of confused results that nobody can be sure of.
Among these are things written by several top researchers in psychology and statistics, who definitely are experts. This makes me think that I’ve understood correctly that the replication crisis is a huge deal, and that priming research has been hit very hard. But like I said, I’m an outsider trying to understand things I have little training in – if I’m misunderstanding something, I’d like to know it.
albatross: thanks for the links. I’ll look over them in more detail.
Jesus fuck – I just spent twenty minutes writing a detailed edit to my previous response, I noticed I had a minute of editing time left so I quickly finished and pressed “save”, and now it tells me I was out of time. Sigh. Oh well, this is perhaps a more coherent version.
OK… those are all really good links (well, the articles are, I didn’t listen to the podcast but the names you gave are all ones I recognize, at least, and given the quality of your other links I’ll assume it’s similar). I’ve read most of them before, I think, though long enough ago that I’ve forgotten many of the details. I should have realized who you meant when you mentioned the fraud, and apologize for any unnecessary harshness.
Importantly, none of those links tell you that “priming” is fake, under whatever interpretation you want to place on the word “priming”. I promise you, there are many, many kinds of priming that are clearly, obviously real if you look at the quality of data collection and the number of times related data have been collected. Even *gasp!* forms of social priming as described in those links.
Furthermore, several of those links explicitly note that a relatively high rate of false positive findings is a necessary consequence of science being done properly when the field is not yet well-defined. To quote the NobaProject link:
To the extent that there are ways of decreasing false positives without doing damage to the field, they should be implemented. Several of your links mention ways of doing so, some of which I’d alluded to above: genuine career-related rewards for replication studies, formal pre-registration of studies with guaranteed publication, a much higher tolerance for explicitly exploratory and observational studies that do not need to be couched in the language of hypothesis testing, much less emphasis placed on flashiness of results, career-related rewards for peer review, etc.
But even if we implement all those perfectly, we should still expect a relatively high rate of findings that are ultimately shown to be non-generalizable. I don’t know what the precise figure is, but I’d guess it’s in the 25% ballpark. Because we really, really want people to be able to do interesting things that might pan out, and to describe what they’ve done, even if those findings are perhaps not rock-solid (provided they’re as honest as possible about the limitations of these findings). Otherwise we will hobble the ability of the field to discover truly weird and interesting things.
That being said, I’m happy to admit there’s a lot wrong in social psychology, and they deserved a bit of a spanking. Mostly because I’m not a social psychologist.
Complete amateur here, but one who enjoyed cognitive psych in college and occasionally reads about it still — I think that the priming research on speed of recognizing words or symbols is still considered fine: if you have been shown flashcards reading DOCTOR, NURSE, DRUGS, then you will be several hundred milliseconds faster in reading HEALTH than HEARTH.
The questions are about research claiming longer-lasting and larger effects: that someone who saw such words would be more likely to report illness in a survey an hour later, for example. Stereotype threat is a particular example of such a proposed long-lasting effect, where priming a stereotyped social role is supposed to affect test-taking behavior over the course of an hour.
My personal take is that the fact that social psychology is going through this crisis now means that in the future, their results are likely to be much more reliable.
I think the thing you expect to see in well-done science is that you have “false positives” either because the researchers just get lucky, or because someone screws up unintentionally.
As an example, there was a bunch of research done on some kind of mouse retrovirus (XMRV) that was believed to be involved in some human cancers and also in chronic fatigue syndrome–after some people dug more carefully into the data and samples, they discovered that the virus wasn’t present in humans at all–it was a common lab contaminant in labs that used mice/rats for experiments. All the reported results were apparently honest errors that involved differences in which labs had the contaminant, and sometimes in when the contaminant arrived in the labs, plus garden-of-forking-paths type errors. This whole story was covered on the excellent TWIV podcast many years ago.
@David Speyer: agreed.
@albatross11: yes, I think the changes that have happened in psych (and are still happening) are mostly positive, and that the rate will improve. Just… less than you might think.
I don’t think we’re arguing any more? Thanks for the pushback and forcing me to clarify.
Just as a starting point, I am going to point out that I am not the person who is making an extraordinary claim here, you are. You are making the claim that it is objectively good that a field “confirms” theories that have been shown to be about 50% untrue. You are the one saying P-hacking is good and pre-registration is bad.
I would not object to a “Journal of Psychology Ideas” which would be like a theoretical physics journal that lets people publish various thoughts on string theory, supersymmetry, etc., with all their mathematical proofs. And no one cares if 99/100 of those are wrong. What I am objecting to is that psychology takes an idea (which no one objects to testing) and tests it. Then they publish the test, and the test is unreliable because of P-hacking and a corrupt practice that lets you test dozens of variables at once and then claim one of them is “significant” only after doing the test.
If you continue to want specific examples of important findings that haven’t held up:
Smiling makes you happier
Self Control as a limited resource
Whether kids eat a marshmallow at 5 determines their future
Google and other search engines have reduced memory
And some of these are the reason I said it’s kind of silly to pretend these are important, but people have treated them as important, and they aren’t replicable. And the biggest problem is a lack of pre-registration along with P-hacking. I could run a battery of tests on preschoolers, and inevitably I would find a result with p<.05, and then I could use that to push (or not) a theory about how preschool is awesome. This is not an objective good, it is an objective bad.
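For what it’s worth, that “battery of tests” problem is easy to demonstrate by simulation. This is a minimal sketch under invented assumptions (20 unrelated outcome measures, 30 kids per group, no true effect on anything):

```python
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(0)
n_experiments, n_measures, n_per_group = 1000, 20, 30

hits = 0
for _ in range(n_experiments):
    # Two groups, 20 unrelated outcome measures, and no real difference on any of them.
    a = rng.normal(size=(n_per_group, n_measures))
    b = rng.normal(size=(n_per_group, n_measures))
    pvals = ttest_ind(a, b, axis=0).pvalue
    if (pvals < 0.05).any():
        hits += 1

print(f"At least one 'significant' result in {hits / n_experiments:.0%} of experiments")
# Roughly 1 - 0.95**20 ≈ 64%, even though there is nothing to find.
```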
@Clutzy:
So I’m going to tap out after this one, I don’t think you or I are having a particularly productive discussion, and I feel bad for riding my hobbyhorse all over Scott’s not OT page. But just one point…
You’re confusing findings and interpretations of findings. For example, strictly speaking, the marshmallow test findings are vindicated. That is, children able to resist eating a marshmallow for longer do better in later life. This result has been nicely replicated.
The interpretation cast on these findings in the original paper, however, has been cast into doubt, by controlling for factors that the original researchers did not.
So… this is an example of theories about data being challenged by new data. That’s, you know, how science is supposed to work. And not even a tiny bit an example of the replication crisis.
With all due respect, I’m not sure your other examples are going to serve you much better. I’d still love to know what you think the “implicit bias” finding is and how it’s been reliably disproven, but that should probably wait for an OT. You have yourself a nice night, sir/madam.
EDIT: I didn’t notice this when I skimmed over your post, but I’d just like to point out that I explicitly said pre-registration is a good thing, and I promise I’ve never said p-hacking is good.
The point of an experiment is not to report what happened to the experimental subjects, it’s to use what happened to them in order to reach a conclusion about a broader group.
If the result of the experiment you are discussing was “among the children we tested, those who were willing to put off consuming the marshmallow ended up with better jobs,” that’s interesting only to the extent that it reveals a pattern among people in general. If the experimenters failed to control for a variable that correlated with their independent and dependent variables and for which they had data, and controlling for it eliminates the effect, then however accurate their report about their subjects, it didn’t support the conclusion they reached.
Isn’t that point obvious? If a result is obtained by p-hacking, it’s still a true statement of the experimental result–it’s just that the implication the experiment is supposed to support isn’t actually supported.
Enkidum:
Yeah, I think we’re in broad agreement.
I have this suspicion that a lot of the replication crisis was driven by the proliferation of powerful statistics software. Lots of researchers got access to sophisticated statistical tools they didn’t fully understand, and could try several approaches with these tools until they found one that seemed to be fruitful. In the days when researchers were limited to statistical tools they could compute themselves, and had to do the computations on a calculator and look up the cutoff values in a table, they had less power, but also less ability to shoot themselves in the foot.
@DavidFriedman
The argument was that the marshmallow study was valid for the larger group, as it replicated.
This is different from p-hacking where a ‘significant’ result is tortured out of the data, in a way that doesn’t replicate, as the p-hacking consisted of magnifying noise.
@David Friedman
Like @Aapje said, the point is that the result does replicate – on average, kids who do better at resisting eating a marshmallow will likely go on to have better adult lives. It’s just that (if I understand the results I’ve skimmed over correctly – fair warning I could be wrong) the causal interpretation of the results in the original study is wrong. But the finding they reported is generalizable (which would not be the case if it was p-hacked).
I’m not trying to say that the original paper is good, honestly I don’t have a particularly strong opinion about it. It’s just that it’s not a victim of the replication crisis, which is about non-replicable results.
@Enkidum:
My point wasn’t that that result was p-hacked, it was that p-hacking gave an example of the difference between getting the results of your experiment wrong and getting the implications of your experiment wrong.
I conjecture from what was said so far (I don’t know the actual paper in question) that the criticism was that the effect disappeared if you controlled for some third factor. If that third factor is responsible for the observed relationship, that makes it likely that the implication supposed to be derived from the experiment was wrong.
Suppose the result of a study is that shorter people have wider hips. That could be a correct description of the data, but if the reason was not controlling for sex the implication would be wrong.
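That height/hips example is easy to simulate. A small sketch with parameters invented purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10_000
sex = rng.integers(0, 2, n)  # 0 = male, 1 = female
height = np.where(sex == 0, 178, 165) + rng.normal(0, 7, n)  # women shorter on average (cm)
hips = np.where(sex == 0, 36, 40) + rng.normal(0, 3, n)      # women wider-hipped on average (cm)

print("overall corr(height, hips):", np.corrcoef(height, hips)[0, 1])        # clearly negative
print("within men:  ", np.corrcoef(height[sex == 0], hips[sex == 0])[0, 1])  # ~0
print("within women:", np.corrcoef(height[sex == 1], hips[sex == 1])[0, 1])  # ~0
```

The overall correlation is a perfectly accurate description of the sample, but the implication “being short widens your hips” vanishes once you condition on sex.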
@DavidFriedman
I think that you are using confusing/wrong terminology.
P-hacking is far from the only mistake that researchers can make. We should not call all mistakes ‘p-hacking’ when the actual issue is something else (like ignoring (potential) confounders).
@aapje
P-hacking is called that because it’s not accidental. It’s intentional massaging of stats to fit one’s own worldview onto a data set and/or to get enough results to become tenured (which, not coincidentally, is easier to achieve if you are pushing a “correct” worldview).
@Aapje:
I wasn’t calling all mistakes p-hacking–you are misreading my original comment.
I was using p-hacking as an example of a situation where the result of the experiment is reported correctly, the implication incorrectly. Failing to control for a relevant variable, which is what the experiment in question was claimed to have done, is another example.
@David Friedman – I think we’re in agreement, then. The experimental results do replicate, but the interpretation placed on said results is wrong, apparently mostly due to failure to control for a third factor.
Hmmmm… but re-reading I think you’re objecting to me having conflated findings and interpretations myself in some of my previous comments, and I think you may be right. I’m giving up on this thread, I think, but will re-visit this in a longer piece in the not-too distant future. Hopefully I’ll be more coherent.
@Clutzy – that is not even remotely what p-hacking means. If you read Gelman’s Garden of Forking Paths paper, which @albatross11 first linked, you’ll notice that he specifically discusses how he regrets having used the term precisely because it leads some people to erroneously interpret it the way you have.
I think you’re completely wrong about how a 50% rate in psychology compares to knowledge acquisition in other realms. If you were to choose “mothers who seem to have two well adjusted kids over the age of 17” and ask them to develop ideas about human behavior I think you’d get at least a 50% rate of useful psychological information. Which is great. But the fact that trained psychologists can’t do better after being vetted through journals suggests that there is a lot wrong in psychology.
EDIT: This reads a lot grumpier than it was intended. Sorry, long day.
Really?
Most of what random people on the street will give you is some kind of watered-down version of Freud, Deepak Chopra, and whatever they last read in a newspaper.
Here’s a few psychological questions whose answers I have some degree of confidence in (I am a cognitive psychologist-cum-neuroscientist), that I think the average person would be utter shit at:
1) Does growing up in the city or in the country better predict adapting well to active combat roles?
2) What proportion of native English, Czech, and Chinese speakers have some form of experience of colours when they see letters (grapheme-colour synaesthesia)?
3) Does smoking make you smarter in some ways? If so, how?
4) When we attend to multiple things “at the same time”, do we actually do this at the same time, or are we constantly switching attention back and forth between them?
5) What is attention, anyways?
6) What visual cues do we use to catch a baseball? Specifically, what information present on your retina is helpful in this regard?
7) If you want to find your keys in a messy room, are you better moving your head and eyes around a lot, or keeping them relatively still and moving slowly? What if it’s a very clean room with very little in it?
8) Schizophrenics kill themselves a lot. Why? When are they most likely to do this?
9) Parkinson’s patients who are successfully treated tend to become problem gamblers. Why?
I could go on. But the point is – those are all the actual kinds of things that psychologists study. Answering them is extremely difficult, and takes a long time. I don’t actually care what your intuitions are about any of them, or what my wife’s intuitions are about them (we have two kids, not quite 17, but close enough). Intuition in such matters isn’t quite garbage, but it might as well be.
Perhaps I’m being unfair. Obviously most of the proto-hypotheses that your ideal mother would generate would be of a very different nature from the answers to those questions. But I bet you a significant chunk of money they’d be, on average, pretty crap.
Anyways, to ask a question I asked above: what percentage of reported findings in 19th-Century physics do you think were wrong, or were so fundamentally garbled as to be effectively meaningless? I have no idea what the answer is, but I bet you it’s a hell of a lot higher than it is today. And this is a necessary feature of an early science.
I think the replication crisis is as bad as it is because it is magnified by another sin: thinking that finding “an effect” is a useful hypothesis irrespective of size.
Outside mathematics and physical laws, it’s safe to assume that nothing stochastic has zero mean. Certainly not in the social sciences. The population effect of X on Y is surely either positive or negative (outside trivialities such as Y being a logical restatement of X, or a small finite population where the numbers work out just right). So there’s your 50% generator; just choose any X and Y and you have two hypotheses about effects, and one of them will be true! You can even test it, though it will take a while: take a large enough sample and you exhaust the population (gaining certainty) or else achieve as small a p-value as makes you happy.
People in psychology, like other sciences, are aware of effect sizes. If you mean that sometimes people get more excited about results than they warrant, even if they are true, then sure, I agree.
I’ve read more than enough papers that (however “aware” the authors might be that effects have sizes) are written entirely from the perspective that H_0: theta = 0 is a scientifically interesting hypothesis to test and report. To a very tight approximation, in social sciences that’s NEVER true.
The charitable interpretation would be that no-one ever means this to be taken literally, but instead really means “theta is nontrivially/interestingly large”. In which case we might reasonably ask the community to be a bit more straightforward about how they report their results.
But more concretely, if you don’t really mean theta=0 (and understand that this is not really what you mean), you’d do a few things. Your paper would probably have some discussion of what effect size is large enough to be interesting, and would actually discuss – in the abstract or body of the paper – the effect sizes found (and relegating them to a table in the appendix, with no other mention, doesn’t count.) It would not present results as if the actual p-value found was a (let alone the) measure of how interesting the result is. If it had to use an alpha=0.05 hypothesis test, it would provide some grounding in relevant effect sizes by giving a good and motivated power analysis so that we know you had sized the test to find interesting (by a reasonable, hopefully discussed, standard of ‘interesting’) results.
Well, I’ve read many papers whose title/abstract is about claiming/denying the existence of an (unquantified) effect, and which entirely fail all the tests I propose in the previous paragraph.
—
I’m being altogether too serious. My original claim is correct but silly, and is: for pretty much all predicates X and Y within an absolutely vast, near universal, class, exactly one of:
X increases the chance of Y
X decreases the chance of Y
is true, and a large enough test could find out which with overwhelming confidence. Which only goes to show that we don’t want a “true with 50% probability” hypothesis generator; that could be entirely useless. The caveat that the hypotheses must be “good” ones (covered in Scott’s first sentence), not just possibly true ones, is absolutely essential.
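A quick sketch of the “large enough test” point: give two groups a truly negligible difference (here 0.01 standard deviations, a number invented for illustration) and the p-value still drops below any threshold you like once the sample is big enough, while the effect stays practically worthless:

```python
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(2)
tiny_effect = 0.01  # 1% of a standard deviation -- real, but of no practical interest

for n in (100, 10_000, 1_000_000):
    x = rng.normal(0.0, 1.0, n)
    y = rng.normal(tiny_effect, 1.0, n)
    p = ttest_ind(x, y).pvalue
    print(f"n per group = {n:>9,}: p = {p:.3g}")
# Only the largest sample reliably yields a tiny p-value -- and the effect size never changes.
```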
Your first 4 paragraphs sound a lot like Stephen T. Ziliak and Deirdre N. McCloskey, The Cult of Statistical Significance: How the Standard Error Costs Us Jobs, Justice, and Lives. They are economists.
So, yeah.
I’ll retract most of my previous response, you’ve noted a genuine issue that is prevalent in psych, particularly social psych. We should fix that too.
I’d say the replication crisis in psychology is important because of the political implications (and agenda) of a lot of that research.
Analogously, it is also important in medicine because health care policy is standardizing, and becoming more rules-based (over judgment-based) than ever before, and in matters of life and death; so the body of evidence for “evidence-based” reimbursement and treatment decisions better be good.
Fair. I think I overstated my point originally. It’s not that it’s not important, it’s that the bare fact that some large proportion of published findings are false is not necessarily something to worry about (in fact I think it’s an inevitable part of being a science at this stage of its development).
So long as we place the appropriate level of confidence in published findings, and don’t rush out and re-write laws on the basis of things we shouldn’t be that confident in.
Which clearly isn’t the case. I’m not sure what the solution is to that problem.
But what I am certain of is that the appropriate response is not to demand that every published finding (or 90% of them, or whatever) should be fully replicable. That’s a goal to aim towards, and maybe in 50 or 200 years we’ll be there. But we aren’t there yet, and we shouldn’t pretend we are, otherwise the field will stagnate.
This is the problem. It wouldn’t be problematic if there were a large portion of published hypotheses that were false. Findings are a different issue. Findings should be very hard to rebut. The non-replication rate of findings should be 5% or less.
0.05 isn’t a magic number, it’s just an arbitrary threshold that psychologists happened to decide on one day. Physics, as you probably know, uses the 5-sigma rule, which is orders of magnitude stricter. And there are cases where one might not want it to be as strict as 0.05. For that matter, p-values don’t mean what you seem to be saying they mean.
But I’m splitting hairs. Let’s say 0.05 is super special and somehow controls the replication rate of well-designed hypothesis tests. Many results aren’t hypothesis tests. This is a feature, not a bug, so long as we’re honest about it.
One very frequently has to report findings that are based on already having examined the data. One can slap a confidence interval on such a finding, and pretend that this somehow meaningfully reports the likelihood of the data having been produced by chance, but we all know this is nonsense (but it’s not fully nonsense – a smaller CI clearly means something better than a wider one, it’s just that we can’t meaningfully quantify how much).
I’ll note that the very good links provided by albatross11 don’t agree with you.
Why is Newton looking for Bible codes so silly? I mean, as a twenty-first-century atheist, sure. But if you accept the premise that God exists, is the creator of the world, and the Bible – a rather incoherent jumble of contradictory narratives – is His Word, why wouldn’t you look for the hidden codes revealing the Secrets of the Creator?
Another example might be Bill Gates’ idea of a robot tax to maintain tax revenues as workers are replaced by technology. He’s obviously a smart guy, but that idea should have taken all of five seconds to dismiss.
Isn’t this more or less the same as UBI?
I mean, UBI is not supposed to be funded by taxes only on high-tech economic activity; however, if we assume that in the future high-tech will dominate the economy and replace most labor, then most taxes to fund UBI will be “robot taxes”.
@Ketil
Because this will incentivize capitalists not to automate in the first place? It kind of reminds me of that guy I once heard of who wanted a tax on high-speed trading both to curb it and to provide revenue for free college.
That’s what I thought five seconds after hearing it, but maybe you had a different five seconds.
@vV_Vv
The rules behind the taxes matter. UBI being funded by some kind of new tax that directly applies to automation is quite different in its effects from UBI being funded by the more general tax base, even if the overall theoretical amount of money involved is the same.
I agree.
Maybe Bill Gates is precisely thinking of disincentivizing automation to some extent to trade it off with jobs? Sounds kinda strange since he’s the capitalist who arguably contributed more than anyone else to automation in the white-collar job sector, but it’s not impossible I guess.
Maybe Bill Gates is precisely thinking of disincentivizing automation to some extent to trade it off with jobs?
Too lazy to look it up, but I’m pretty sure he thought of it as a way to secure the same amount of tax income in a heavily automated society with a lot of leisure time or unemployment.
Well, yes.
And also because of the impossibility of managing such a tax. I mean, unless you are six years old and imagine anthropomorphic robots complete with commutes and mortgages and lunch breaks, you need to define what, exactly, constitutes a robot. Google’s automated ad systems, for instance, how many robots are those, and how much tax should Google pay? Or just the 3D printer my local IT department bought? Surely a robot. Should it be taxed as one average industry worker? More? Less? Automated stock trading systems? Airline ticket booking systems?
The best thing about the proposal is that it will likely keep the less gifted fraction of our politicians busy for quite a while.
Doesn’t sound that impossible. My 30-second solution is to define “automation capital” as all capital other than real estate, and tax companies depending on the value of their automation capital divided by their labor expenses. Or you could just estimate labor productivity for each company according to a standardized method and tax that. I’m confident that Bill Gates and his hired economists, thinking about this for a couple of years, could work out the details.
I’m not saying this is a sensible proposal, since it rewards rents on real estate and disincentivizes labor productivity and technological progress, but from the accounting point of view it doesn’t necessarily seem to be any more complicated than the existing taxation mechanisms.
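For what it’s worth, the accounting in that 30-second version is easy enough to sketch. A toy Python illustration of my reading of it (the function, the rates, and the example figures are all hypothetical – not anything Gates or the comment above actually specified):

def automation_tax(total_capital, real_estate, labor_expenses,
                   base_rate=0.01, max_rate=0.05):
    """Toy 'robot tax': rate scales with automation capital per unit of labor expense."""
    automation_capital = total_capital - real_estate        # "all capital other than real estate"
    intensity = automation_capital / labor_expenses         # capital-to-labor ratio
    effective_rate = min(base_rate * intensity, max_rate)   # heavier automation, higher rate (capped)
    return effective_rate * automation_capital

# Two firms with the same capital stock; the automation-heavy one pays more.
print(automation_tax(total_capital=1_000, real_estate=100, labor_expenses=50))   # ~45.0
print(automation_tax(total_capital=1_000, real_estate=100, labor_expenses=900))  # ~9.0

Which also makes the caveat above visible: the bill falls as labor expenses rise, so the scheme is explicitly a reward for staying labor-intensive.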
If you imagine that any message from God embedded in the Bible would have to be incredibly hard to decode, or it would have been decoded already, then probably you should only tackle that problem if you’re a one-of-a-kind genius… like Newton.
I don’t think Newton’s Bible code rests on an error theory of the truth found in scripture, as the claim that the Bible is incoherent is itself tied very much to the tendencies and biases of atheism (not saying it’s wrong, but it’s not an obvious starting point, as most of Christian theology never goes down that rabbit trail, and the strands that do usually don’t assume contradictions).
Newton, IIRC, derived many theological claims from reading the text rather than from numeric codes – although definitely many unorthodox ones, given he was a non-Trinitarian. Bible codes, however, are driven more by associative pattern-matching to predict the future, and Newton himself was very focused on the Book of Revelation.
If any of those geniuses hadn’t done something inexcusably wrong, it would be necessary to invent something inexcusable to attribute to them. For Einstein, it’s nuclear weapons.
I think this is an innate defense mechanism against Gods.
Ah, interesting! It reminds me of several things.
Someone (you?) pointed out that it is usually useful to separate idea generation from idea evaluation. Like, your brain produces ideas. You can avoid blurting out the stupid ones. You can sort of help by absorbing more interesting ideas beforehand. But you can only really get “more ideas” or “fewer ideas”; there’s no knob in your brain that gives you “better ideas”. So the right strategy is to come up with lots of ideas and look for the good ones. It sounds like it’s the same for geniuses.
It also sounds like we don’t really recognise this. People sort of default to thinking that someone who came up with genius ideas is a good thinker in other ways. And often they are. People are TERRIBLE at evaluating ideas, so we really, really want to respect someone and just accept most of their ideas. Even though you can’t really do that.
I almost want better words for “someone who isn’t trustworthy but comes up with great ideas sometimes” and “someone who is usually right”, to emphasise that it’s common for geniuses to be weird and flaky, but I don’t know what I’d say.
I like contrasting Pinker, who is usually reliable, and Taleb, who is erratic but interesting.
I even like how they have different exercises: Pinker is a superb endurance athlete and Taleb is massively into weightlifting.
I have both solidly in the interesting/unreliable quadrant.
I think way too many people evaluate writers or thinkers on their political, tribal, or moral characteristics. I’m much more interested in whether they give me new insights, help me learn things, expose me to useful ways to think about stuff I didn’t consider before. I find (say) Matt Taibbi and Radley Balko and Glenn Greenwald to be interesting people to listen to, not because I think they’re uniformly good people or are on my side in some tribal/CW sense, but because I think they often have something interesting to say that teaches me something, or makes me think.
The other almost-uniform error people make is to evaluate current speakers based on currently-available facts, rather than going back to their comments a few years later and seeing how they held up.
My current example of someone I learn things from who goes to some trouble to make it clear he is not part of my tribe is James Scott.
My current experience in research is that the mythos of the genius is holding contemporary science back. This does not mean Scott’s argument is invalid, but it ignores some social costs of glorifying Einstein, Newton, Crick, Mendel, … To me it feels that current science has too many wannabe geniuses, everyone having a pet theory they pursue in hopes of becoming the next Einstein, and few people bothering to do the actual, less glamorous work. In the Newton example, this would be Tycho Brahe and Kepler making all the measurements of planetary motion. I tend to believe those measurements were a scarcer resource than Newton-class idea generation. Likewise, ideas similar to Einstein’s were pursued by others, but there was only one Michelson–Morley experiment.
In my field (bioinformatics, mainly for molecular biology), everybody tries to uncover some “new mechanism”, but few people actually care about making measurements reliable; sample sizes are ridiculously low (N = 3 is the standard) and there is a lot of handwaving over details. At the same time the big names (and egos) and their labs continuously fight each other instead of cooperating. You may notice that some of the most important advances in the field are methods – Western blot, PCR, RNA-seq, single-cell sequencing, … Without those, the great ideas of others could never have been tested or even conceived.
I believe more humility and thoroughness and less ambition for Nobel prizes, next Newtons and the like would serve my field well.
I would go as far as to say that most contemporary science is limited by data and methods, not by ideas – just look at current physics: so many clever ideas (supersymmetry, loop quantum gravity, strings, …), yet no way to actually put them to the test.
I have friends who are ecologists, and it is the same there.
Collecting data and running an experiment that is even slightly more complex than the simple predator-prey model takes years. Every new variable you introduce means more measurements need to be done, so there is a very hard limit on how much complexity you can handle in the lab (or observe in the field). Meanwhile, people with computers come up with new ideas and new mathematical models all the time, and publish regularly. And because of the publish-or-perish mentality, they keep churning out models but don’t go around collecting data.
Going to the field, designing a good experiment, and obtaining useful data is very hard work. Many people fail and don’t get useful data. Mathematical models are easy. They do require a lot of thinking, but there is less likelihood of failing because of some random accident that destroys everything (equipment failing, your colony overheating, gear being destroyed by animals, etc.).
While there may be fields where theory is what holds back everything else, actually useful data seems to be the bottleneck.
@Blueberry pie
+1
Nobel prizes seem largely vestigial anyway, premised on that idea of a single genius. Perhaps they should be more like the Oscars, with categories, including technical ones.
I think that funding should be way less dependent on originality and such, but way more on actually proving something. In my ideal world, far fewer hypotheses are investigated, but those that are, are investigated far more thoroughly.
I agree with this 100%:
And also with this:
But I think I disagree with this:
The reason I say “I think” I disagree is that I’m not sure precisely what the “something” you have in mind in “proving something” is. It feels like you mean a useful simplifying theory of some kind (I don’t mean something as ground-breaking as relativity or whatever, just a run-of-the-mill theory that provides a useful explanatory mechanism). In which case, I really do disagree. It’s not that I’m opposed to such theories (though, like you, I think there should be less emphasis on generating them). It’s just that I think that for most of us mere mortals in the sciences, we’ll be lucky to come up with one of those that happens to be true in our lifetimes. Maybe five or six, if we’re insanely lucky. So if we predicate funding on that, no one will ever get funding (to a first approximation).
Rather, funding should be targeted towards “proving” things in the far more mundane sense of “making reliable observations that can be replicated”. “This thing happens under these conditions” is enough for me, and should be the basis of most science, at least in relatively young fields like all of the cognitive sciences (where I make my home).
It seems well-compensated by article after article in popular scientific publications denouncing the individualism of the genius, made worse by the fact that the geniuses we revere are too white and male. These articles are particularly wont to appear around Nobel award time.
There is need for both, but TANSTAAFL; no shortcuts exist. Wannabes are the price you pay for an environment where geniuses can emerge. You pay for decentralization with inefficiency, and for centralization with fragility and rigidity.
This doesn’t seem very kind or necessary.
IMHO it’s necessary to avoid a comment thread about it.*
* For a somewhat generous value of “necessary,” which is itself necessary to allow most of the comments in the thread to qualify.
I find it odd to argue that a belief in early-twentieth-century socialism was reasonable. Modern socialism is a very different beast from the statist, nationalising, centralising ideology that Einstein knew. For a start, it’s generally the political ideology most influenced by postmodernism, for good or bad. Whilst I regard modern socialism as equally stupid as Einstein’s own ideology, at least it can argue it has evolved away from the state-as-sole-provider model which has consistently failed. Whether Scott intended this nuance to apply is of course unknown to me, but in a reasonable light this seems a valid criticism.
The argument is horrible. The rejection-because-of-one-stupid-thing is a strawman for some, and the least of their epistemological problems for others.
For the former, it’s generally the case that the rejected thinker has either too low a signal-to-noise ratio for a particular person and their interests (i.e. there is a long line of people who are just as relevant and personally interesting and right more often), or the rejection is actually based not on saying one stupid thing, but on what it reveals about the underlying processes: since you mention Pinker, I can add the anecdata that I read one of his books (forgot which) for which I was fortunate enough to know the primary literature well. I found about half of his citations misleading, omitting crucial details, and his interpretations widely outside the scope of the original work. Not accidentally, but in order to serve an argument he wanted to make. Consequently, I don’t trust him at all. (I realize other authors do this too, and remain cautious, but for Pinker I know.) If he ever comes up with anything “as game-changing and elegant as relativity”, I’m perfectly happy to learn this idea from secondary and tertiary sources, like I do for the vast majority of ideas I learn about. Finally, there’s a bit of Pascal’s wager going on, in that the rewards in actual cases are more likely “right idea about something which is either irrelevant to your life and the life of your children or over which you have exactly zero influence”. Which, sure, is nice to have. But if the black box only talked about the Greco-Persian wars and was right 1 out of 100, I’d happily take the textbook version, i.e. the one where experts in the field have determined which statements are correct and which of the correct ones are relevant. That might put me 20 years behind the curve, but so what.
I’m not sure at this point there’s a single one of Pinker’s main theses that I trust. He changed my life, for certain, and I was a huge fan for a long time. But then I learned a little more about some of the things he writes about, and had a similar experience to you.
It seems highly premature to consign Pinker’s views on AI risks to the dustbin of history, not least because they are basically right. No theory explains why an AI should want to take over the world, or should want anything: wanting is a meat thing. Perverse instantiation tries to meet this objection, but no version of it I have ever seen is remotely credible. If you ask an AI to run a paperclip factory you are going to have to tell it your mission statement, whether that is to make the world a happier and better place or to maximise shareholder returns at all costs. Either way, turning the world into paperclips patently doesn’t cut it, because where are the customers going to be, and where will the money come from to pay for the paperclips?
Even more boringly, AI risk cannot arise because of timing issues. There is no shortage of evidence that bad hombres tend to exploit the capabilities of computers to do harm as soon as they possibly can. A pure rogue AI is therefore doomed to fail because, if AIs are capable of doing that much damage, it will already have been done by a combination of AIs and bad hombres.
The criticism of Pinker is not that he has the wrong position on AI risk; it is that he cited Stuart Russell as a skeptic on AI risk when Russell is, in fact, a major proponent (he has written papers about how it is an existential risk, given talks about how it is an existential risk, and been on several podcasts stating that it is an existential risk), and when criticized, he doubled down. This is bad behavior regardless of who is correct on the AI issue.
Unrelated, are you interested in your position on AI risk being challenged? (I get the impression that the arguments you’re claiming are unconvincing are not the ones which most people in this field would make.)
Newton in particular was an odd case. He saw his work on physics as the least significant part of his effort to unravel the divine plan of the universe; much more significant to him were things like alchemy and deciphering the Bible. He worked tremendously hard to find things like the philosopher’s stone and the elixir of life, the sacred geometry of biblical places like the Temple of Solomon, predictions of the end of the world, and the answers to all manner of esoteric and mystic questions – all of which (along with his physics) he saw as part of the same endeavour.
There’s a quote by John Maynard Keynes, who made a study of Newton, which described the unity and inter-related nature of Newton’s thinking, “Because he looked on the whole universe and all that is in it as a riddle, as a secret which could be read by applying pure thought to certain evidence, certain mystic clues which God had laid about the world to allow a sort of philosopher’s treasure hunt to the esoteric brotherhood. He believed that these clues were to be found partly in the evidence of the heavens and in the constitution of elements (and that is what gives the false suggestion of his being an experimental natural philosopher), but also partly in certain papers and traditions handed down by the brethren in an unbroken chain back to the original cryptic revelation in Babylonia. He regarded the universe as a cryptogram set by the Almighty – just as he himself wrapt the discovery of the calculus in a cryptogram when he communicated with Leibniz. By pure thought, by concentration of mind, the riddle, he believed, would be revealed to the initiate.”
And he very famously described him thus:
“In the eighteenth century and since, Newton came to be thought of as the first and greatest of the modern age of scientists, a rationalist, one who taught us to think on the lines of cold and untinctured reason.
I do not see him in this light. I do not think that any one who has pored over the contents of that box which he packed up when he finally left Cambridge in 1696 and which, though partly dispersed, have come down to us, can see him like that. Newton was not the first of the age of reason. He was the last of the magicians, the last of the Babylonians and Sumerians, the last great mind which looked out on the visible and intellectual world with the same eyes as those who began to build our intellectual inheritance rather less than 10,000 years ago. Isaac Newton, a posthumous child born with no father on Christmas Day, 1642, was the last wonderchild to whom the Magi could do sincere and appropriate homage.”
Keynes said: “Isaac Newton, a posthumous child born with no father on Christmas Day, 1642, was the last wonderchild to whom the Magi could do sincere and appropriate homage.”
How much of Newton’s worldview (e.g., his Arianism) had to do with his being born on Christmas (along with his enormous ego)? I kind of get the impression that Newton thought that he and that other Guy born on Christmas were worthy rivals to sit at the right hand of the Father.
What did you mean by “which shocked 17th century sensibilities the same way trying to link consciousness and matter would today.” Are people still arguing that consciousness is some special thing with no connection to matter (the brain)? If so, explain what happens when you get drunk. Does beer have the ability to interact with your metaphysical consciousness?
I’m pretty sure Scott meant something like ‘link consciousness to an indivisible physical substance (as opposed to being an emergent property of a system)’. Obviously I’m inferring this from Scott’s other writings and not this short snippet.
Linus Pauling is variously quoted as having said something like this himself: “If you want to have good ideas you must have many ideas. Most of them will be wrong, and what you have to learn is which ones to throw away” or just “the best way to have a good idea is to have a lot of ideas.”
Even if he himself fell victim to a weird vitamin rabbit hole, I think he was basically right.
Related to this, and perhaps in an effort to square this post with this one and this one about Scott Adams’s emphasis on procedure over ideas: I find that the best way to have good ideas relevant to my field is, boring as it sounds, to just spend a lot of dedicated time on research and writing, with some mixture of attempting to focus and yet also being willing to follow strange leads, so long as it’s done in a focused way. Maybe this sounds incredibly prosaic, but what I guess I’m groping at is that “ideas” are, for me, the product of working on problems I’m interested in, more than the reverse (that a sudden flash of insight gives you an idea and then you go work to see if it’s true; as this post implies, even knowing that something might be worth trying to prove is often half the battle).
My advice is that when you want to get some good thinking done, put your hands on the keyboard and start typing. The act of writing will often force you into coming up with new ideas to make what you’ve put down on the screen more plausible.
Some of that, I suspect, is due to my being old and my short term memory being shot. I used to do my best thinking shooting baskets in the driveway, but now I need to have my computer in front of me to store any ideas I come up with.
I’m a big fan of jotting down fragments of sentences of random ideas that occur to me that I can later try to put into order. The ability of smartphones to take dictation these days makes it easier to walk and talk than in the past.
Another technique is to keep a day journal of ideas. Oddly, after reading the journals later, I find I often come up with the same idea multiple times over the years.
I’d classify Pinker as an example of the well-rounded wiseman more than a one great idea genius. He has a fine track record of having intelligent and sensible things to say on many subjects. I’m not sure that he will be associated with any single breakthrough in particular, the way that, say, William D. Hamilton can be summed up with The Selfish Gene and The Red Queen. It’s more that Pinker is an extremely fine representative, perhaps the all-around best generalist, of the Darwinian tendency in contemporary intellectual thought.
All I have read of Pinker (except possibly a random article or two) was Better Angels, but that was so dumb I couldn’t finish it. He alluded to others’ research–which sounded interesting–and filled in the gaps with a hopeless mess of silliness:
People in the past entertained themselves with horribly violent fairy-tale narratives (and people today entertain themselves with far more violent and vivid TV/movie/game narratives, which he acknowledges only to immediately brush aside). The French Revolution went bad while the American didn’t because the French listened to the wrong philosophers (and not because they were two totally different kinds of conflict with similar names). The long peace couldn’t have been caused by nukes because we had poison gas already, and that didn’t stop war (ballistic missiles, invented at the crucial time, go unmentioned). People gave names like “atomic fireball” and “bikini” to innocuous objects (… and?). Today we find the idea of retaliating to insults with violence repulsive and barbaric (don’t know who exactly “we” is here; I’ve met plenty of people who would totally rough you up for dissing them). Etc. He even drags the Jesus Myth story into it in an aside. Possibly I should have simply skipped ahead to the chapters on neurology, but the claptrap burned through all my patience.
Based on that admittedly limited sample, I would classify Pinker as an entertaining writer and generally intelligent person who is highly vulnerable to Dunning-Kruger.
theredsheep,
So, you agree that Pinker made a good case that violence has dropped, you just disagree with his explanations?
If I recall, his major explanations for this breakthrough in human welfare were:
1) Literacy and the ability to experience other people’s viewpoints.
2) Interdependency and the extended networks of market economies and massive telecommunications.
3) The state’s monopoly on coercion, aimed at using coercion only to suppress coercion; this changes the payoff of exploitation and violence and replaces amplifying feedback loops of revenge and status displays.
4) Intelligence, the Flynn effect, and their correlation with planning for the long term and repressing impulsiveness.
5) Enlightenment philosophy and its emphasis on rational utilitarianism.
This seems like a pretty reasonable, if incomplete list to me (I would add a lot more on formal and informal institutions). Which do you disagree with?
So, you agree that Pinker made a good case that violence has dropped, you just disagree with his explanations?
It is (and already was when TBAOON was written) pretty widely known that the amount of intra-societal violence is lower now than in, say, the Middle Ages. In other words, simply making a case that violence has dropped is uninteresting; the whole value of Pinker’s book is in the explanation it proposes, and if that explanation is bad, so is his book.
X,
Actually, the idea that violence has dropped is quite controversial among certain ideologies. I’ve seen Sociologists and anthropologists of the “Progressive” persuasion get borderline apoplectic on the issue.
But, I listed out my notes on his primary explanations (which I admittedly recorded when reading the book several years ago). Which, if any, do you disagree with?
I largely agreed before reading the book that violence probably has declined, though I lack the expertise to say for certain. I read it to find out why he thinks that is, but found his explanations so persistently terrible that I couldn’t finish it. As for the real answer, I suspect technological and economic growth leading to the centralization and growth of the state has a lot to do with it, but at this point I’m just a guy who looked at a bunch of bad arguments and said, “I don’t know the answer, but this ain’t it.”
This is a bit of a bait and switch, Scott. There is a big difference between a box that generates random scientific hypotheses and one that generates hypotheses for a specific question.
>He must have amazing hypersensitive pattern-matching going on. But people with such hypersensitivity should be most likely to see patterns where they don’t exist. Hence, Bible codes.
New cause area: teaching conspiracy theorists advanced science.
It’s interesting to me that we train scientists to pattern-break, but not pattern-match. The Wason Selection Test obviously has one correct answer. But real life doesn’t put a limit on the number of guesses you get. Perhaps we should be more forgiving of pattern-matching behavior so long as it ends in pattern-breaking. And perhaps we should be better at teaching pattern-matching skills.
Arthur D. Little consultants emphasize pattern breaking / pattern matching:
http://digitalproblemsolving.com/node/13
A Fermi estimate on the value of the box:
If we get a Theory of Relativity about once a decade, and there are a million scientists in the world, then it currently takes about 10,000,000 scientist-years to produce one Theory of Relativity.
If the box is 1-to-1,000 odds, and it takes 10 scientists a year to test each idea, that means it produces one Theory of Relativity every 10,000 scientist-years.
That means the box is about 1,000 times as productive per scientist-year as the status quo – put the whole scientific workforce on testing its output and it produces roughly a millennium of scientific progress every year.
If we get more pessimistic, and give it 1-in-10,000 odds, and make it take a hundred scientists to test a theory in a year, that’s still 10x what our current system produces.
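The arithmetic above, spelled out as a quick sanity check (same assumed numbers as the comment; Python is just for bookkeeping):

scientists = 1_000_000
years_per_breakthrough_today = 10
cost_today = scientists * years_per_breakthrough_today   # ~10,000,000 scientist-years per "relativity"

def box_cost(hit_rate, scientist_years_per_test):
    """Scientist-years the box needs per breakthrough, given its hit rate."""
    return scientist_years_per_test / hit_rate

print(cost_today / box_cost(1 / 1_000, 10))      # ~1,000x current output per scientist-year
print(cost_today / box_cost(1 / 10_000, 100))    # ~10x, the pessimistic case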
What’s the probability the box proposes an elegant, wrong, and difficult to test idea that causes a large fraction of the world’s best theoretical physicists to spend several decades working on it?
Pretty much 1, because that happens all the time, in every field. How many man-hours were “wasted” thinking about phlogiston? Nothing you can do about it (beyond whatever changes you can make to the prior likelihood of being right).
A key feature of using the box would be saving many ideas “for later”, like when it spews out theories about unifying the four forces, black holes, dark energy, etc. We already have hundreds of hypotheses about such things, and they are all mostly untestable.
Have the majority of scientists work on easier to test theories. If an idea is difficult to test, rank it lower on your priorities. Maybe have a small percentage of scientists carefully vetting such ideas, just in case it’s only the difficult ones that are right (but that feels like a monkey’s paw sort of twist)
You wouldn’t be talking about string theory, would you?
What’s Pinker’s world-changing revelation – or even one that changes the world for linguists, and not just some very narrow linguistics specialists? His schtick is repeating the basics of fields, and he gets the basics of fields wrong a LOT. Murray Gell-Mann came up with world-changing revelations, and one of the lesser of these was the idea of Gell-Mann amnesia, which applies to Pinker and to the New York Times about equally well.
It’s a special case of the general argument for free speech. Given that nobody has a hotline to reality and therefore absolute certainty about how things are, it behooves us to keep the field open for people to say anything (that’s not slander or incitement). That means you’ll hear some rubbish, but because truth is difficult to find, it’s worth the wastage for the occasional nugget.
This used to be a clear argument for the Left, but many in that camp seem to have forgotten it these days. Maybe it only applied when they were the political underdogs 🙂
There’s a critical difference between the filtration method for discovering hypotheses and the filtration method for deciding between them. One expects most hypotheses to be false, and the fact that a hypothesis was created at all is a massive update towards it being true; but when choosing between contested hypotheses we have to be able to distinguish between arguments that are likely to be systematically biased, and the magnitude of the largest deviations from reason or accuracy is our main evidence regarding the typical size of such deviations.
I think you may have just made an argument for why we can’t have geniuses contribute anymore. With the internet outrage machine in full swing, who knows how many brilliant, world-changing ideas have been “ruled out” because they came from folks with other really weird ideas, which made them unacceptable. This is again an argument for why tolerance of ideas is super-important.
This shows why what we call intelligence can’t merely be pattern recognition. Newton is considered a genius because he found so many patterns that made more accurate predictions than other patterns people had constructed in the relevant fields, but if he had only found useless patterns he would not be considered a genius.
If this is not because of some ineffable correctness force that differentiates an Einstein from a David Icke, then it could be because the brains of geniuses must by natural instinct be constantly checking for physically real errors in what their patterns predict, far more furiously, microscopically, and ceaselessly than the brains of non-geniuses, without their similarly planet-sized egos getting in the way of this “practical intelligence”. It could also be that the very framework of the scientific method and the sceptical society around them do most of the systematic error-checking, and that without this system there’s not as much as we’d like to think separating Einstein from the wild-eyed bus-stop conspiracy theorist with his incredibly complex, logically consistent conspiracy theories that are entirely and utterly wrong and insane.
I can’t speak for Newton, but this is not what Einstein did. Instead of looking for tiny errors, he derived things from first principles.
So the question is that had Einstein been told “alternative facts” that were reality non-compliant, and on this basis grounded first principles for his theory, what would have conceptually separated him from bus stop man who has a perfect genius explanation for everything given you believe certain things about the FBI, aliens, and the Pope from the get go?
Was he purely relying on the history of science to hand him facts for him to draw a pattern with? Are geniuses just playing join the dots or are they doing something else?
I’m no expert, but I’m under the impression that what Einstein did was realise that absolute space and time were incompatible with holding on to both the principle of relativity (the idea that the laws of physics don’t care how fast you’re going) and the constant speed of light that falls out of Maxwell’s equations.
And, given the result of the MM experiment, he decided to keep experiments being the same on trains and ditch absolute space and time.
It’s helpful if you can get a glimpse into the black box. In my experience, when you come across a weird and interesting idea generator, he or she typically falls into one of three categories.
1) Stopped clock. This person approaches all ideas with the same viewpoint – every problem is a result of [excessive government regulation|the malign influence of the rich|imperialism|malinvestment] and therefore can be predicted and cured on that basis. This person is sometimes correct when the crowd is wrong, and can help break preconceptions in the same way a tarot spread might.
2) Not even wrong. This person is so monumentally off that their predictions and explanations are sometimes interesting, but not predictably or normally very interesting except as entertainment or again, as an exercise to help break habits and preconception. (Like asking a 5 year old about monetary policy, for example).
3) Interesting and original thinker. This person is pretty much as Scott describes above.
I like this. The problem Scott seems to ignore (as several people have pointed out here) is that the potential idea space is very large. The difficulty isn’t in finding new ideas; it’s with determining which ideas are good.
It’s both. Someone actually has to propose the good idea for it to be evaluated; if the number of bad ideas is high enough, simply testing ideas just means you end up in a cycle of spending resources eliminating bad ideas without ever finding a good one.
I mostly agree with you, but I suspect that those people might say that they haven’t forgotten to take the social engineer hat off; they think it’s the correct hat for the situation. There are (they might argue) some areas of human thought where the wrong opinions are so common and so harmful that promoting the right opinions is the right thing to do even at the cost of limiting intellectual exploration.
Ok, but what if 50% of the time, once the box is open, we learn that Einstein is dead, and 50% he’s alive?
Even with that uncertainty, if we assume that Einstein’s cat opens the box, we can predict that the cat will be indifferent 100% of the time.
The cat isn’t indifferent, if Einstein is dead the cat gets lunch.
I think the value of the machine would depend critically on (1) how easy it is to test an idea and deliver a clear assessment, (2) how compelling the ideas are in the absence of definitive tests one way or another.
For example, Marx and Freud were two of the most influential thinkers for 100+ years in their respective disciplines, and arguably their ideas wrought huge amounts of damage. (*Glances around to see if MarxBro is listening*). Elegant and persuasive wrong theories can lead to a lot of harm if they can’t be conclusively tested one way or another.
Completely agree. Ideas don’t just hang in the ether, having no effect until they are definitively tested. They can have substantial real world effects. I would guess that Freud delayed the progress of psychiatry/psychology by a decade or two. Marx provided a lot of rationalizations for bad behavior.
Meta-comment – I’m personally enjoying these shorter posts. Although I understand much of your audience may like the long essays, I’d rather those be the exception than the norm, given the same total word-count output per time period.
I think the basic problem with this thesis is assuming that readers can accurately reject incorrect ideas. Not only is it true that lots of people who make great intellectual contributions often entertain crazy ideas, it is *also* the case that readers attracted to original thinkers do not exercise appropriate skepticism of crazy ideas.
I also wanted to respond specifically to this example:
One issue is that, since most of us are necessarily non-expert in most areas, it’s easy to be taken in by an interesting and smart-sounding thinker. Thus when we encounter people we trust making clearly wrong arguments in areas we have some expertise in, it’s quite reasonable to wonder how reliable they are in other areas. (cf. Gell-Mann Amnesia).
(As an aside, I have a policy of exercising heightened skepticism towards people who sound too clever, particularly if they weigh in on lots of topics. They’re fun to listen to, but on priors they’re more likely to be sophists.)
My general rule is that I track writers’ originality and reliability, and exercise appropriate skepticism based on this assessment. For example, I rate Robin Hanson very high on originality, but quite low on reliability.
Einstein certainly had more flaws than just being a socialist. He absolutely despised quantum mechanics and spent much of the latter half of his career trying to disprove it. He didn’t like the idea that reality is not deterministic, in spite of the mounting evidence that at a fundamental level it was all probability waves. Imagine what advances he could have made had he turned his mind to exploring the quantum world instead of rejecting it.
I’m still with Einstein on this one.
If I understand your post correctly, it implies that either a) the Many Worlds interpretation has a mountain of evidence piling up against it or b) the universe is not deterministic if many worlds is true. Is that intended? If so, which of the two do you think is true?
OK, so which games did relativity change?
It changed the bounds in which future scientists would propose and evaluate ideas, and resulted in lots of “whoa!” moments, but if that’s all there is to it, then “game” is right. There’s a class of people who propose and sometimes test ideas, these being evaluated as “whoa!” or “haha!” depending on how they correlate with preexisting ideas, such that each new idea changes the rules, and some people win status and some people lose it, and whee, why should any of us care?
Or, maybe there’s some practical application that makes this valuable outside of academia. Relativity lets us calculate orbits with greater precision, whether of planets, satellites, or space probes. But so would the empirical fudge factors we’d have introduced when our Newtonian predictions kept coming in a bit off. The bit where we supposedly couldn’t have GPS without relativity is just plain wrong – or more precisely, it’s backwards, because if we hadn’t invented relativity first, the table of fudge factors that made GPS work anyway would have been a cheat sheet for theoretical physicists looking for a new theory. Same with the precision-timing aspects of relativity.
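For scale, here’s a back-of-the-envelope version of the correction that fudge table would have to encode, using standard textbook constants (my own illustration, not something from the comment above):

import math

GM = 3.986004e14        # Earth's gravitational parameter, m^3/s^2
c = 2.99792458e8        # speed of light, m/s
R_EARTH = 6.371e6       # mean Earth radius, m
R_SAT = 2.6571e7        # GPS orbital radius, m
DAY = 86_400.0          # seconds per day

v = math.sqrt(GM / R_SAT)                                   # ~3.9 km/s orbital speed
sr_shift = -(v ** 2) / (2 * c ** 2) * DAY                   # special relativity: satellite clock runs slow
gr_shift = (GM / c ** 2) * (1 / R_EARTH - 1 / R_SAT) * DAY  # general relativity: satellite clock runs fast

print(sr_shift * 1e6, gr_shift * 1e6, (sr_shift + gr_shift) * 1e6)
# roughly -7, +46, +38 microseconds per day; left uncorrected, that's kilometres of
# ranging drift per day -- exactly the kind of systematic offset an empirical fudge
# table would have captured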
Scientific ideas are of little value unless they can be cheaply tested – because if testing is expensive, practical application will likely be prohibitive. The real value is in advancing the state of the art in cheap testing, and that’s more about careful execution than clever ideas. Ideas, we’ve got more of than we know what to do with, and the cutting-edge ideas are usually beyond the reach of cheap testing, so I’m leery of the claim that an idea machine would be the most valuable thing ever.
Late Edit: Unless the clever idea is on how to test/validate/replicate cheaply. Those ideas are valuable.
Who are the experimentalists who poked holes in Newtonian mechanics before Einstein plugged them with relativity? I would think they are also people with whom to compare Einstein.
You make several points here. The one that Graham wrote about is right on target: geniuses are people who are willing to take intellectual risks, so you wouldn’t expect them to get everything right. (On the other hand, I would also hope that a genius is somebody who has a good nose for what makes sense and what doesn’t, so that’s a countervailing notion here. But Newton never published his crackpot theories, because he didn’t have confidence in them. I think that’s right on target.)
But your “black box” gets a little troublesome when it gets down into the 1% range. It’s long been my feeling that the major problem in this arena is knowing which ideas are good. “Just test them!” you say. Yes, well, if we’re talking about relativity, fine. But what if your idea is “the public needs an entirely new type of computer, with a friendly graphical user interface”? “Testing” that hypothesis takes hundreds of thousands to millions of dollars. Jobs did it, because he had access to those resources, after profiting off of other good ideas. Good for him, but my point is that in most situations the problem is determining which ideas are good and which are not, and new ideas are far more likely to be bad than good. So sure, if somebody already has a proven record, we should take his other ideas seriously (I’ve always wondered if Tesla’s energy-broadcasting system would actually work, for instance), and be willing to try them out.
My point here is that coming up with “original” ideas is easier than you’d think. But the space of ideas you haven’t considered is very large; the space of ideas you haven’t considered that are worth considering is much smaller.
So A) the reason people “lose respect” for thinkers that come up with bad ideas is that we are looking for people who have good intuition as to what is right and what isn’t, and don’t (as a rule) endorse wrong things, B) original ideas are not good things in themselves, and C) your “black box” is only really useful, below some probability threshold, if its ideas are fairly easily testable—which most ideas are not.
Edit: Apparently John Schilling and Eponymous had much the same idea a couple of hours ago.
+1
“He’s a bad person” or “he’s on the other side” tells you almost nothing about whether he’s got anything interesting to say. But then, it’s not really supposed to tell you that, either.
Newton believed in a creator God and believed the Bible was inspired by that God. He was a mathematical genius. Why wouldn’t he look for Bible codes?
So, has anybody here looked into the results of Newton’s Bible Code research? What did he conclude?
I read the Wikipedia page, if that counts!
Newton predicted that the world would not end before 2060, which appears to have been correct (but not exactly surprising).
He also believed the Antichrist was actually a metaphor for the position of Pope, and would rule for 1,260 years. This figure comes from interpreting “time and times and half a time” as three and a half prophetic years of 360 days each – 1,260 days – and then reading each day as a year (I’m not sure where he came up with the Pope being the Antichrist). This would then seem to imply that the Pope, or rather the papal office, would rule until 2016. It would seem this prediction has gone bust.
I don’t think it’s a coincidence that Newton had more crazy beliefs than Einstein and also lived in an earlier time. As we accumulate more knowledge, the space of possible useful ideas shrinks. Newton was setting up the earliest useful foundations of physics. By Einstein’s time, there were a lot more ideas we could rule out. So a person having a crackpot idea today is more of a case against them than having a crackpot idea three hundred years ago.
It probably also depends on the field. Having crackpot ideas in physics is more of a strike against you than similarly unorthodox beliefs in economics.
Newton had more ideas that are clearly silly *to us*. The view from up on the shoulders of a whole enormous human pyramid of giants is surprisingly good–pity Newton had to make do with a much smaller human pyramid.
To play the contrarian, I don’t think this is necessarily true. It depends on the ideas and how easily testable they are. I can think of a few cases where we would be better off ignoring such a machine. I don’t know how applicable these are to reading controversial opinions, but it’s maybe worth keeping in mind:
What if the machine offers bad ideas that involve doing something horrible? For example, if it said, “we would achieve utopia if we killed all the star-bellied sneetches.” There’s no way to try that without killing hundreds of millions of people – sorry, sneetches. It would probably encourage a bunch of star-belly bigots. The consequentialist calculus might even come out positive, but it would result in really terrible things.
What if there are steps in the testing process that would destroy humanity? If it said, “mixing Chemical X and Chemical Y will produce the Philosopher’s Stone,” but mixing those two chemicals created a chain reaction that destroyed the world?
Or what if the machine comes up with incorrect theories that are almost correct or correct when first testing them. It could lead to humanity spending huge amounts of resources going down dead ends (this was actually a neat plot point in the (rot13) Qnex Sberfg series).
I didn’t un-rot13 that, but I’m just deciding it’s Qnex Sberfg, because that sounds awesome.
Coincidentally, I’ve recently noticed something that might be termed “source-ism.” It means dismissing a particular source of news (or other information) as 100% unreliable because it often says false/irresponsible/obnoxious things, even though the source also routinely curates news from trustworthy sources.
The best example I can think of is the Daily Mail. Most of its articles are tabloid fodder, gossip, and rumormongering, but mixed in are curated articles about, say, medical advances that were first reported in the New England Journal of Medicine.
On many occasions, I’ve posted a fact-based article like that on Facebook, and someone who doesn’t like it for some reason will comment that it is invalid because it appeared in the Daily Mail, even though the Daily Mail article is just a summary of an article from a reputable outside source (and all the referencing will be there).
Has anyone else noticed this? Sourceism?
Arguably it’s a good idea to get as close as practical to the primary source anyway, even more so if the secondary or tertiary source is going to be offputting to the intended audience. If the Daily Mail is drawing from a more credible source (or even just a source more likely to be perceived as credible), why not link to that source instead?
Wikipedia discourages use of primary sources.
And IMO, that’s a problem for Wikipedia.
While I’ve heard bad things about editors’ implementations of the policy, the actual policy doesn’t seem that bad.
The aims seem to be 1) don’t let people and entities write their own articles (due to the obvious lack of objectivity and incentives for self-promotion and self-justification), and 2) prefer secondary sources laypeople can understand to highly technical primary sources.
I’m less thrilled with 2), since it doesn’t match the way I was taught to cite and risks games of Telephone between the primary source and the intermediary. But while I’ve seen the frustrations with 1) when someone wants to correct clear untruths in their own entry, the risk of becoming an advertising and vanity site is something I can see wanting to steer far clear of.
It’s understandable that they don’t want to become too much of a fact-checking website and defer to journalists. However, it does mean that they tend to fall victim to the same biases and mistakes as journalists.
If the Daily Mail is drawing from a more credible source (or even just a source more likely to be perceived as credible), why not link to that source instead?
Laziness. As a general rule, I cite original sources, but sometimes I just don’t feel like it and will post the article from the secondary or tertiary source, like the Daily Mail.
Of course, the Daily Mail article will always mention what the original source was and will usually provide a hyperlink to it. That makes it all the less excusable to dismiss the content of the Daily Mail article since verification is only one click away.
I’m all in on laziness myself, but if one isn’t prepared to make that click oneself, I’m not sure there’s much ground for criticizing one’s readers for taking the same stance.
Honestly, I wish people were more skeptical about sources, particularly when it comes to stories that back their preconceptions. The amount of clickbait passed around with only a tenuous connection to facts could be reduced a lot if people were inclined by nature or training to always track things back even a few steps before posting. Insufficient credulity doesn’t really seem to be a big problem on the net.
But that’s my educational and professional background talking, which all points to “if your mother says she loves you, you still need a cite to a primary source if you’re putting it in writing” . (It took me over a decade to be willing to cite to Wikipedia on blog comments rather than tracking back to whatever its sources were, and I still feel a little sloppy about doing it.)
The Daily Mail put the most interesting facts in the article up front so readers will notice them. The New York Times frequently buries the most interesting bits at the end, after most of the subscribers have stopped reading.
Therefore, the Daily Mail is much less respectable.
I don’t know; the NYT always tests my patience / attention span by starting every article the way one would start a very boring novel, telling you it was a gloomy September day and the leaves were turning red on an avenue of yellow brick houses and this kind of blather, until many paragraphs later it turns into describing an unusual illness, hobby, or problem, or generally something actually newsworthy and interesting.
A lot of this looks to me like yet another application of the expert trust heuristic. If I respect someone – that is to say, I respect their opinion – all that functionally means is that whenever they claim some X, then my personal belief in P(X) is higher than if they never made the claim. However, this is all the case only if I have no independent way of evaluating X. In other words, expert opinions are heuristics I can employ to judge claims I know nothing about otherwise.
So if Pinker says something about AI that I think is harebrained because I know something about AI (or at least, about that claim), then Pinker’s expertise is rendered moot by my independent expertise. (Which might simply be someone else’s that I respect more.) I figure most people evaluate claims this way, if they haven’t taken them on faith.
There’s a little push-pull, too, when it comes to claims made by multiple sources. Suppose Pinker makes various claims I can’t verify independently, but that I have prior truth-beliefs on, because I’ve heard other people claim them. For example, Pinker claims eggs will give me cancer, and I believe that’s bunk because I heard the Food Babe say it and I believe she’s generally unwise. And he claims Norway is too reliant on oil exports, and I believe it’s probably true because I heard Megan McArdle say it and I believe she’s generally wise.
The result will be that my respect for Pinker and the egg cancer claim will pull closer – I’ll trust Pinker less, but I’ll also think egg cancer is more likely. And if I trust Pinker less than the Norway oil claim, I’ll now trust Pinker more, but also trust the Norway oil claim less, because there’s this person who claims it who I think is relatively somewhat of a clown, even though I now think he’s a bit less of a clown. Similarly, those claims will pull McArdle and the Food Babe along with them.
There’s a question of exactly how much each truth belief changes. 0.01? 0.1? -0.5? I often don’t know, and I’m probably not that good at it, but I hope it’s at least monotonic – the more I believed X, the more my respect for an expert should rise for claiming it, and vice versa. There could also be exceptions, but I think they always fall out as cases where I had independent information. And of course, respect takes the domain as a parameter – I might respect Pinker’s opinion on AI more than I would on, say, classical music.
I wonder how many other people update their belief-webs this way, whether or not they’re conscious of it.
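That push-pull is easy to caricature in a few lines. A toy sketch of the kind of coupled update described above (my own illustration; the pull factor and the starting numbers are made up, and the Pinker/egg/Norway examples are just the ones from the comment):

def observe_claim(trust, credence, pull=0.1):
    """One source asserts one claim: pull my trust in the source and my
    credence in the claim a little toward each other (monotonic in both)."""
    new_trust = trust + pull * (credence - trust)        # believable claims raise trust in the claimant
    new_credence = credence + pull * (trust - credence)  # trusted claimants raise belief in the claim
    return new_trust, new_credence

# Pinker (trust ~0.7) asserts "eggs cause cancer" (credence ~0.1):
print(observe_claim(0.7, 0.1))   # (0.64, 0.16) -- trust falls, credence inches up
# Pinker asserts the Norway-oil claim (credence ~0.8):
print(observe_claim(0.7, 0.8))   # (0.71, 0.79) -- trust rises, credence dips slightly

A real version would weight the pull by how independent the claim is from everything else the source has said, but even the toy version keeps the updates monotonic, which is the property asked for above.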
“I just worry too many people wear their social engineer hat so often that they forget how to take it off, forget that “intellectual exploration” is a different job than “promote the right opinions about things” and requires different strategies.”
This is an extremely good point.
A kinda-sorta related example of the sort of ridiculous gatekeeping by self-appointed social engineers:
A few years ago, a discussion about how VO2 Max is measured popped up in r/skeptic. The methodology for measuring this basically entails strapping you to a stationary bike or treadmill, getting a baseline, and then getting you to exert yourself while measuring the amount of oxygen you use.
A number of users there couldn’t seem to figure out what was actually being done, summarily dismissed it out of hand, and as a result, nearly anyone who attempted to explain what it was actually used for was roundly downvoted.
This was an early signifier to me that Reddit is 99.8% complete bullshit promulgated by narcissistic idiots who have no idea what they’re talking about.
I’d go further: The social engineer job, where you try to make sure everyone has the socially-optimal set of beliefs, is incompatible with being a scientist or reporter. To the extent you mix those two jobs, you become a worse scientist or reporter. And since you aren’t smart enough to know all the ways that people listening to you will use your knowledge (nobody is), the more you do social engineering while impersonating a scientist or reporter, the more likely you are to make the world a much worse place.
The world is full of people who were taught things that weren’t true, or read news stories that lied to them or omitted critical relevant facts to ensure that they got the “right” idea. Decades later, they’re still walking around with those incorrect beliefs about the world, and occasionally making decisions based on them that go disastrously wrong because they were lied to and never realized it.
Wait, is this fair?
AFAIK studying Bible Codes, Alchemy and other occultism may have been pretty common back then, at least among the educated.
I have no reason to believe that studying Bible codes was any more “open” an endeavour than dedicating yourself to trying to unify Kepler’s and Galileo’s work. If anything, Newton trying to keep his work on the occult hidden from his scientific peers tells me he viewed it as low-class, and thus as fairly common in that society. Same with alchemy. It’s possible that the part that made Newton open to weird stuff was precisely studying physics.
On the other hand we never hear any praise for Kepler for his dedication to astrology, despite this being a huge enabler of his studies. Somehow he doesn’t get the reputation benefit of “openness” Newton gets for other things that may have been seen as normal at the time.
This would suggest that organisations, perhaps even countries, that rule in would have an “evolutionary” advantage over those that don’t – the kind of place that can tolerate or forgive a bit of abnormality is more likely to end up with an Einstein, because they haven’t kicked him out, or refused to take him in, for having a messy appearance.
(Although, as far as I know the real Einstein did not show any signs of Einstein-level brilliance until after he’d left university.)
This definitely resonates.
In conversation about the merits (or demerits) of ‘geniuses’ or controversial thinkers (Elon Musk comes up a lot) I often hear that they are volatile and/or crazy as a reason to ignore and/or control them.
My response is usually along the lines of, “Exactly! XXXX is crazy and volatile and makes big waves. If EVERYONE were like that we would have chaos, but very few people are like that.”
I find that often there’s conflation of people we should VALUE and people we should LOOK UP TO as role models. And I think it’s clear that there are plenty of people of value to the species that aren’t necessarily models for behavior for others.
Doesn’t this also depend to some extent on the moral status of the genius’s wrong idea? I mean, Newton’s Bible Code stuff may be odd, but it’s not morally reprehensible. On the other hand, Heidegger.
I find it a bit odd to use a thought experiment that likens lone geniuses to literal black boxes to strengthen an idea for a more social, network-based model of knowledge generation. I agree you should totally listen to people who have crazy ideas that are only sometimes right, but those people are themselves listening to an eclectic mix of people, or taking people seriously whom others do not. Raising your tolerance for people with risky ideas isn’t capitalizing on some limited resource of thinkers; it’s a way of efficiently increasing the network of people whose ideas you are exposed to.
There are actually two separate (streams of) questions here. (1) What is the value of thinkers who are original but often wrong? (2) What should we read? How should good ideas be mainstreamed? It’s not obvious to me that both questions have answers that point in the same direction.
I think Scott’s claim that original but unreliable thinkers are high-value is a good one. But I’m not sure that for my own reading I want to read original thinkers. When I broach a new subject, I want to read textbooks – boring, basic, correct, accepted stuff in this field. Later on, if I become good at it, I might read some more advanced ideas. But in the early stages reading the far-out geniuses is unlikely to be helpful to me personally. And in most fields of knowledge, I’m a beginner, so…
Secondly, how should good ideas get mainstreamed? I guess this is about scientific and academic communities and the media. I don’t know much about how the scientific community works, so I’ll leave that aside for the moment. Now, the media… Is Scott really saying that the media should spend more time discussing wild ideas by creative people? I don’t think that’s right. The media does us more of a service by sticking quite rigidly to the line of boring accepted true stuff. Reminding us that trade is good, racism bad, violence bad, stability good for children… the world will be a better place when more people remember the basic boring stuff.
I’m not quite sure how to reconcile these two views: (1) creative people are high value; (2) the world and I get more value from listening to uncreative ideas. I guess the solution would be something like a specific class of people whose job it is to examine crazy new ideas critically and winnow out the best ones. And we have that class – academics. So while Scott is right as far as he goes, I’m not sure this idea has revolutionary implications. We’re basically doing things the right way.
It’s not just scientists and business types. The first example I thought of as I read this was Capt. Jeffrey C. Metzel, head of the U.S. Navy’s Readiness Section from 1942-45, who was widely known as “Thought-a-Minute Metzel.” He played a vital role as an innovation engine for the U.S. during WWII, coming up with a huge array of tactical and strategic innovations. Admiral King, the Commander in Chief, United States Fleet, and Chief of Naval Operations at the time, liked to say that “Jeff Metzel had a hundred new ideas every day, ninety-nine of which were no good. But there wasn’t another soul in the Navy Department who had a really good new idea every day – all you had to do was winnow out the good one.”
One Metzel brainstorm led directly to the modern Navy SEALs. Draper Kauffman, himself a famously “out of the box” thinker and innovator, is regarded by the SEALs as “the grandpappy bullfrog,” the creator of the Naval Combat Demolition Units (NCDUs) and Underwater Demolition Teams (UDTs) that eventually evolved into the SEALs. But it was Capt. Metzel who realized that the obstacles Hitler was building on European beaches would have to be blown up by hand-placed charges before any invasion could take place. So it was Metzel who sold Admiral King on the idea, and Metzel who pulled Kauffman out of bomb disposal and gave him the assignment to figure out how to scout and clear enemy beaches under intense enemy fire, invent any necessary technology, and train the teams to do it.
Only 13 months later, the NCDUs were in the first waves at Normandy, blowing paths through the obstacles for the landing craft. Without them, the invasion would almost certainly have failed. Without enough lead time, they would not have been trained, equipped, and ready. Metzel not only saw what was going to be necessary, he envisioned a solution far enough in advance and he found the right person to make it happen. The invasions of Saipan, Tinian, Iwo Jima, Okinawa, and other Pacific islands would also have been far more costly without the UDTs. (They were training for the invasion of Japan when the war ended, and Kauffman led the first team of Americans ashore in Tokyo harbor.)
Metzel pioneered or wholeheartedly supported a wide range of other untraditional, whacky-sounding, innovative operations, including the Navy “Beach Jumpers” (tactical deception units) and other key deception operations which kept Axis commanders confused about Allied intentions before the landings in North Africa, Sicily, Salerno, Normandy, and elsewhere. (Among his ideas were balloons covered in strips of metal foil and tethered to small boats to create the radar signature of a large invasion fleet.)
Metzel also fought tenaciously for funds, men, and logistical support for “the Sino-American Cooperative Organization,” or SACO, the anodyne name for the 3,000 naval commandos who operated with Chinese guerrillas and pirates far behind Japanese lines in China and SE Asia, conducting sabotage, ambushing truck and troop convoys, capturing cargo craft, providing intelligence and weather reports, and rescuing Allied pilots and sailors.
Capt. Metzel was a perfect example of the kind of thinker Scott is talking about, a brilliant innovator who was exceptionally productive and effective in part because he wasn’t afraid to be wrong.
Pray to Apollo and Dionysus for inspiration, channel spiritual energy into your Sacral Chakra, harness Blue and Red mana from water and fire and electricity, spend as many Sanity Points as you’re willing to part with to call upon Nyarlathotep, go into a ritual trance state of self-induced madness, take whichever inebriatory psychoactive substances get your creative juices flowing, remember that consistency is the hobgoblin of little minds, and (most importantly) write down every idea that comes to mind, as fast as you can, without hesitation or reservation, for as long as you can before succumbing to physical or mental fatigue – and you too can become a “black box” for mostly daft but occasionally salient ideas. Of course, the more intelligent and knowledgeable you are, the better your signal-to-noise ratio is going to be, although this method is going to have a very high percentage of noise even in the best cases.
Is there value in this kind of spontaneous freeform approach to reasoning? Most likely, especially in a world where most of the low-hanging fruit has already been picked and a certain amount of stretching is required to reach new insights.
I don’t think it’s the way in which the intelligent, creative people I’ve known worked.
Dionysus much more than Apollo, for this kind of idea. Maybe Eris too.
“If someone has 99 stupid ideas and then 1 seemingly good one, obviously this should increase your probability that the seemingly good one is actually flawed in a way you haven’t noticed.”
Which also implies that Newton’s Bible codes may actually have something to them in a way we haven’t noticed. Let’s look at it this way: out of the gazillion interesting problems in theology or Bible study, why would Newton pick this one if there wasn’t something genuinely interesting and promising about it? There is a big difference between an idea being false in the literal sense and being unworthy of attention. Worst case, you learn something about human psychology – the same way you can learn something about human psychology by studying why everybody was so taken with the luminiferous ether, for example.
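(Editor’s note: purely as a toy illustration of the quoted claim – none of this is from the original comment, and every rate below is invented – a minimal Bayes-rule sketch of how much a thinker’s hit rate should temper our enthusiasm for an idea of theirs that already “looks good” to us:

```python
# Toy Bayes-rule sketch; all numbers are invented for illustration.
def p_actually_good(base_rate, p_looks_good_if_good=0.9, p_looks_good_if_bad=0.1):
    """P(idea is actually good | it looks good to me), by Bayes' rule."""
    numerator = p_looks_good_if_good * base_rate
    denominator = numerator + p_looks_good_if_bad * (1 - base_rate)
    return numerator / denominator

print(p_actually_good(0.50))  # a 50% hit-rate thinker: ~0.90
print(p_actually_good(0.01))  # a 1-in-100 hit-rate thinker: ~0.08
```

Under these made-up numbers the seemingly good idea from the 1-in-100 thinker is indeed much more likely to be flawed – though, per the post, an 8% shot can still be worth checking.)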
Continuing the former thought and asking everybody: could you name the one book that came closest to convincing you that something like supernatural / parapsychology / magick / occult stuff is actually real?
As I mentioned above, there is a big difference between something being empirically untrue and being unworthy of attention. But most books in this area are very fluffy-fuzzy. My nomination would be one of the Stargate Project books – I forget the title and author, sorry. That one at least reads like a neutral description of events.
Much of magical realism makes me hopeful for a “greater purpose” and an “ultimate healing of all injustice”: Cloud Atlas, Philip K. Dick novels, some Shakespeare plays, G. K. Chesterton novels, some moments in Stephen King’s writing, Descent Into Hell by Charles Williams, “The Goddess of Everything Else”… and, getting away from magical realism, No Country for Old Men (the novel), Lost in the Cosmos by Walker Percy, Martin Gardner’s philosophy.
This reminds me of a problem I’ve run into a few times, where someone’s presenting themselves as an expert, and they tell me:
1) Some stuff that sounds reasonable that matches with common wisdom I’ve heard before
2) Some stuff that I can’t really evaluate
3) One thing that I know for a fact is false
And I’ve wondered about how to update based on this.
In one case, it was the cookbook Nourishing Traditions, in which the author claims to have done a bunch of research on historic food, and thus come up with advice and recipes for how to eat more healthily.
1) Avoid refined sugar? Sure, makes sense.
2) Avoid raw foods and ferment food as much as possible? Can’t really evaluate, seems plausible.
3) Tuna is a low mercury fish, because mercury sinks to the bottom of the ocean and tunas live high in the water column? Absolutely wrong and not how mercury works.
The book is trying to push weird novel ideas, and the mercury advice was one small note in a side column, so maybe I should just discard it and look at the main food advice. On the other hand, she’s supposedly getting the main food advice from all this research she did, and it makes me reluctant to trust the quality of her research if she got wrong this one thing I can actually evaluate.
The other is a GP who recommended me, among other things, a fad pseudoscience diet. ¯\_(ツ)_/¯
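(Editor’s note: one toy way to frame the updating question above – again not from the comment, and with invented error rates – is the same Bayes’ rule applied to trust in the source:

```python
# Toy sketch of the trust update: how much should catching one flatly wrong,
# checkable claim (like the tuna/mercury note) lower my confidence that the
# author's underlying research is careful? All numbers are invented.
def p_careful_after_error(prior_careful, p_error_if_careful=0.02, p_error_if_sloppy=0.30):
    """P(research is careful | I caught one claim that is clearly false)."""
    numerator = p_error_if_careful * prior_careful
    denominator = numerator + p_error_if_sloppy * (1 - prior_careful)
    return numerator / denominator

print(p_careful_after_error(0.70))  # trust drops from 0.70 to roughly 0.13
```

The exact numbers don’t matter; the whole update hinges on how often a careful researcher would let that kind of error through, which is exactly the judgment call the comment is wrestling with.)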
If we’re talking about things like leafy vegetables that could easily be eaten raw, people in the past probably avoided raw food to reduce risk of disease at least as much as to get more nutrition: raw food in preindustrial settings tends to have literal shit on it, because that’s the most convenient preindustrial fertilizer. You can imagine the kind of stuff that comes with that, but cooking inactivates most of that stuff. This is still true to some extent (every so often you hear about an E. coli outbreak linked to e.g. salad bars), but we’ve gotten a lot better at keeping food sterile.
Fermenting food is historically more about preservation than anything else. It may or may not have any significant nutritional effects, I don’t know. But fermented food is delicious, so I eat a lot of it either way.
I’ve never read the book, but it sounds like conventional wisdom mixed with woo.
(Should religious writers and mystics be included too? Some religious insights are excellent, helpful, imaginative, or even true. And there are tons of religious thinkers and mystics over the ages. I would be shocked if Moses Maimonides, the Gospels, and the Upanishads were not on such a list.)
I strongly agree we should always be working on expanding the canon, including more thinkers and sifting through the dross to find the gold, but Scott’s criteria essentially include everyone who is ever suggested to me.
> If you took away Eddington, someone else would have tested relativity; the bottleneck is in Einsteins.
Maybe at the moment, but some fields are already at the stage where confirming the latest theories requires ingenuity and expense beyond what is commonly available.
It’s pretty easy to imagine that at some point it will be easier to create new theories than to test them for most areas of human knowledge.
In today’s outrage culture, participants often struggle to separate the message from the messenger, and find it easy to give in to things like guilt by association. It’s unfortunate, because it causes us to dismiss what people say out of hand. (This can be because of past experiences with bad actors who held similar views, or just an assumption that anything they say is going to be disingenuous.)
Even while being aware of it, I find it hard to resist participating in it myself. Consider vegans. For the most part, I simply dismiss much of what they say out of hand because of their activists, the types who are essentially street preachers. Thing is, they may have some relevant points about diet and its importance to one’s health.
My dad was starting to get type 2 diabetes – the kind that can be reversed – and it freaked him out. He ended up changing his diet completely. First he started just eating whole, unprocessed foods, whole unpasteurized milk, etc. Eventually he went vegan – but for specific dietary reasons. Well, it improved his overall health dramatically. Now he is even exercising. He’s 5 foot 7 and went from 370 pounds down to 240 and dropping.
Without this personal, anecdotal window into vegan diets, I would have continued ignoring any actual reasonable points they make about diet.
I must admit it’s difficult for me to do this in other areas, however. You have to essentially re-train your “instincts” when it comes to making assumptions and presuppositions about people and what they say – as well as about their intentions. In today’s day and age, it’s rough.
Margulis is a bad example. It sounds more like figuring out mitochondria was something she accidentally got right, since she believed in Gaia and having two organisms cooperate fits with believing in Gaia. It’s like finding a misanthrope who believes that everyone is evil, and giving him credit for figuring out that Hitler was evil.
But it still resulted in a scientific breakthrough. The whole point is that it sometimes takes crazy people to generate correct hypotheses.
Perhaps the saying should be “If I have seen further, it is by looking in a different direction”.
Didn’t the originator of the big bang theory have some broadly religious reasons for thinking in that direction in the first place?
Lemaître was a Catholic priest, but he opposed conflating the theory with his religion and wrote an open letter to the Pope arguing against doing so. He also arrived at the solution by working through Einstein’s equations and Hubble’s data, so it was more of a math thing than a dogma thing.
And yet, with the exception of X/Google-X and research groups like Bell Labs, few companies are explicitly hiring (much less developing) original thinkers whose job it is to come up with actually novel and workable product ideas. Most job postings are looking for people who can improve efficiency or do within-the-box optimization. For every hundred product managers who can be hired at a dating-app company, few would seem to be capable of truly reinventing the very definition of one.
Grad school admission for research-based programs is also not explicitly about selecting people with a high capability for generating creative ideas and hypotheses; instead it seems to select for people who can learn how to test hypotheses well. This is done by looking at test scores and GPA, seemingly on the reasoning that people who do well at those things can probably understand t-tests, Bayesian stats, the implications of finite variance, and experiment design, and can learn and remember what has already been done in a given field. None of these is the same thing as coming up with interesting hypotheses in the first place. Maybe this is why, if you have a Ph.D. and can code, you too can be hired to optimize the click-through rates of digital ads or run the A/B testing of a website. Yet if you wanted to be an institution known for training the best, you’d think you would invest in developing measures of “capability for learning to produce actually novel ideas” rather than relying on the GRE subject tests.
Although a person capable of original ideas would seem to be more valuable in a societal sense, what are they to do, google for “Futurism Jobs?”
Consider Uber and Lyft. Both have self-driving car divisions, which is certainly innovative in one sense. But why didn’t the ride-sharing apps come up with the idea of scooters as another solution to the problem of urban transportation? Lime is now valued at $2 billion; either ridesharing company might have captured a similar amount of value had they beaten Lime to the punch. Instead, they seem to have formed or acquired their scooter divisions reactively. The question for Uber is whether setting up an ideas lab that figured out the scooter solution would have been cheaper than the price it paid for Jump Bikes. If you look at Lime’s job postings, it is currently spinning up operations teams in major cities around the planet, something both Lyft and Uber have already done. Smells like Sears vs. Amazon (though, in fairness, both Uber and Lyft have responded much faster).
Perhaps SSC readers know of Harry Nyquist — as reported in The Great Idea Factory, the Bell Labs patent lawyers wanted to know why some people were so much more productive (in terms of patents) than others. After crunching a lot of data, they found that the only thing the productive employees had in common (other than having made it through the Bell Labs hiring process) was that “… Workers with the most patents often shared lunch or breakfast with a Bell Labs electrical engineer named Harry Nyquist. It wasn’t the case that Nyquist gave them specific ideas. Rather, as one scientist recalled, ‘he drew people out, got them thinking.’” (Pg. 135)
Even fewer companies are hiring for Nyquists, though doing so would seem like a fantastic investment. Sure, it wouldn’t be trivial to find the right person. Traditional rounds of onsite interviews and take-home projects would not really tell you what you want to know, but the payoff could be well worth the price of rotating potential hires in with the day-to-day operations of an ideation team.
I’ll end by noting that it is interesting to compare tech and academia to the pop music industry. Clearly, what counts as “innovation” in the music industry is best understood not in terms of what technology is being used, but in the industry’s internal “unit of original idea,” which most of us call the catchy single. Max Martin, who has written or co-written a large share of the last 25 years’ worth of pop hits, is a clear analog to Harry Nyquist, as are his songwriting peers. The same goes for hit-making producers like Berry Gordy. (The music industry also uses songwriting speed-dating camps, wherein people are paired up round-robin to see which team can come up with an original, “good” hook.) Is the pop music industry more innovative than tech or academia when evaluated on its own terms? It explicitly hires creative people. I think this is a fascinating question.