I.
David Chapman keeps complaining that “Bayesianism” – as used to describe a philosophy rather than just a branch of statistics – is meaningless or irrelevant, yet is touted as being the Sacred Solution To Everything.
In my reply on his blog, I made the somewhat weak defense that it’s not a disaster if a philosophy is not totally about its name. For example, the Baptists have done pretty well for themselves even though baptism is only a small part of their doctrine and indeed a part they share with lots of other denominations. The Quakers and Shakers are more than just people who move rhythmically sometimes, and no one gives them any grief about it.
But now I think this is overly pessimistic. I think Bayesianism is a genuine epistemology and that the only reason this isn’t obvious is that it’s a really good epistemology, so good that it’s hard to remember that other people don’t have it. So let me sketch two alternative epistemologies and then I’ll define Bayesianism by contrast.
II.
Aristotelianism
Everyone likes to beat up on Aristotle, and I am no exception. An Aristotelian epistemology is one where statements are either true or false and you can usually figure out which by using deductive reasoning. Tell an Aristotelian a statement and, God help him, he will either agree or disagree.
Aristotelians are the sort of people who say things like “You can never really be an atheist, because you can’t prove there’s no God. If you were really honest you’d call yourself an agnostic.” When an Aristotelian holds a belief, it’s because he’s damn well proven that belief, and if you say you have a belief but haven’t proven it, you are a dirty cheater taking epistemic shortcuts.
Very occasionally someone will prove an Aristotelian wrong on one of his beliefs. This is shocking and traumatic, but it certainly doesn’t mean that any of the Aristotelian’s other beliefs might be wrong. After all, he’s proven them with deductive reasoning. And deductive reasoning is 100% correct by definition! It’s logic!
Anton-Wilsonism
Nobody likes to beat up on Robert Anton Wilson, and I consistently get complaints when I try. He and his ilk have seen through Aristotelianism. It’s a sham to say you ever know things for certain, and there are a lot of dead white men who were cocksure about themselves and ended up being wrong. Therefore, the most virtuous possible epistemic state is to not believe anything.
This leads to nihilism, moral relativism, postmodernism, and mysticism. The truth cannot be spoken, because any assertion that gets spoken is just another dogma, and dogmas are the enemies of truth. Truth is in the process, or is a state of mind, or is [insert two hundred pages of mysterianist drivel that never really reaches a conclusion].
Bayesianism
“Epistemology X” is the synthesis of Aristotelianism and Anton-Wilsonism. It concedes that you are not certain of any of your beliefs. But it also concedes that you are not in a position of global doubt, and that you can update your beliefs using evidence.
An Xist says things like “Given my current level of knowledge, I think it’s 60% likely that God doesn’t exist.” If they encounter evidence for or against the existence of God, they might change that number to 50% or 70%. Or if they don’t explicitly use numbers, they at least consider themselves to have strong leanings on difficult questions but with some remaining uncertainty. If they find themselves consistently over- or under-confident, they can adjust up or down until they reach either the certainty of Aristotelianism or the total Cartesian doubt of Anton-Wilsonism.
Epistemology X is both philosophically superior to its predecessors, in that it understands that you are neither completely omniscient nor completely nescient; instead, all knowledge is partial knowledge. And it is practically superior, in that it allows for the quantification of belief and therefore can have nice things like calibration testing and prediction markets.
What can we call this doctrine? In the old days it was known as probabilism, but this is unwieldy, and it refers to a variety practiced before we really understood what probability was. I think “Bayesianism” is an acceptable alternative, not just because Bayesian updating is the fundamental operation of this system, but because Bayesianism is the branch of probability that believes probabilities are degrees of mental credence and that allows for sensible probabilities of nonrepeated occurrences like “there is a God.”
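To make the fundamental operation concrete, here is a minimal sketch in Python (the numbers are invented for illustration): start from some prior credence, observe a piece of evidence with known likelihoods under each hypothesis, and Bayes’ Rule gives the new credence.

```python
def bayes_update(prior, p_evidence_if_true, p_evidence_if_false):
    """Return the posterior credence in a hypothesis after seeing the evidence."""
    # Bayes' Rule: P(H|E) = P(E|H) P(H) / [P(E|H) P(H) + P(E|~H) P(~H)]
    numerator = p_evidence_if_true * prior
    denominator = numerator + p_evidence_if_false * (1 - prior)
    return numerator / denominator

# Invented numbers: start at 60% credence and observe evidence that is twice
# as likely if the hypothesis is true as if it is false.
print(round(bayes_update(prior=0.60, p_evidence_if_true=0.8, p_evidence_if_false=0.4), 2))  # 0.75
```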
III.
“Jason” made nearly this exact same point on David’s blog. David responds:
1) Do most people really think in black and white? Or is this a straw man?
2) Are numerical values a good way to think about uncertainty in general?
3) Does anyone actually consistently use numerical probabilities in everyday situations of uncertainty?
The discussion between David and Jason then goes off on a tangent, so let me give my answer to some of these questions.
Do people really think in black and white? Or in my formulation, is the “Aristotelian” worldview really as bad as all that? David acknowledges the whole “You can’t really be an atheist because…” disaster, but says belief in God is a special case because of tribal affiliation.
I have consistently been tempted to agree with David – my conception of Aristotelianism certainly sounds like a straw man. But I think there are some inferential distances going on here. A year or so ago, my friend Ari wrote of Less Wrong:
I think there’s a few posts by Yudkowsky that I think deserve the highest praise one can give to a philosopher’s writing: That, on rereading them, I have no idea what I found so mindblowing about them the first time. Everything they say seems patently obvious now!
Obviously not everyone gets this Bayesian worldview from Less Wrong, but I share this experience of “No, everything there is obvious, surely I must always have believed it” while having a vague feeling that there had been something extremely revolutionary-seeming to it at the time. And I have memories.
I remember how some of my first exposure to philosophy was arguing against Objectivists in my college’s Objectivist Club. I remember how Objectivism absolutely lampshades Aristotelianism, how the head of the Objectivist Club tried very patiently to walk me through a deductive proof of why Objectivism was correct from one of Rand’s books. “It all starts with A = A,” he told me. “From there, it’s just logic.” Although I did not agree with the proof itself, I don’t remember finding anything objectionable in the methodology behind it, nor did any of the other dozen-odd people there.
I remember talking to my father about some form of alternative-but-not-implausible medicine. It might have been St. John’s Wort – which has an evidence base now, but this was when I was very young. “Do you think it works?” I asked him. “There haven’t been any studies on it,” he said. “There’s no evidence that it’s effective.” “Right,” I said, “but there’s quite a bit of anecdotal evidence in its favor.” “But that’s not proof,” said my father. “You can’t just start speculating on medicines when you don’t have any proof that they work.” Now, if I were in my father’s shoes today, I might still make his same argument based on a more subtle evidence-based medicine philosophy, but the point was that at the time I felt like we were missing something important that I couldn’t quite put my finger on, and looking back on the conversation, that thing we were missing is obviously the notion of probabilistic reasoning. From inside I know I was missing it, and when I asked my father about this a few years ago he completely failed to understand what relevance that could possibly have to the question, so I feel confident saying he was missing it too.
I remember hanging out with a group of people in college who all thought Robert Anton Wilson was the coolest thing since sliced bread, and it was explicitly because he said we didn’t have to believe things with certainty. I’m going to get the same flak I always get for this, but Robert Anton Wilson, despite his brilliance as a writer and person, has a really dumb philosophy. The only context in which it could possibly be attractive – and I say this as someone who went around quoting Robert Anton Wilson like nonstop for several months to a year – is if it was a necessary countermeasure to an even worse epistemology that we had been hearing our entire lives. What philosophy is this? Anton Wilson explicitly identifies it as the Aristotelian philosophy of deductive certainty.
And finally, I remember a rotation in medical school. I and a few other students were in a psychiatric hospital, discussing with a senior psychiatrist whether to involuntarily commit a man who had made some comments which sort of kind of sounded maybe suicidal. I took the position against commitment: “In context, he’s upset but clearly not at any immediate risk of killing himself.” One of the other students took the opposite side: “If there’s any chance he might shoot himself, it would be irresponsible to leave him untreated.” This annoyed me. “There’s ‘some chance’ you might shoot yourself. Where do we draw the line?” The other student just laughed. “No, we’re being serious here, and if you’re not totally certain the guy is safe, he needs to be committed.”
(before Vassar goes off on one of his “doctors are so stupid, they don’t understand anything” rants, I should add that the senior psychiatrist then stopped the discussion, backed me up, and explained the basics of probability theory.)
So do most people really think in black and white? Ambiguous. I think people don’t account for uncertainty in Far Mode, but do account for it in Near Mode. I think if you explicitly ask people “Should you take account of uncertainty?” they will say “yes”, but if you ask them “Should you commit anybody who has any chance at all of shooting themselves?” they will also say yes – and if you ask them “What chance of someone being a terrorist is too high before you let them fly on an airplane, and don’t answer ‘zero’?” they will look at you as if you just grew a second head.
In short, they are not actually idiots, but they have no coherent philosophical foundation for their non-idiocy, and this tends to show through at inconvenient times.
Probability theory in general, and Bayesianism in particular, provide a coherent philosophical foundation for not being an idiot.
Now in general, people don’t need coherent philosophical foundations for anything they do. They don’t need grammar to speak a language, they don’t need classical physics to hit a baseball, and they don’t need probability theory to make good decisions. This is why I find all the “But probability theory isn’t that useful in everyday life!” complaining so vacuous.
“Everyday life” means “inside your comfort zone”. You don’t need theory inside your comfort zone, because you already navigate it effortlessly. But sometimes you find that the inside of your comfort zone isn’t so comfortable after all (my go-to grammatical example is answering the phone “Scott? Yes, this is him.”) Other times you want to leave your comfort zone, by for example speaking a foreign language or creating a conlang.
When David says that “You can’t possibly be an atheist because…” doesn’t count because it’s an edge case, I respond that it’s exactly the sort of thing that should count because it’s people trying to actually think about an issue outside their comfort zone which they can’t handle on intuition alone. It turns out when most people try this they fail miserably. If you are the sort of person who likes to deal with complicated philosophical problems outside the comfortable area where you can rely on instinct – and politics, religion, philosophy, and charity all fall in that area – then it’s really nice to have an epistemology that doesn’t suck.
Is there supposed to be more to this? This feels like it ends in the middle.
Scott, thank you very much for this measured response.
It’s because I find so much right with LessWrong, and that I admire its aims so much, that I’m so frustrated with its limitations and (seeming) errors. I’m afraid my careless expressions of frustration may sometimes offend. They may also baffle, because I haven’t actually offered a substantive critique (or even decided whether to do so). I apologize for both.
Meanwhile: my complaint about Bayesianism is not that it is wrong (because it isn’t), but that it is extremely limited as an account of effective and accurate thought. Perhaps that is (indeed) an accusation of arrogance.
In my tweet you linked to,
I was alluding to the problem that when I say “Bayesianism is extremely limited”, I am told “Bayesianism doesn’t really have anything much to do with Bayes, it means being generally rational and paying attention to evidence.” Those are unambiguously good things, but if this is an accurate characterization of Bayesianism, it makes the term pretty nearly vacuous. A substantive discussion is going to need to find some middle ground between “always apply Bayes’ Rule” and “don’t be an idiot”.
One candidate for middle ground is “don’t be an Aristotelian or Wilsonian”, which I’d certainly agree with; but isn’t it pretty close to “don’t be an idiot”? I do think those are probably strawmen. (But, we’d need more empirical evidence about that!)
Also, “don’t be an Aristotelian or Wilsonian” leaves an enormous space of other possible epistemological views and methods, and it doesn’t seem fair to claim them all in the name of Bayesianism. (Like Cortez planting the flag for Spain. Would that I could conjure oceans of meaning, like the George Chapman you allude to!)
Coincidentally, muflax challenged me on this point two days ago: if not Bayesianism, then what? I went on a long hike yesterday, and my brain was autonomously writing a long blog post about that, while I was enjoying the scenery and avoiding falling off cliffs. I’m afraid its autonomous work is usually quite scattered, but I might post it soon anyway.
I think many people overestimate how visible their thoughts are to others. (The LW article about this general tendency is Illusion of Transparency, and to explicitly avoid it, I’ll state the reason I’m linking to LW: to make clearer what sort of patterns LW ingrains into people, and how knowing about those patterns can make one more effective.) If you go back and reread the first three paragraphs of this post, what impression do you get of LW? Is it an “I respect them but think that with nudging they could be even more awesome”? That may have been what you wanted to convey, but I think that’s far from the average reader’s impression.
It’s not clear to me what you mean by strawmen. There are actual people who actually believe those, and Scott gave several examples in his post. There are walking, talking Aristotelians and Wilsonians, and they make predictable mistakes because of their worldviews. A Bayesian (in the sense of someone who sees their beliefs as probabilistic entities, linked by conditional probabilities) will not make those predictable mistakes. They have an explicit methodology for operating in an uncertain world.
That’s the short answer I’d give you about what a Bayesian is: someone who sees uncertainty as a fact about their mind rather than external reality, who quantifies their uncertainty as both credence in facts and probabilistic dependence between those facts, and who makes decisions by expected value judgments.
That’s only the short version, of course, and what made it into my top three might not make it into another Bayesian’s top five. The main thing that I think is useful about Bayesianism the way I think about it is that it’s mechanistic. It’s one thing to say “your current beliefs should reflect all your knowledge,” and another thing to say “this is how you take a new piece of information, and try to find all of your beliefs that should be changed by that information.”
This is what seems useful about Less Wrong to me: many posts will suddenly crystallize for people something that they’ve sort of felt intuitively in some situations, and now they can apply it in many more situations. When you have a phrase like 2-place word, you can dissolve many contentious discussions very quickly. Perhaps in the heat of the discussion you could spontaneously realize that questions of subjectivity are relevant and invent a way to communicate that to the other discussant in a way that doesn’t make them think you’re undermining their position, but I have not seen that go well very frequently. But after reading that post, I now have the phrase “2-place word” pop into my head whenever I’m about to misuse terms that way, or I see someone else misuse them.
“this is how you take a new piece of information, and try to find all of your beliefs that should be changed by that information.”
This is where I wonder how you avoid getting caught in a loop: you need to evaluate that piece of information (is it reliable, is it pertinent, does it really affect every single one of my beliefs, is there a good reason to make such a wide-scale change, and so on), but if it is judged not on the bad old Aristotelian model of “true or false” but on the shiny new model of “probability that it is correct / probability that my belief about its correctness is correct”, how do you cut through the tangle of “I estimate such-and-such a probability that this particular piece of information is accurate, conditional on certain factors; then I estimate this degree of confidence in my estimation; then I assign a value to my estimation of my estimation” and keep on going?
You could – it seems to me – end up in the Charybdis of “Wilsonism” where nothing is certain at all while trying to flee the Scylla of “Aristotelianism” where black is black and white is white.
I’m not saying you would, or that people don’t have firm grounds to base their interpretation of “Bayesianism” on, but I’m asking: how does saying “I attribute this numerical value to my belief” really differ all that much from the example given in a comment below, where someone says “I believe it because it is true”? After all, couldn’t that just be verbal shorthand for “After considering the evidence of studies on diets high in saturated fat correlating with increased incidence of cardiovascular disease, the current consensus on best medical practice, anecdotal evidence of improvement in health by change of diet, and personal experience, I believe ‘saturated fats cause heart disease’ to be a true statement”, even if they are using an Aristotelian model rather than a mathematical formulation of “I believe it is true”?
Most probabilistic techniques accept fuzzy inputs. Generally there’s no value in going more than one or two steps deep: “I’m 90% certain I’ll arrive between 20 and 40 minutes from now” is generally good enough, but it’s sometimes worthwhile to also do the calibration step of “when I feel 90% certain, it actually only happens 70% of the time.”
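A minimal sketch of what that calibration step can look like in practice, assuming you keep a simple log of stated confidences and outcomes (the data below are invented):

```python
from collections import defaultdict

# Hypothetical record of predictions: (stated probability, whether it came true).
log = [(0.9, True), (0.9, False), (0.9, True), (0.9, False), (0.9, True),
       (0.9, True), (0.9, True), (0.9, False), (0.9, True), (0.9, True)]

buckets = defaultdict(list)
for stated, outcome in log:
    buckets[stated].append(outcome)

for stated, outcomes in sorted(buckets.items()):
    observed = sum(outcomes) / len(outcomes)
    print(f"felt {stated:.0%} certain -> happened {observed:.0%} of the time")
# felt 90% certain -> happened 70% of the time
```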
One of the reasons that expected value calculations made it into my top 3 is because I think those are important for guiding epistemology. Most beliefs I could have a narrow uncertainty on are not worth having a narrow uncertainty on. (For example, I could calculate and memorize prime numbers, but that would be a tremendous misuse of my life.) I try to make the amount of effort I spend on narrowing my uncertainty proportional in some way to the value gained from narrowing that uncertainty.
@ Vaniver — Thanks for the point about transparency! After re-reading those three paragraphs, I made a small change in wording, which might help slightly.
Re the strawmen. I believe that everyone does understand that everything is uncertain. So I have an alternative theory about what is happening in cases in which they appear not to. It may be impossible to convey it in a blog comment, but here’s the short version:
There are various “folk explanations of belief,” among them Aristotelianism and total relativism (“Anton-Wilsonism,” except he didn’t actually advocate that). These stories about belief are not accurate models for how anyone’s beliefs work. Moreover, at the meta level, no one even believes that these explanations are correct.
Rather, if we are going to have a discussion about belief, we need some simple shared conceptual framework in order to get ideas across. (The way belief actually works is incredibly complicated and mostly unknown, so that’s not an option.) These “folk explanations” are Schelling points for structuring a conversation about belief. We deploy different ones depending on what we’re trying to accomplish in the conversation.
As a hypothetical example, when someone says “I believe saturated fat causes heart disease because it is true,” it might sound like their brain actually runs on Aristotelian principles, or at least that they believe that’s how their brain works.
But maybe this was shorthand for:
“There’s all kinds of conflicting theories about diet and what we should eat, and most of it is Deepak-style woo. People have quasi-religious beliefs about food that just ignore the facts. However, the nutritional mainstream is based on solid science. I am aligned (tribally) with science, which finds actual truth, and against woo-meisters, and the nutritional mainstream says saturated fat causes heart disease.”
In other words, “because it is true” is shorthand for “because scientists say so, and they know how to find the truth.”
Do you quantify your uncertainty? How? How often? When addressing what sorts of problems? How well does that work?
(I do, very occasionally, quantify my uncertainty and apply probability theory. But if there’s a way of doing this frequently in everyday life, I don’t know it—and would like to!)
I agree. I think that most conversations about belief are benefited by a probabilistic structure, and after a few minutes of thought I can’t think of any where the Aristotelian or Wilsonian structures would be obviously superior.
Imagine a Wilsonian court room: “well, we can’t be certain the defendant is guilty, or even that a crime took place.” We currently have Aristotelian courts, and they seem insane. Replacing “beyond the shadow of a doubt” with something like “posterior estimate of guilt > .999” would do a lot to unify the epistemic standards of courts and make differences in belief more explicit.
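To give a feel for what a “> .999” bar demands, here is a rough sketch in log-odds form (all numbers invented): a prior probability of guilt plus a few independent pieces of evidence, each with its own likelihood ratio.

```python
import math

def posterior_after(prior, likelihood_ratios):
    """Combine a prior with independent likelihood ratios via log-odds."""
    log_odds = math.log(prior / (1 - prior))
    for lr in likelihood_ratios:
        log_odds += math.log(lr)
    odds = math.exp(log_odds)
    return odds / (1 + odds)

# Invented numbers: a 5% prior of guilt plus three independent pieces of
# evidence, each twenty times likelier if the defendant is guilty.
print(round(posterior_after(0.05, [20, 20, 20]), 4))  # ~0.9976, still short of .999
```

Even three pieces of evidence that are each twenty times likelier under guilt don’t quite clear the bar from a 5% prior; that’s exactly the sort of thing such a standard would make explicit.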
(Anecdote time: I remember hearing recently about someone who swallowed a bottle of pills and a bottle of vodka and died; the coroner was asked if there was any chance that he had done so on accident. The coroner was about to confidently say no when the lawyer noticed that, and interrupted the coroner, asking “was there at least a one in a million chance that he did it on accident?” Well, the coroner was forced to admit that he couldn’t be that certain, and so the death was ruled an accident, much to the delight of the family and the lawyer and to the dismay of the insurance company.)
Yes, but on different levels, depending on the needs of the situation.
For example, when making plans with other people, I try to give probability estimates. A “I think it’s 80% likely that I’ll show up to game night tonight” is much more informative than a “yes” or a “maybe.” Is that 80% well-calibrated? I don’t have enough data to say yet, but it will be eventually.
When evaluating beliefs internally, I don’t think I use specific numbers very frequently on a conscious level, but I’m sure I use specific numbers on an unconscious level. This is in the sense of measuring angles when playing pool: I couldn’t tell you where I need to shoot in degrees, but I can tell whether a shot looks wrong or right when lining it up, and there must be some math embedded in that algorithm.
I think the most benefit I’ve gotten out of trying to quantify uncertainty is by explicitly imagining other alternatives. It’s one thing to say “I’m pretty certain that X,” but I find when trying to assign a number to X after observing O I can’t just use P(O|X), the likelihood (which is easy to calculate); I need to come up with alternatives Y, estimate P(O|Y), and then use the likelihood ratio to parcel out probability to X and Y. My beliefs are noticeably more accurate and I am significantly less likely to be surprised when I do this than when I don’t.
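A sketch of that step with invented numbers: a prior split between X and a single explicit alternative Y, an observation O, and the likelihood ratio doing the actual work.

```python
def parcel_out(prior_x, p_o_given_x, p_o_given_y):
    """Posterior for hypothesis X against a single explicit alternative Y."""
    prior_y = 1 - prior_x
    joint_x = prior_x * p_o_given_x
    joint_y = prior_y * p_o_given_y
    return joint_x / (joint_x + joint_y)

# P(O|X) = 0.9 looks impressive on its own, but once the alternative Y also
# predicts O fairly well, the update in favour of X is modest.
print(round(parcel_out(prior_x=0.5, p_o_given_x=0.9, p_o_given_y=0.6), 2))  # 0.6
```

The high likelihood sounds like strong confirmation by itself, but once the alternative also predicts the observation reasonably well, most of the apparent force of the evidence evaporates.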
> For example, when making plans with other people, I try to give probability estimates. A “I think it’s 80% likely that I’ll show up to game night tonight” is much more informative than a “yes” or a “maybe.”
Is it really? I can understand how that is exasperating. You’re treating a variable whose outcome you have control over as if it were random. As a gaming buddy, an answer like “I’m a little tired, but I might” gives me a lot more useful insight; it doesn’t matter much to me whether you are 80% or 60% likely to come, but if we are gaming buddies, it matters to me how you’re feeling.
But the reason a criminal court puts the bar of proof so high (“beyond the shadow of a reasonable doubt”) is not because it’s grappling with a vague notion of what estimate of guilt is most likely, it’s because the penalties (imprisonment, execution for capital crimes) are so severe that it is considered better to risk letting the guilty go free rather than to punish the innocent*.
Naturally, lawyers have developed ways to win their cases (rather than be concerned with ‘did x do it or not’), and politics plays a part, and human corruption – if you’re up before this kind of judge, you want a system with wiggle room rather than one that may be a beautiful mathematical model for finding a verdict but has less flexibility.
*Or at least, it used to be so held – the coarsening of society seems to be inclining towards “Lock ’em up and don’t worry too much about if they did it or not”.
@ Vaniver (August 7, 2013 at 12:01 pm):
I suspect that people often frame discussions of belief in absolute, or absolutely relative, terms for simplicity and conciseness. Let’s say we’re talking about whether it would be ethical for us to have sex. I ask if you believe in God. If you say “p=0.0381”, we’ll have to have a discussion of Divine Command Theory and whether it applies. That might be tiresome in the circumstances. (Unless you’re one of those people for whom philosophy is foreplay. I had one girlfriend who— …but I digress.) If you say “no,” we can skip that.
On the other hand, I think everyone has conversations that go like this:
So I think everyone is capable of using and discussing degrees of belief, and I do think the straw men are straw. However, ozymandias’s point that people may fail to do what they are capable of, in some contexts, is an important one.
What you said about using explicit probabilities was interesting. I’ll do that more often and see how it goes.
The issue of numbers not always being the best way of representing uncertainty is an interesting one. Perhaps an alternative to Bayesianism would be some sort of ordinal ranking of beliefs then?
I don’t see this as a big problem, though; you can simply use ranges of values to indicate your uncertainty.
Ordinal rankings of beliefs reduce to real-number representations given transitivity (if A>B and B>C then A>C) and comparability (for any two beliefs A and B, either A>B, A<B, or A=B must hold).
No, that requires quite a few more additional assumptions! There are all sorts of total orderings that can’t be embedded in the reals. And one wants to be able to do more with probabilities than just compare them less and greater — for instance, one needs a notion of conditional probabilities. (And really from a pure Bayesian point of view, all probabilities are conditional.)
If we do accept that probabilities should be represented by real numbers, and accept that there are several things we want to do with probabilities other than just compare them for less-or-greater (specifically, have conditional probabilities and negation), then we find that Cox’s Theorem tells us that (subject to a “big-world” assumption) the resulting system must be isomorphic to ordinary probability (except for the whole “countable additivity” thing — which at the level we’re discussing here doesn’t seem that essential). However, this still leaves us with the question of why we would want to use real numbers in the first place.
In fact there actually is a good reason for using real numbers to represent probabilities, namely Savage’s Theorem. However it comes not from considering degree of belief by itself, but from considering decisions — Savage’s Theorem deals with agents that act according to certain axioms, and shows that such an agent must be acting as if it had a utility function over possible outcomes and a probability distribution over possible worlds and picking the action that yields the greatest expected utility. (Except that once again the “probability distribution” need only be finitely additive.) So note that the resulting probability distribution, which we typically think of as representing the agent’s beliefs, may also depend on its preferences! An event may be assigned probability 0 either because the agent simply considers it impossible, or because it’s an “end-of-the-world-fatal-error” scenario wherein if such a thing occurs, the agent no longer cares one way or another what else might happen.
Anyway my point is you don’t get real numbers without additional assumptions. In particular, you’re going to need some sort of Archimedean assumption, or else some sort of connectedness assumption or something.
I did forget to add finite to that list, but those three things guarantee that for any set of propositions, there exist real numbers that satisfy the ordering relations. Once you have those, any type of Cox’s Theorem derivation will result in a system equivalent to the real number system.
I’m not convinced that Cox’s theorem applies to actual existing real-world agents, just because I doubt transitivity actually holds.
But not uniquely. Having a numeric probability lets me calculate which bets I should take, but having only an ordinal ranking does not.
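A toy example of the difference (numbers invented): the expected value of a bet needs the actual probability, and no ordering of beliefs by itself supplies it.

```python
def expected_value(p_win, payout_if_win, loss_if_lose):
    """Expected value of taking the bet, in the same units as the stakes."""
    return p_win * payout_if_win - (1 - p_win) * loss_if_lose

# A bet that pays $10 against a $5 stake is worth taking at p = 0.4 but not at
# p = 0.3 -- knowing only that 0.4 > 0.3 doesn't tell you where the line is.
print(round(expected_value(0.4, 10, 5), 2))  # 1.0
print(round(expected_value(0.3, 10, 5), 2))  # -0.5
```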
Sure, but once you have a numerical representation, Cox’s Theorem gives you a unique numerical representation given the desiderata.
Ugh, I meant Savage’s foundations, not Cox’s theorem, sorry. Been thinking about it more, and I think I can formalize my objection to the transitivity assumption, which seems to torpedo the whole framework.
Suppose an agent is composed of simpler submodules, and its expressed preferences (i.e. actions) are assembled by polling its submodules.
Bam, voting paradox.
So transitivity is out.
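A toy illustration of the objection, with hypothetical submodules and options: each submodule ranks the options transitively, but majority polling over them produces a cycle.

```python
# Each hypothetical submodule ranks three options A, B, C transitively.
submodules = [["A", "B", "C"], ["B", "C", "A"], ["C", "A", "B"]]

def prefers(ranking, x, y):
    """True if this submodule ranks x above y."""
    return ranking.index(x) < ranking.index(y)

for x, y in [("A", "B"), ("B", "C"), ("C", "A")]:
    votes = sum(prefers(ranking, x, y) for ranking in submodules)
    print(f"{x} beats {y} by {votes} votes to {len(submodules) - votes}")
# A beats B 2-1, B beats C 2-1, C beats A 2-1: a cycle, even though every
# individual submodule is perfectly transitive.
```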
It seems something like
1. Bayesianism is a good example of and reasonable label for the process of holding specific degrees of uncertainty of belief as a matter of course, not only as special exceptions, which as Scott demonstrates is a significant advance over other philosophical worldviews.
1a. Albeit one that, once you have it, seems obvious, and pointless to have a special name for. Surely everyone does that? But Scott’s point (which I’m not sure of, but seems right) is that we CAN’T take it for granted.
2. However, Bayesianism or rationalism as a culture tacks onto that everything it sees as generally a good idea for rationality, which it sees as flowing from Bayesianism; I’m mostly OK with this[1], but someone else might legitimately think it is stupid.
[1] I think LW has systematic OTHER problems, but I’d not previously thought about that one 🙂
>Those are unambiguously good things, but if this is an accurate characterization of Bayesianism, it makes the term pretty nearly vacuous.
Goal statements like this are usually in some measure vacuous. For example, feminism “aims for equal economic, political and social rights for women”. Most westerners agree trivially with the aims as stated (they’re now applause lights, in LW-speak), and yet many westerners are not on board with how feminists cash out the above in terms of actual laws & norms – hence the need for a feminist movement.
The substantive content of Bayesianism/rationalism is not “be more rational”. It’s “here’s how we think you should be more rational”, followed by a (tentative) list of techniques.
>One candidate for middle ground is “don’t be an Aristotelian or Wilsonian”, which I’d certainly agree with; but isn’t it pretty close to “don’t be an idiot”? I do think those are probably strawmen.
No, those are not strawmen. I promise you, they are not strawmen. If you think they are strawmen, I can only assume your social circle is very exclusive (if so, fair enough). A good exercise might be to ask somebody outside your normal social circles a question like the ones Scott mentioned above. You will probably be shocked. An educated, fairly smart relative of mine recently refused to make a certain very obvious lifestyle change for health reasons on the grounds that “I’ll go when I go, there’s nothing I can do about it.”
Seconding the case that these are not strawmen, even among intelligent and high-powered people.
Bayesian probability theory can be derived from a handful of desiderata that are pretty close to “don’t be an idiot”. From Jaynes’s Probability Theory: The Logic of Science:
Now it may be true that knowing Bayes’ Theorem doesn’t help one’s reasoning, or that the members of Less Wrong are all suffering from some failure in reasoning. But if you’re looking for middle ground between “don’t be an idiot” and Bayes, there’s not much to find.
Catholics believe they can *rationally* prove the existence of God by pure, Aristotelian reasoning. Lots of people who aren’t Bayesians would be insulted if you suggested they weren’t rational. So “Bayesianism” is a useful term because it denotes a particular method of trying to be rational.
Maybe the best way to solidify this disagreement is for you to give an example of a non-Aristotelian, non-Wilsonian, intelligent and useful epistemology that you think contradicts Bayesianism?
(I’m not sure whether I should be saying “contradicts” or “is different than”. Certainly there are non-Bayesian things which are very useful to thought and reasoning, but I would expect them to either work in a different domain than Bayesian reasoning, or to be equivalent to it. To give an example, evolution is a correct Grand Unified Theory Of Biology, but genetics, molecular biology, etc are also correct Grand Unified Theories of Biology, and selfish-gene evolution sounds a lot different than evolution-by-organism even though both are broadly correct.)
Yes, “is different than” the actual technical Bayesian methods. (Although “contradicts” Bayesianism if that is the claim that Bayesian methods are all you need to know about epistemology.)
The technical methods are more-or-less right ways of solving the problems they address. If “epistemology” means “updating real-valued belief probabilities,” then (setting aside technical quibbles) Bayes is the right answer. So, yes, alternative methods address other problems, rather than contradicting Bayes.
But “epistemology” is not that. “Epistemology” is figuring out what is true or useful.
The point I keep making is that situations in which Bayesian methods actually apply rarely arise in practice. And no one ever replies “Yes they do! I do Bayesian calculations with numbers all the time!” Instead, they say that Bayesianism is a whole theory of epistemology, despite the fact that you can rarely actually use its technical methods. And, so far, I can’t make sense of that claim (except as religious dogma).
What I’d suggest instead is that probability is a small (important, but small) part of epistemology. The reason you can rarely use Bayesian methods is that Bayesianism is not a general theory of epistemology, it’s a theory of subjective probability.
The danger in Bayesianism is that by positing Bayes as THE ANSWER, it leads you to ignore the rest of epistemology, i.e. figuring stuff out.
I’m working on a blog post about this; I hope to finish it later this week.
Thanks for the explanation, this makes your position nicely clear. Yes, I would agree that Bayes is certainly not the whole of epistemology, or at least no one has shown it to be so. For one thing there is the problem of dealing with logical vs empirical uncertainty which I haltingly tried to figure out here, without much success I think.
Thanks, Ian, I enjoyed the jellyfish!
One of my complaints about Bayesianism as a theory of epistemology (rather than as a theory of probability) is that it lumps all kinds of uncertainty together. Here is an off-the-top-of-my-head list of types:
* inherent effective randomness, due to dynamical chaos
* physical inaccessibility of phenomena
* time-varying phenomena (so samples are drawn from different distributions)
* sensing/measurement error
* model/abstraction error
* one’s own cognitive/computational limitations
These need to be dealt with in different ways, and mushing them all into a single number reliably leads to suboptimal inference.
Maybe the best way to solidify this disagreement is for you to give an example of a non-Aristotelian, non-Wilsonian, intelligent and useful epistemology that you think contradicts Bayesianism?
‘Contradicts’ is a moveable goalpost, since the Sequences contain a lot of heuristic argument that isn’t directly related to Bayes but can be included or excluded from pop-Bayesianism depending on how much Eliezer one has read.
Having noted that:
— Modern finite-valued logic
— Quantum logic
— Any of the high traditions of Buddhist epistemology, e.g. Nagarjuna’s
— Platonism, which apparently a lot of working mathematicians still believe in
— Korzybski’s General Semantics
— Robert Anton Wilson’s meta-system, which emphatically isn’t “Anton-Wilsonian”
May I suggest ‘postmodernist’ instead of ‘Anton-Wilsonian’?
Quantum logic? What do you even mean by that?
Re: intelligent and useful epistemologies, what about Frequentist Statistics?
This is a good one! I definitely know of several elite, highly intelligent people who think much more in terms of significance testing than Bayesian updating.
I feel like both sides are mostly talking past each other here. I view you as having mounted four main critiques:
1) Bayesianism is an example of eternalism, in so far as it eliminates meta-uncertainty (uncertainty regarding how uncertain one should be).
2) It is difficult to pin down exactly what Bayesianism means, as people associated with CFAR/LW put forth many different ideas, and you have received many different responses so far.
a) If Bayesianism just means that we should think using numbers, then great! However, this is obvious/generally accepted on an intuitive level, it’s not clear why this should be labelled “Bayesianism,” and LW/CFAR people have put forth many other ideas as well.
3) Using probabilities and Bayes’ Theorem is generally infeasible in real life, and is therefore the wrong thing to teach. Frequencies have been shown to be more effective at debiasing.
4) LW/Eliezer Yudkowsky’s writings in particular are similar to a religion.
Is this an accurate summary of your position?
Alex,
Sorry, no, that’s not at all accurate. But it’s natural that you would have difficulty summarizing my critique, because I haven’t written it yet!
My blog post which Scott wrote about was explicitly not a critique. It was meta to that. It discussed reasons I might write a critique, and some non-technical reasons it would be difficult. And it wondered out loud whether it would be worth writing, and asked readers for opinions.
Actually writing it would be a lot of work, and I’m still unsure if it’s worth doing. The question is “who would benefit and how?” I’m definitely not interested in just beating up on bad ideas (much less on a community I admire in many ways). It’s only if the critique leads to a better alternative that it could be worthwhile.
The post I’m currently writing points to a possible better alternative, skipping most of the critique. Unfortunately I don’t have much time to write, so it’s a big brain dump, and it might not be any use.
But if there’s enthusiasm for it, follow-up is possible, including the detailed critique of Bayesianism.
I think the most enlightening part of this post for me is the reminder that you only need to think through things (like epistemology) when you are outside your comfort zone. For example, you don’t need to worry whether 2+2=3 unless you are a mathematician. You only need to worry about ballistics if your job is to design a better football, not to throw it.
So, a related point. Does your version of Bayesianism use the concept of truth? Is it something like “high enough probability for a given purpose”? Or maybe “something is true (for me) when it’s inside my comfort zone and I don’t need to think about it”? Until some new evidence comes along and a belief that was inside one’s comfort zone suddenly becomes suspect. Like if you throw the ball straight up high enough and it consistently veers in one direction, leading you to ponder the Coriolis force or something. Until you develop a new set of intuitions and don’t have to think about what’s true in relation to a given phenomenon anymore.
Yes, the comfort zone bit was particularly insightful.
For me, when the product of probability with relevance of being wrong gets low enough, I stop thinking about it. Does that work?
This…uh, actually, was explicitly RAW’s approach. (He used a 0-10 scale.) He lacked Bayes’ analytic framework for formally updating, possibly because he got sidetracked trying to apply Shannon to the problem. But your frame is unfair.
Which of his books is this in? I thought I’d read most of them, and I don’t remember this even though I do remember a lot of very explicit “I do not believe anything” type of stuff.
I don’t remember this example either, but I do remember RAW saying such things. OTOH, he empirically didn’t say them often enough, as I have had to defend him on this point to multiple people.
From http://www.deepleafproductions.com/wilsonlibrary/texts/raw-inter-99.html:
A scale of 0 to 10 is nonstandard, of course, but we already have two scales: the mathematician’s [0,1] and the more common 0%-100%. Normalizing, his last statement translates to a probability interval 0.698 < p < 0.708.
RAW’s ‘I do not believe anything’ is a statement about avoiding certainty–that is, if your epistemology tells you that p = 0 or p = 1, your epistemology is broken. Bayes also tells you this, if you read it correctly. No amount of evidence can ever update you to p = 0 or p = 1 unless you start with a bad prior.
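To spell that out (this is just Bayes’ Rule, not anything specific to RAW): if the prior is strictly between 0 and 1, and the evidence has nonzero probability under both the hypothesis and its negation, then

```latex
P(H \mid E) \;=\; \frac{P(E \mid H)\,P(H)}{P(E \mid H)\,P(H) \;+\; P(E \mid \lnot H)\,P(\lnot H)}
```

has two positive terms in the denominator, so the posterior can approach 0 or 1 but never reach either with any finite amount of evidence.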
Pretty sure your Heuristic 81 is Nassim Taleb.
Yep https://twitter.com/nntaleb/status/329958458747150338
Great post. I agree that Bayesianism is the best epistemology that humanity has discovered, and is less obvious than it seems in hindsight. I also attach a high probability to the hypothesis that a more accurate epistemology has yet to be discovered.
One rough edge is that probability theory is derived from axioms that are normally taken for granted. If the axioms are assigned a probability less than 1—as seems correct—then the epistemology at least becomes less simple and beautiful. I recall that MIRI recently published a paper on this kind of problem.
A second rough edge is that statements, mathematical or otherwise, about “agents” that assign probability to hypotheses contain concepts that are still quite vague, and don’t necessarily cleave reality at its joints. If humanity were to learn something new and surprising about the meaning of agenthood or computation, Bayesianism might be relegated to the status of good approximation.
Does anyone actually consistently use numerical probabilities in everyday situations of uncertainty?
Does anyone consistently use deductive logic in everyday situations? If by consistently, you mean always, then no. Otherwise, yes. And the same applies to numerical probabilities.
Also, like using verbal logic, using numerical probabilities is something that must be learned and that doesn’t work well the first n times you try it. It may take years to become good at it. But yes, some of us are trying to do precisely that.
Your source:
https://twitter.com/nntaleb/status/329958458747150338
I immediately thought your version of Aristotelianism was a straw man until I realized that I used to think like that and several of my friends still do. For example, I asked one of them why they believed something* and she responded with “Because it’s true”.
*I asked why she thought that saturated fat causes heart disease.
“The Arguments From My Opponent Believes Something are a lot like accusations of arrogance. They’re last-ditch attempts to muddy up the waters. If someone says a particular theory doesn’t explain everything, or that it’s elitist, or that it’s being turned into a religion, that means they can’t find anything else.
Otherwise they would have called it wrong.”
Fantastic point.
Wow, that post was awesome (caveat below). It had about five different statements where I felt “wow, that was so right, and it’s so obvious, but I didn’t think of it like that before”.
(My caveat is that I need to go grok David’s post before I endorse this as a rebuttal :))
Your disagreement wrt St John’s Wort seems to me much more to do with what Eliezer calls Traditional Rationality. TR emphasises reason as combat. Beliefs must be justifiable – ie you must be able to put a good case in an imagined argument about them. Ideas like fallacies and burden of proof are all to do with these imagined arguments. In suggesting that anyone take St John’s Wort, you are making a claim, a claim for which you must provide sufficient evidence. No-one need provide any reason not to take it; no claim is thus implied.
I don’t think this is actually true. Oftentimes a person’s first impression of a new philosopher or philosophy is a System 1 emotional response to status or tone rather than a measured argument about the content. As a result, calling someone arrogant is not always a rhetorical strategy designed to muddy the waters–it can be a genuine reaction to how a new idea feels, or a kind of catharsis. I’ve seen people react this way to LessWrong, and after they’ve calmed down they are willing to talk about things they agree and disagree with.
In addition, if someone disagrees with you or your ideas, they may announce to everyone that you are arrogant, because this is very socially damaging, but not even bother to talk about whether you are right or wrong. They might even have good arguments against your position, but not bother to use them, because social contests are not won by good arguments. (See also: American politics.) Given all of this, “Heuristic 81 from Twitter” doesn’t seem very trustworthy.
Perhaps it might be phrased less catchily but more accurately as “If a charitable person is trying to have a serious discussion with you, and they are very familiar with a position and had a lot of opportunities to respond to it, but all they ever say is that it’s arrogant, it provides some evidence that they can’t find anything else” ?
I definitely agree with that formulation, but it’s a heck of a lot weaker than the original. There’s not a lot of charity in these sorta-political debates about where to use probabilities, and there’s a great deal of inferential distance to cover.
Yes, but it’s possible that that the reason that they can’t find anything else to criticise is because there isn’t anything else there – i.e. the position is “not even wrong.” Ironically, that is definitely my opinion on Taleb, from whom the quote seems to originate.
That would also be the extreme version of David Chapman’s critique.
If a charitable person consistently repeats “X is arrogant” without justifying or explaining, it provides some evidence that I was wrong to think of zir as charitable.
“Those arguments aren’t dangerous because they’re never true. They’re dangerous because you can always make them, whether they’re true or not.”
In other words, the likelihood ratio is one, so by Bayes’ Theorem the posterior is the prior. I just thought it was neat that Bayes’ Theorem turned up in a post defending Bayesianism!
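In the odds form of Bayes’ Theorem this is one line:

```latex
\underbrace{\frac{P(H \mid E)}{P(\lnot H \mid E)}}_{\text{posterior odds}}
\;=\;
\underbrace{\frac{P(E \mid H)}{P(E \mid \lnot H)}}_{\text{likelihood ratio}\;=\;1}
\;\times\;
\underbrace{\frac{P(H)}{P(\lnot H)}}_{\text{prior odds}}
```

With a likelihood ratio of one, the evidence leaves the odds exactly where they were.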
Speaking of likelihood ratios, I consider the Bayesian definition of evidence to be a core facet of Bayesianism, directly linked to Bayes’ Theorem, and not just common sense (it’s compatible with most intuitions, but I don’t think most people have a clear formulation of it). Thus I’m disappointed that neither Scott nor David mentions it.
The problem with Bayesian epistemology (ignoring the fact that the historical Bayes probably would not agree) is that it rests on the assumption that empirical evidence can be relied upon, the assumption that memory is reliable enough for science, and the assumption that the laws of physics won’t suddenly change.
If all three assumptions hold, then Bayesianism works. However, a proper Bayesian solution to the problem of the Skeptic doesn’t exist. The only real merit to an Aristotelian system is that it is at least capable of acknowledging and considering the problem.
Actually, you can (and should) assign non-0 or non-1 probabilities that a given piece of observational evidence actually occurred (rather than apparently occurred), that a particular memory was correctly stored and retrieved (without corruption), and that the laws of physics will remain constant, so no, Bayesianism does not require any of those assumptions. Each of those assumptions is pretty decent as far as these things go, but it would be no problem for a consistent Bayesian to modify the probabilities assigned to them in light of new evidence.
If you are looking for assumptions that need to be true for Bayesianism to get off the ground, you should look towards the axioms of probability, Cox’s theorem, etc…
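For concreteness, here is a minimal sketch (invented numbers, and a deliberately crude model) of how a Bayesian can fold the reliability of an observation into the update itself, instead of assuming the observation is certainly veridical:

```python
def update_with_unreliable_evidence(prior_h, p_report_if_h, p_report_if_not_h, p_reliable):
    """Posterior on H given a report that is only reliable with probability p_reliable.

    Crude model: an unreliable report carries no information at all, i.e. it is
    equally likely to appear whether or not H is true.
    """
    eff_if_h = p_reliable * p_report_if_h + (1 - p_reliable) * 0.5
    eff_if_not_h = p_reliable * p_report_if_not_h + (1 - p_reliable) * 0.5
    joint_h = prior_h * eff_if_h
    joint_not_h = (1 - prior_h) * eff_if_not_h
    return joint_h / (joint_h + joint_not_h)

# Invented numbers: evidence that would be strong if fully trusted (0.9 vs 0.1)
# moves a 50% prior much less when the source is only 60% likely to be reliable.
print(round(update_with_unreliable_evidence(0.5, 0.9, 0.1, 1.0), 2))  # 0.9
print(round(update_with_unreliable_evidence(0.5, 0.9, 0.1, 0.6), 2))  # 0.74
```

The point of the sketch is only that “the evidence might be wrong” is itself something you can assign a probability to and condition on, not a hole in the framework.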
How are you meant to assign a probability (other than 0.5 based on a certain famous idea) to the general idea that memories have a correlation with reality, that the laws of physics remain constant etc, without reference to memory, the laws of physics etc?
> the general idea that memories have a correlation with reality
That’s not a hypothesis, it’s a fuzzy description of a subspace of the possible hypotheses. The maxent prior that any of those hypotheses are correct is the fraction of the hypothesis-space they collectively cover.
Imagine an ensemble of theories to explain the apparent coherence of your subjective experience. Those that do not specifically predict such a degree of coherence will be strongly suppressed. Among those that predict the coherence, the explanations with by far the highest priors (due to lack of excess complexity) will be those that claim that memory and senses work decently well.
This whole exercise seems like it’s just confusing the heuristic used to generate the map with the territory. Even by LW’s standards, it saddens me a bit inside.
Could you elaborate on this, please? I’m rather curious.
Bayesianism is a heuristic that operates entirely on belief-space. Both Aristotelianism and Anton-Wilsonism assume that you can and probably do have heuristics for developing your beliefs. Maybe Bayesianism is a good heuristic, but that doesn’t mean much.
I definitely don’t understand what you mean by this.
I think he’s saying that arguing over whether Bayes is ‘right’ assumes that there is something objectively true about a method, and therefore… something something.
Bah.
‘One of the wisest things I ever saw on Twitter (which is a low bar, sort of like “one of Hitler’s most tolerant speeches”)’
What the Hell, man?
I’ve been reading RAW a lot recently. Maybe I’m just filtering what he says through what I already believe, but I don’t really see his message being as simple as “there is no absolute truth, so be unsure of everything”. I feel like he’s mainly concerned with bringing about inner change (unlike Yudkowsky, who’s mainly concerned with saving the world), and his concept of multiple “reality-tunnels” is very useful for that purpose. David Chapman (while we’re on the subject) talks about something similar here. An example of when multiple reality tunnels are useful in everyday life is seeing good intentions in everyone you meet, which allows you to be nice to them more easily, while simultaneously acknowledging the self-serving way in which most of us act, e.g. status signalling, in order to better predict people’s actions.
I also get the impression from RAW that he seems to think his philosophy can be dangerous to people who don’t have a strong concept of rationality. I think he mentions both Leary and Crowley requiring that people study science, math, and logic before letting them do acid and occult rituals, respectively.
But maybe my interpretation is flawed.
BTW, my experience with re-reading LessWrong is identical to yours and Ari’s.
David Chapman link didn’t work: http://approachingaro.org/visionary-and-objective-truths
I agree that most of his work is about inner change and so on. I just think that when he explicitly digresses into epistemology, that is the epistemology he supports. Have you read Cosmic Trigger?
The first one, yeah.
In Prometheus Rising, he is explicitly and especially critical of third-circuit types who he calls… Rationalists, with a capital R. In that book, he is explicitly trying to guide people toward inner growth, and lays on pretty thick the message that while it’s a stage of one’s development that you have to go through, it would be a deadly mistake to think that such things are of central importance, rather than one tool that occasionally is useful.
If you want an example of a field whose name refers to a part, how about Bayesian Statistics?
Baptists – yes, other sects have baptism, but they’re Doing It Wrong; adult baptism is an important differentiator, though hardly the only one. “Quaker” and “Shaker” are insults adopted by the targets, which is why they aren’t terribly useful. The official names of the sects are more descriptive. The Quaker sect is the (Religious) Society of Friends, and the adherents are Friends. The name refers to a lack of priests, though it isn’t clear if you don’t already know what it means. The Shakers are the United Society of Believers in Christ’s Second Appearing. Sometimes they are called “Believers,” but that is even less convenient than “Friends,” hence the niche for another name. The eschatology is pretty important.
David Chapman: I think you can make Bayesianism more specific than simply not being an Aristotelian or a Anton-Wilsonian. The Bayesian interpretation of probability is that probability is a measure of subjective uncertainty about the state of the world.
Thus, Bayesianism is the idea that we should represent our uncertainty with probabilities.
Probabilities aren’t the only non-stupid (by which I mean non-Aristotelian and non-Anton-Wilsonian) way to deal with uncertainty. There’s also non-classical logics like fuzzy logic.
If we take Bayesianism to mean “Use Probabilities for Everything!” then it’s clearly non-stupid and non-trivial. I think you can make the following critiques of Bayesianism though:
1. The Problem of Ambiguity: Consider an Aristotelian and an Anton-Wilsonian talking about whether a particular movie is a good movie. The Aristotelian says “Either the movie is good or it isn’t good.”, The Anton-Wilsonian says “Movie quality is a state of mind.” The Bayesian whips out his calculator, does some multiplications, and declares: “Gentlemen: it is 70% likely that the movie is good.”
The Bayesian Strategy of using a single real number (between 0 and 1) to handle uncertainty is glib and inappropriate. What would be better is to disambiguate the statement, or ask a series of sub-questions whose answers would deal with the motive for the original query.
2. The Mystery of the Missing Priors: Where do we get our initial probability estimates from? Bayesians have techniques to elicit estimates of probabilities, such as offering bets, but one can’t help feeling that the numbers may as well be produced by consulting tea leaves. (Note: The Work of Ray Solomonoff provides a theoretically sound, but uncomputable (and hence practically infeasible) method for answering the above question.)
3. Computational Inefficiency: Ignoring the prior problem, applying Bayes Theorem and rigorously keeping track of all the probability estimates we make is just too darn difficult. Nobody has the time to do a multiplication for every new piece of evidence and every question that this piece of evidence may shed light on.
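One standard partial answer to the bookkeeping problem (not to the prior problem): work in log-odds, so each new piece of evidence is a single addition rather than a fresh round of multiplications. A sketch with invented numbers:

```python
import math

def to_log_odds(p):
    return math.log(p / (1 - p))

def to_probability(log_odds):
    return 1 / (1 + math.exp(-log_odds))

# Invented numbers: start with a 30% prior, then fold in three pieces of
# evidence with likelihood ratios 3, 2, and 0.5 (the last points the other way).
belief = to_log_odds(0.30)
for likelihood_ratio in [3, 2, 0.5]:
    belief += math.log(likelihood_ratio)

print(round(to_probability(belief), 2))  # ~0.56
```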
“The Bayesian Strategy of using a single real number (between 0 and 1) to handle uncertainty is glib and inappropriate”
I don’t see how this follows from your example. 70% subjective probability seems like a reasonable number given that it was calculated for that particular person or group.
” Nobody has the time to do a multiplication for every new piece of evidence and every question that this piece of evidence may shed light on.”
Yes, applying Bayes to everything is intractable; one of the points of Bayesianism is to aim for the best approximation you can manage with your time and resources.
The point of the example was that there are some statements, like “this movie is good” that require further disambiguation before we can attach probabilities.
Some things that “this movie is good” might mean:
1. This movie will win an Oscar.
2. This movie will have a score of 90% or more on Metacritic.
3. I will not feel buyer’s remorse after watching this movie.
4. This movie will make a profit.
5. I will think differently about some issue after watching this movie (and the change in thinking will be regarded by my peers as laudable)
6. This movie will be remembered 10 years from now.
Slapping on a number, even a reasonable number, is no substitute for disambiguation. Using probabilities only really works when you’re quantifying your uncertainty over fairly precise claims.
Precisely this. Part of David's point, and something I'll blog about soon myself, is that this kind of disambiguation is a big part of any working epistemology – indeed, substantially bigger than probability theory.
Patrick, you are completely right that such generalizations of probabilities would be bad and it is possible that some wanna-be bayesians fall into the trap of thinking that a single probability for your reading of a question answers all possible readings of the question.
However, Bayesianism has nothing against disambiguation of this form. The point is that you choose the specific statement that you care about in a particular instance – e.g. whether *you* will like the movie at this particular point in time and then do the math to come up with a subjective probability of what the chances for that are.
No decent bayesian (by LessWrong’s standards) will tell you that the number that they came up with is going to answer the other meanings of the question – in fact they are likely to try and do some new math and give you different numbers for every specific reading of the question which you point out.
@ Patrick Robotham — Thanks, yes, these are some of the major problems with applying Bayesian methods in practice.
For these reasons, actually using Bayes’ Rule is very rare, except in special domains (mainly ones where there is enough frequency data to get sensible priors).
So what happens when one points this out to pop Bayesians? Do they say "no, you're wrong, I calculate with Bayes' Rule a hundred times a day, like when I can't remember where I put my keys?" No. Some advocate just inventing numerical priors (Scott took this tack recently), but there are big problems with that, and it seems to be a minority approach.
Mostly, it seems pop Bayesians retreat to “qualitative Bayes.” What’s that? I haven’t yet seen a clear statement of how it works. (Nor any evidence that it does work. If you’re a Bayesian, shouldn’t you want to validate “qualitative Bayes” empirically?)
It’s something like “always bear in mind that the world is uncertain and you should take evidence into account.” Which is good advice, but not hugely more than “don’t be an idiot.”
Your argument takes the form “If you really believed X then you would believe Y. But Y is absurd hence ~X”*
Obviously it's the first part of the argument that I have a problem with. It is sort of like asking a basketball player who "believes in" physics why he doesn't write down the trajectory of the ball in mid air on a piece of paper, calculate where it's going to land, and then react accordingly. Well, it's because there are other things, like time efficiency, to consider. Physics correctly predicts the path of the ball, but in practice it's hard to apply.
To make the connection more explicit. The Bayesian argues that probability is the most accurate way of updating your beliefs based on evidence. He does not argue that it is the most efficient in practice.
*It's a frighteningly common argument. For example: "If you were really an atheist you must believe that the universe exploded out of nothing and life spontaneously appeared."
This prompts my “shifting target” complaint. Elsewhere in this thread, there are multiple definitions of Bayesianism that contradict this; for example Vaniver’s “a Bayesian is someone who quantifies their uncertainty.”
It would be unreasonable to expect complete agreement among Bayesians, but it could be useful for you all to agree on a finite list of definitions, so you'd have a clear idea of which one you were actually advocating, and they could be evaluated individually, instead of "Bayes!" being an applause light.
I don't agree that "probability is the most accurate way of updating beliefs" (there's a ton of background assumptions needed for that to work). But let's assume it for the sake of the argument. Then what? If actual Bayesian calculations aren't useful in practice (as you seem to agree), then what's the alternative? What good does knowing the "most accurate" thing do if you can't use it?
“Approximations,” maybe; but which approximations? How well do they work? (How do you know?) How many situations can they actually be used in?
I don't claim to speak for other people, however: quantifies their uncertainty ≠ assigns a specific numerical probability to every belief they hold.
Gah! Not applicable in every problem domain ≠ not useful in practice.
Newtonian physics is not useful for playing table tennis if by Newtonian physics you mean writing down the trajectories and then calculating their future positions. But if by Newtonian physics you mean generally understanding how objects change their trajectories then it is useful. Moreover, writing down trajectories and calculating forces IS useful in many problem domains. Table tennis just isn’t one of them.
Heuristics.
I’m not particularly interested in X-rationality (using things like Bayes and Decision theory to improve one’s thinking), however, I think that CFAR addresses your question in much greater depth.
Hmm. I would like more specificity. What heuristics? (Could you please name three, say?) How well do they work? How do you know?
David: “This prompts my ‘shifting target’ complaint. Elsewhere in this thread, there are multiple definitions of Bayesianism that contradict this.”
Do you agree with “epistemology X” as described above – that is, beliefs are degrees of probability which can be represented (more or less accurately) by numbers?
Scott — I agree that “you are not certain of any of your beliefs [but] you are not in a position of global doubt, and that you can update your beliefs using evidence.” (With possible quibbles about what “certain” means, maybe.)
I don’t agree that this implies that numerical belief probabilities are a good way to think about uncertainty or epistemology in general. Sometimes yes; usually no.
Sure
1: If a piece of evidence A makes B more likely to be true, then A is not necessarily good evidence for B if B was very improbable to begin with.
For example: If I have several symptoms of an extremely rare disease that I am not at risk of contracting, then I shouldn’t panic and assume I have the disease.
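As a rough worked version of heuristic 1 (a Python sketch with entirely invented numbers, not real medical figures):

```python
# Heuristic 1: a fairly sensitive, fairly specific symptom cluster still
# shouldn't cause panic if the disease is rare enough. Numbers are invented.

base_rate = 1e-5            # P(disease): one in 100,000
p_symptoms_if_sick = 0.9    # P(symptoms | disease)
p_symptoms_if_well = 0.01   # P(symptoms | no disease)

p_sick = (p_symptoms_if_sick * base_rate) / (
    p_symptoms_if_sick * base_rate + p_symptoms_if_well * (1 - base_rate)
)
print(p_sick)  # roughly 0.0009 -- still under a tenth of a percent
```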
Here are some other heuristics
2: If a piece of evidence A makes B less likely to be true, then given A you should be less confident in B.
I know it seems completely obvious but many people do the opposite.
3: Explanations should be penalized for being complex. Given a piece of data A, even if a complex explanation B is more likely to result in A being true than a simple explanation C, B should not necessarily be preferred over C.
This is essentially a special case of 1.
As I understood it, a true Bayesian is an agent that is actually able to apply Bayes' rule in all situations. This requires quantifying uncertainty in ways that are very hard and impractical for humans in almost all situations.
One of those situations where it would be very useful is building a Friendly AI. No human is a true Bayesian, but a Friendly General AI might be. If you pay attention, EY often describes himself as a Bayesian wannabe.
Using fake numbers (based on subjective feelings of uncertainty) is about the best thing I can think of, just because it slows you down and makes you think of conditional probabilities. Like a Fermi calculation it asks a lot of your intuition, but is still better than your gut alone.
David:
“I don’t agree that this implies that numerical belief probabilities are a good way to think about uncertainty or epistemology in general. Sometimes yes; usually no.”
There seem to be a lot of fuzzy belief states like "pretty sure", "really really sure", "actually not that sure", and "vanishingly small chance but might happen". It's important to preserve distinctions among them in order to avoid collapse into a yes/no/maybe logic, but it's also important not to reify them – i.e. it ought to be obvious that they merge into each other gradually rather than there being a sudden jump from "pretty sure" to "really sure".
Numerical beliefs seem to be a really easy way to do this, not to mention having the other advantages I mentioned above regarding things like calibration training, so I'd need to hear a *really* strong argument against them before I thought representing belief strengths numerically (whenever for some reason accuracy is important) is anything but a no-brainer.
There's a famous physicist (unfortunately, I've forgotten which one) who said something along the lines of: "I understand an equation when I can derive the general properties of a solution without solving the equation."
That’s what I mean by qualitative Bayes: once you get used to probabilistic thinking, you rarely have to explicitly use the equations because you are implicitly using the equations, in the form of habits inspired by the equations.
Here’s a short and incomplete list of habits I would include in qualitative Bayes:
1. Base rate attention.
2. Consider alternative hypotheses.
3. Compare hypotheses by likelihood ratios, not likelihoods.
4. Search for experiments with high information content (measured by likelihood ratio) and low cost.
5. Conservation of evidence.
6. Competing values should have some tradeoff between them.
Each one of those is a full post to explain, I think. I also think they're strongly reinforcing; 3 and 4 are listed as separate insights here, but I don't think one is very useful without the other – a rough sketch of how they combine follows below.
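A minimal sketch of habits 3 and 4 working together, in Python; the scenario and every probability are invented purely for the example:

```python
# Habits 3 and 4: compare hypotheses by likelihood ratios, and prefer the
# cheap observation whose likelihood ratio is far from 1. Numbers are made up.

def likelihood_ratio(p_e_given_h1, p_e_given_h2):
    return p_e_given_h1 / p_e_given_h2

# H1 = "the bug is in our code", H2 = "the provider is having an outage".
# Candidate cheap observations, with guessed probabilities of each outcome:
outage_banner_seen = likelihood_ratio(0.05, 0.90)  # P(banner|H1) vs P(banner|H2)
local_tests_fail   = likelihood_ratio(0.60, 0.55)  # P(fail|H1) vs P(fail|H2)

print(outage_banner_seen)  # ~0.06: seeing the banner strongly favors H2
print(local_tests_fail)    # ~1.1: failing tests barely separate the hypotheses
# Both checks are cheap, but only the first carries much information.
```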
Thanks, I much prefer your explanation/examples to mine.
> There’s a famous physicist (unfortunately, I’ve forgotten which one) who said something along the lines of: “I understand an equation when I can derive the general properties of a solution without solving the equation.”
“I consider that I understand an equation when I can predict the properties of its solutions, without actually solving it.”
— Paul A. M. Dirac
Quoted in F Wilczek, B Devine, Longing for the Harmonies
Steve: Thanks! I was 60% sure it was Dirac but didn’t find it easily.
@ Vaniver (August 7, 2013 at 12:29 pm) —
Thank you very much indeed for that list of qualitative habits! It’s really great and I’m really embarrassed. Kaj Sotala gave a similar list in a comment on my blog, and I agreed they are all good things and important, and then apparently I promptly forgot about them. I’ll try not to do that again!
@David: You’re welcome!
David, let me give you a couple of examples of “qualitative Bayes” in action. These are off the top of my head and in no way canonical for the rationalist “community” if such exists.
For me it essentially involves using the *structure* of Bayes theorem to guide qualitative thinking about what to make of evidence.
I usually use it in odds form, so
O(H|E) = O(H)*P(E|H)/P(E|~H)
where E~evidence and H~hypothesis.
A few things that you will notice if you use the structure of this equation as a guide, BUT which people forget all the time if they don’t:
(1) O(H), the prior, matters a lot (cf base rate fallacy)
(2) P(E|~H) is the probability that you would see the evidence you are seeing IF your hypothesis is false. This is a prompt to (a) generate all the remotely plausible alternative hypotheses (thus avoiding false dichotomies), and (b) consider whether the evidence is expected given any of them.
An example I gave in a post I wrote once is in the movie “12 Angry Men”, where 12 jurors are trying to decide if a poor kid is guilty of stabbing his father to death in a city slum. One of the jurors thinks that he is a likely suspect precisely because he is a poor kid from the slums (and crime is prevalent among kids like him).
I want to emphasize that LOTS AND LOTS of people think this is plausible reasoning. And in fact the movie calls this juror out for *bigotry* but not for *bad reasoning*, even though “He’s from the slums” applies to almost every remotely plausible suspect and so does not make the boy’s guilt RELATIVE to them any higher.
Note how even without using the equation, its structure directs our attention to the right places. We feel the boy is probably innocent. Why? Because boys like him almost never commit murder (the prior is low)? Because lots of other people could have left the evidence he did (P(E|~H))? Because it's unlikely that he would have left the evidence seen if he were guilty (P(E|H))?
Without understanding the structure of the equation, you can still talk about these things, but usually with much less clarity about how they fit into a general argument about the boy’s guilt. They just go into buckets labelled “arguments for the prosecution” and “arguments for the defense”.
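To put invented numbers on the slum-juror example (a sketch only; the exact values don't matter, just that P(E|H) and P(E|~H) are nearly equal):

```python
# E = "the suspect is a poor kid from the slums", H = "the boy is guilty".
# Invented numbers: almost every plausible suspect is also from the slums.

prior_odds = 0.5          # O(H), made up
p_e_given_h = 0.95        # if the boy did it, he is from the slums
p_e_given_not_h = 0.90    # but so is nearly anyone else who might have

posterior_odds = prior_odds * p_e_given_h / p_e_given_not_h
print(posterior_odds)     # ~0.53 -- the "evidence" barely moves the odds
```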
This is my general understanding of qualitative Bayes. I do not think it is the end point of epistemology, but I do think it packs a lot of sanity-inducing thought processes into a small package.
“It concedes that you are not certain of any of your beliefs.”
I’m really, really certain that if I stick a knife in my hand, I am going to bleed. I’m 100% certain of that. I don’t know how more certain I can be. And it’s not because I previously stuck a knife in my hand and I bled, so that I’m going on past performance. It’s purely the belief that “If I do this, that will follow”.
Re: St. John’s Wort – but what other kinds of evidence did your father think would be definite proof? The presence of an active agent that could be chemically tested and demonstrated to be present in significant amounts? I mean, if “I asked Jane if taking those supplements helped her and she said ‘yes'” isn’t evidence type evidence, how is “I got a clipboard, advertised for 100 people who took the supplements, asked them if they helped, and 65 said ‘yes'” more evidence type evidence, when it’s still anecdotal/self-reporting? I’m not saying that clinical trials are valueless, but when it comes to things like “are you feeling less depressed”, you can’t really hook someone up to a dolorimeter and say “Yes, your readings have dropped six millipleures since the last test”, unlike a blood glucose reading.
And on the other side, what if a patient says “Sorry, doctor, taking ibuprofen does me no good”? Do you insist “Well, it must work because there is the chemically-tested significant amount of active ingredient present so you are mistaken” or do you accept what he/she says about “Aspirin works but it kills my stomach”?
> I’m really, really certain that if I stick a knife in my hand, I am going to bleed. I’m 100% certain of that. I don’t know how more certain I can be.
You shouldn’t be 100% certain of that. If, for instance, you clamped the arteries leading to your hand, then drained it of blood, and then stuck a knife in it, it would not bleed. If the blade were red-hot, it might cauterize the wounds created before they could bleed. If “stick a knife into my hand” were interpreted as “firmly press the handle into my hand, in an orientation which makes it easy to grasp,” you would not bleed.
More importantly, by being 100% certain, you lose your ability to update on evidence. If you stuck what seemed to be an ordinary knife through your hand, removed it, and didn’t see any blood coming out, what would you want to believe? Under those circumstances, would you really want to still believe that you were, in fact, bleeding?
This all seems silly.
The statement about the knife cutting has an implied “without draining the blood from the area or heating the knife.” If she were to observe otherwise, she’s more likely to assume it is a trick knife than that she doesn’t bleed from an open wound made by sharp objects.
One most certainly can update beliefs that are held with certainty. Or do you posit no one has ever genuinely changed their mind without being a 'Bayesian'? In which case, how does one become one?
I am completely certain my wife has affection for me today. No doubt. I can nonetheless envisage a hypothetical where I go home and get stabbed repeatedly by her. In my dying moments, I would update my belief accordingly, despite my present 100% certainty. Substitute in "earth continues rotating" or "gravity continues functioning", etc.
*Note, I might very well be wrong about my wife, or the earth’s rotation, or what have you. But that doesn’t mean it’s impossible to realize it due to current certainty.
Or were you just saying "You can't update with Bayes' theorem if you are absolutely certain?" i.e., "You can't use Bayes on something you aren't using Bayes on?" Trivially true, but I can use it for some things and not others, and have some items I don't bother updating due to sufficient confidence. My reasoning model for all practical purposes doesn't go to so many significant digits, generally.
No, see, you’re misunderstanding. When you say “I’m certain,” you mean “there is such a tiny probability I’m wrong that I’m not going to bother about it in everyday life.” In the case of “If I cut myself with the knife I’ll bleed,” there are all kinds of ways it could be wrong: the knife is a trick knife, you have an odd medical condition that you don’t know about that prevents you from bleeding, the cut was too shallow, the laws of physics have been temporarily suspended, the knife is a hallucination, you’re actually a brain in a vat somewhere…
Everyone *does* use Bayesian reasoning, more or less. It's just that they don't *realize* they do, and so when reasoning outside of their comfort zone they can fall into traps like Anton-Wilsonism and Aristotelianism.
So admitting the logical possibility of being incorrect means that one is not in fact 100% certain?
I'm certain beyond a reasonable doubt that my daughter is in fact mine. But I'm certain beyond all doubt that my daughter is my wife's – despite the fact that I could write out hypotheticals where she was swapped for another child when I was out of the house at some point, or that she was implanted with someone else's without telling me.
Let’s apply the equation from above.
O(H|E) = O(H)*P(E|H)/P(E|~H)
Hypothesis: My wife loves me.
Evidence: She stabs me repeatedly.
O(H) is very clearly quite close to 1. How do I estimate P(E|H) and P(E|~H)? Both seem incredibly unlikely. The story I tell to justify a conditional probability estimate is necessarily going to take many other factors into account. What’s the probability that she racked up gambling debt and needs to collect insurance? What’s the probability that someone who doesn’t seem capable of hurting a fly would stab me just because she happens to not love me anymore? Plenty of people live in loveless marriages without stabbing their spouse to death.
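A sketch of just how shaky that computation is (Python, every number invented): the conclusion swings from "coin flip" to "nothing changed" depending on which basically-guessed tiny likelihood you write down.

```python
# Sensitivity of the posterior to guessed-at tiny likelihoods. All invented.

prior_odds = 99.0  # O(H): about 99% confident she loves me

for p_e_given_h, p_e_given_not_h in [(1e-6, 1e-4), (1e-5, 1e-4), (1e-4, 1e-4)]:
    posterior_odds = prior_odds * p_e_given_h / p_e_given_not_h
    p = posterior_odds / (1 + posterior_odds)
    print(p_e_given_h, p_e_given_not_h, round(p, 2))
# ~0.50, ~0.91, and ~0.99 respectively -- the guess does all the work
```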
Given that the event is so highly unlikely in your model, it's more probable that you've simply modeled other major features completely incorrectly. That destroys any hope you have of legitimately computing all the millions of details that have to be shoved into each term.
Furthermore, this whole line of reasoning suffers the same fate as Bayesian artificial intelligence. The hard part is not applying some mathematical formula that is formally correct. The hard part is category determination. What is shoved into different categories of hypotheses/evidence? How do I group them together on the fly? How do I split them apart on the fly? This is something that Bayes tells us nothing about. Bayes says, "Once you have sufficiently well-defined categories, this rule is mathematically correct." Well, no wonder Bayesian artificial intelligence works fantastically on massively constrained problems that allow us a convenient set of well-defined categories…
Thank you anonymous. Your argument that Bayesianism is an incomplete epistemology is clear and compelling. Much more convincing than Chapman’s:
“Those that worship at the altar of Bayes think that probability can explain everything but it can’t!”
Alex,
That doesn’t sound like David’s argument. Is that something he actually wrote? If not, it probably shouldn’t be in quotation marks.
I apologize profusely (the lack of an edit button has bitten me in the rear). No, it was a paraphrase. (I use block quotes for actual quotations, but that is no excuse.)
Yes! This is the most important aspect of the critique, I think.
I’m writing a blog post about this now—although I’m putting it positively: what can we say about choosing a vocabulary within which to frame a problem so that suitable formal methods can be applied?
Answers to that are necessarily informal and heuristic, but I think very important.
Steve, that’s a special case. In that case, if I did clamp my arteries and heat the knife to cauterise the wound and all the rest of it, then I would change my estimation of the likelihood of bleeding. The fact remains, unless I act to reduce or eliminate the bleeding that will follow from sticking a knife in my hand, I will bleed if I stick a knife in my hand.
So let us say I have a Bayesian, an Aristotelian, and an Anton-Wilsonian all lined up. And I have a sharp knife. And I say "I am going to stick this knife into your hands, one by one. It is not a trick knife. I will not be putting tourniquets on you. This is not a drug-induced hallucination. So far as I know, none of you are brains in jars. Do you think that you will bleed if I do so?"
It seems to me that:
(A) our Aristotelian, by the arguments I am seeing here as to how they operate, will say "I will bleed"
(B) our Anton-Wilsonian, by the arguments I am seeing as to what their errors or ways of operating are, will say "I cannot be sure one way or the other; I may think I am bleeding, I may bleed in reality, this may be a hallucination despite what you say, and if I think hard enough, by mental effort I can change reality so I do not bleed when stuck with a knife."
(C) our Bayesian will say “The probability is very great that I will bleed, if the conditions you impose are true.”
Now, it appears to me that the Bayesian and the Aristotelian have more in common with each other (for instance, they both seem to accept that this is a real knife, that humans generally bleed when injured, that nobody here has special powers, and that no special precautions are being taken to prevent bleeding, etc.) and I don't really see what practical difference it makes (if we accept the "You don't have to write out the mathematical equation governing the trajectory of a basketball to make the shot" explanation put forward in various comments) which method we use to decide "Will I or will I not bleed if I am stuck with a knife?"
Now, if you are going to argue "But one method may be better than another", then (a) you really do need to be able to understand the mathematics of making a basketball throw, and (b) isn't saying one method is better just as Aristotelian?
Deiseach, the virtue of Bayesianism over Aristotelianism shows up when you are trying to determine what is true.
As such, something everyone (no matter whether they’re Aristotelian or Bayesian) is really certain about like “yes, you will bleed”, is really a bad example of trying to show the virtues of Bayesian thought — it’s the difference between being 100% and 99.9999999999999999% certain. Not relevant for practical purposes.
Find a belief that people *actually disagree about*. Then you’ll see some people (Aristotelians) have just three settings (“proven” “disproven” “uncertain”), and other people (Bayesians) treat the ranges of uncertainties quantitatively or semi-quantitatively — understanding the difference between 90% certain, 70% certain and 1% certain.
I’ve encountered Aristotelianism in action quite recently (when I was discussing the Zimmerman-Martin case) so I’m glad that you’ve made this post. I now have something to link to when I want to explain to other people why they shouldn’t treat uncertainties as if they’re complete unknowables.
It annoys me when my boss says, "We can't put someone on mental health watch because of a probability." Actually, that's exactly why we put people on mental health watch (the equivalent of committing them).
“(my go-to grammatical example is answering the phone “Scott? Yes, this is him.”)”
… I don’t get it.
“doctors are so stupid, they don’t understand anything”??? ME???
No way. System 2 is (generally) so stupid, it doesn’t understand anything. System 1 understands all sorts of things, it just doesn’t care much what system 2 wants or what system 2 is thinking and certainly doesn’t tell system 2 anything like the full picture. Doctors are, on average, 95th percentile smart, WRT the general population’s standards for system 2 performance, marginally better than George W Bush (pre-recentering SAT scores here http://www.insidepolitics.org/heard/heard32300.html).
According to table 3 here,
http://www.udel.edu/educ/gottfredson/reprints/1997whygmatters.pdf
This means that they are 50% likely to know what the word travesty means (in the sense of getting at least partial credit), and in most cases can be told not to plagiarize because the majority of the time they will at least partially understand what that word means. They almost all have ‘compassion’ (in their vocabulary) and should be able to figure out what tranquilizers do, since they know the word ‘tranquil’. According to figure 2 and table 8, however, only a minority of doctors are likely to be able to read prose well enough to “compare approaches stated in a narrative on growing up”, “summarize two ways lawyers may challenge prospective jurors (based on a passage describing such practices)” or “interpret a brief phrase from a lengthy news article”, or manipulate documents well enough to “use information in table to complete a graph including labelling axes”, “use table comparing credit cards. Identify the two categories used and write two differences between them” or “using a table depicting information about parental involvement in school survey to write a paragraph summarizing extent to which parents and teachers agree”. It’s also uncommon for their system 2 to be able to complete quantitative test items such as “determine shipping and total costs on an order form for items in a catalog”, “using information in news article, calculate differences in times for completing a race” or “using a calculator, determine the total cost of carpet to cover a room”.
Bravo though on having at least one psychiatrist in your hospital who can, when primed by you, recall the basics of probability theory.
Yes, you.
Since replying directly to Steve’s post doesn’t seem to be working:
If the odds that memory is roughly as reliable as human intuition says it is are below 0.5, there is still a problem.
Of course, you haven’t addressed the induction question.
Another example of the Aristotelean epistemology in practice: http://www.futilitycloset.com/2013/08/01/fish-story-3/
This is just odd to look at from a Bayesian perspective. The reasoning presented is so obviously wrong it’s almost remarkable that anyone could ever have presented it seriously.
On Hume’s point, I agree. However, when it comes to Wallace, it is rhetoric along the lines of the Proving Too Much Concept and doesn’t seem so bad.
In addition, come to think of it, something I missed earlier – verbal reasoning can help to show a concept is internally incoherent. Take a discussion of free will and how to define it, or how to define a soul.
Scott, this is completely off-topic and not at all appropriate to discuss in this thread, but I have no-one else to rant to and it’s 3.50 a.m. in the morning here in Ireland.
Why I am about ready to start strangling doctors: a rant. Please excuse the excessive amount of swearing that will happen. Also, why Venlafaxine is fucking evil (and I only heard of the damn thing four days ago).
Scott, you’re a psychiatrist or specialising in that, if I’m correct. Training/working in a hospital right now? Are you aware your patients don’t tell you everything?
Let me rephrase that: you may think you are aware your patients don’t tell you everything but you’re really not aware of it.
My sister is down home with me for the week. Four days in and she is wired to the fucking moon. She’s on antidepressants (the aforesaid venlafaxine) on the infamous merry-go-round of “we’ll put you on these until they stop working and then switch you to another drug until that stops being effective and keep on for infinity like this”. She is either not telling her doctor or is unaware because she hasn’t damn well been told of the side-effects.
As in manic episodes (okay, hypomania). See the “wired to the moon” statement above. She is also taking unprescribed medication (her husband’s one-off prescription of diazepam) on top of the venlafaxine to calm herself down at night because she has palpitations and racing heart and jitters and tremors and all that fun stuff.
She is also drinking on top of that, and I only became aware of how much she’s drinking now she’s under my eye. Oh, lest I forget: alcoholism is A Thing in our family.
And I’m damn sure she’s not telling her GP and therapist any of this.
And I’m damn sure they never told her the side-effects of venlafaxine because, until she told me her symptoms and I got on Google for her to look it up, she had No. Fucking. Idea. that one cause of her resting heart rate being 103 bpm might be a drug interaction or side effect (she thought it was just stress).
So the moral of this little story (beside the fact that I am ready right now to rush out and form a pitchfork wielding, torch bearing mob to burn down the entire Irish medical system) is that YOUR PATIENTS ARE NOT TELLING YOU EVERYTHING AND PLEASE, PLEASE, PLEASE BE DAMN SURE TO DIG IN AND GET A THOROUGH MEDICAL HISTORY – INCLUDING FAMILY TENDENCIES TO BE MENTAL (ALL ONE SIDE OF THE FAMILY) AND ALCOHOLICS (BOTH SIDES) – BEFORE PRESCRIBING GODDAMN PILLS.
Thank you, goodnight, and apologies for the yelling and swearing but as I said, I am ready to massacre doctors and go back to illiterate old biddies living in hovels dosing the villagers with St John’s Wort.
For years I've been extremely sceptical of Yudkowsky's claim that Bayesianism is some sort of ultimate foundation of rationality. Analogical inference is my suggested alternative epistemology.
I think David got to the heart of the matter when he pointed out that Bayes cannot define the space of hypotheses in the first place; it only works once a set of pre-defined concepts is assumed. As David states:
“The universe doesn’t come pre-parsed with those. Choosing the vocabulary in which to formulate evidence, hypotheses, and actions is most of the work of understanding something.”
Exactly so!
This is the task of knowledge representation, or categorization, which is closely related to the generation of analogies, and is PRIOR to any probabilistic calculations. Now it may turn out to be the case that these things can be entirely defined in Bayesian terms, but there is no reason for believing this, and every reason for disbelieving it. Some years ago, on a list called the everything-list, I argued the case against Bayes and suggested that analogical inference may turn out to be a more general framework for science, of which Bayes will only be a special case.
Here’s the link to my arguments:
https://groups.google.com/forum/#!topic/everything-list/AhDfBxh2E_c
In my summing up, I listed ‘5 big problems with Bayes’ and pointed out some preliminary evidence that my suggested alternative (analogical inference) might be able to solve these problems. Here was my summary:
(1) Bayes can't handle mathematical reasoning, and especially, it can't deal with Godel undecidables.
(2) Bayes has a problem of different priors and models.
(3) Formalizations of Occam's razor are uncomputable, and approximations don't scale.
(4) Most of the work of science is knowledge representation, not prediction, and knowledge representation is primary to prediction.
(5) The type of pure math that Bayesian inference resembles (functions/relations) is lower down the math hierarchy than that of analogical inference (categories).
For each point, there's some evidence that analogical inference *can* handle the problem:
(1) Analogical reasoning can engage in mathematical reasoning and bypass Godel (see Hofstadter: Godelian reasoning is analogical).
(2) Analogical reasoning can produce priors, by biasing the mind in the right direction by generating categories which simplify (see analogy as categorization).
(3) Analogical reasoning does not depend on huge amounts of data, so it does not suffer from uncomputability.
(4) Analogical reasoning naturally deals with knowledge representation (analogies are categories).
(5) The fact that analogical reasoning closely resembles category theory, the deepest form of math, suggests it's the deepest form of inference.
“I think David got to the heart of the matter when he pointed out that Bayes cannot define the space of hypotheses in the first place, it only works once a set of pre-defined concepts is assumed.”
This sounds to me like “Evolution can’t explain how the Earth formed – and yet you call yourself an evolutionist!”
Some theories don’t explain everything, but still do what it says on the tin very well.
It’s important to distinguish Bayesianism as a theory of probability, as a theory of rationality, and as a theory of epistemology.
As a theory of probability, one can quibble, but basically it’s dandy.
My objection is to treating it as a theory of rationality or epistemology, because it just doesn’t do those things.
“Evolution can’t explain how the Earth formed – and yet you call yourself an evolutionist” is a false analogy because evolution isn’t meant to do that. But a theory of rationality or epistemology must do many things a theory of probability can’t.
A better analogy would be “You have no clue how a car engine works, you just know how to use a wrench—and you call yourself an auto mechanic!”
If to you “a theory of epistemology” means it has to completely explain every epistemological question without any help, then I agree Bayes is not a theory of epistemology and I’m not sure such a theory is possible. Certainly Aristotelianism, which everyone always calls an attempt at a theory of epistemology, doesn’t even try explaining things like how one generates ideas.
To keep the evolution analogy, evolution doesn’t solve all of biology – you can’t explain how the kidney concentrates urine just by understanding the principles of natural selection – but it’s a very central theory that most other theories will connect with, more fundamental than something very specific like the theory that HIV causes AIDS. All this talk of “fundamental theory” versus “nonfundamental theory” seems kind of silly and it’s hard for me to remember what we’re debating since no one claims Bayes explains everything and no one claims Bayes explains nothing, but since you seem pretty sure we’re debating something that would be my contribution to whatever it is we’re debating.
Oh, and the correct use of your analogy seems to be something more like “You don’t know how a car engine works, you only know how to use a wrench – and yet you call yourself a Wrenchist and say that wrenches are really important for mechanics and that everyone should learn how to use one and that if you don’t understand wrenches you are unlikely to repair cars optimally.”
Many in the “Less Wrong” community promote the Bayesian framework as the theoretical ideal for epistemology, that is to say, they claim that probabilistic reasoning can do everything that can be done in the domain ‘epistemology’.
The point being made here, Scott, is that there appear to be some very serious gaps in the Bayesian framework which haven't yet been filled in, suggesting that the Bayesian framework might turn out not to live up to its billing.
It’s been pointed out that knowledge representation appears to be doing most of the work in science, not mere prediction.
So if knowledge representation/categorization turns out not to be reducible to probabilistic reasoning, this would mean that the Bayesian framework does not live up to its “Less Wrong” billing.
“It’s been pointed out that knowledge representation appears to be doing most of the work in science, not mere prediction.”
If by ‘it’s been pointed out’ you mean ‘it’s been pointed out in the comments here’, then I must say that a quick search shows me only your comments promoting that idea. KR is important but it does not do ‘most of the work in science’ – it is just one of the (many) prerequisites and one of the many things that can bias the results.
@ Scott — I agree that a complete theory of epistemology is unlikely!
I also agree that we've taken the discussion as far as we can in the absence of either (a) an actual critique of Bayesianism or (b) a sketch of what a broader epistemology might look like.
I’m working on (b). I don’t have much time, so instead of being a systematic account, it will be a collection of anecdotes about experiences in auto repair, so to speak. I’ll talk about ways that different tools were useful in solving particular problems.
Maybe after I’ve posted that, your reaction will be “but Bayes is still most of what you need!” in which case I will politely differ.
Or maybe you’ll say “oh, I see—we were just talking about different things—my interest is strictly in adjusting belief strength, not in epistemology broadly, so Bayes is all I care about; but I can see why you’d find other tools interesting.”
Or maybe you’ll say “Oh! Now I get it! We do need screwdrivers and pliers and many other tools, and they are no more or less important than wrenches. You have to use several to get any major job done.”
There actually aren’t any Wrenchists, because everyone would agree that knowing how to use a wrench is indispensable, so that’s uncontroversial, but also learning to use a wrench isn’t terribly difficult, and by itself it isn’t terribly useful. I feel the same way about Bayes.
@ Tenoke — I think Marc may have been referring to my saying:
That was in a discussion with Kaj Sotala on my blog.
@David: “There actually aren’t any Wrenchists, because everyone would agree that knowing how to use a wrench is indispensable, so that’s uncontroversial, but also learning to use a wrench isn’t terribly difficult, and by itself it isn’t terribly useful.”
Wait wait wait wait wait what?
My guess is maybe less than 1% of humans and less than 10% of people who consider themselves epistemologists/philosophers of knowledge/et cetera have ever even heard of Bayes. Does that sound wrong to you? Like, if you believe using a wrench is "uncontroversially" "indispensable", and that only a tiny subset of mechanics has any idea wrenches exist let alone how to use them, aren't you positing CFAR et cetera are hugely important, probably even more important than I would be willing to?
Also, it’s possible we have different definitions for “epistemology”. You say above that “‘Epistemology’ is figuring out what is true or useful.”
If that’s true, I agree Bayes is only a small part of the picture (albeit an important part, something you seem to agree with).
However, I define epistemology more theoretically, something like “the study of what the heck knowledge even means, and whether it is possible in theory to obtain it”. If that’s true, do you agree that Bayes and the idea of probabilistic knowledge become a much bigger part of the “solution” to “epistemology”?
In my original post, I wrote:
So yeah, it’s possible that I do think CFAR’s mission is more important than you do! I agree that probably <1% of people are able to apply basic probability theory, and that's a disaster.
It's likely that differing ideas about the scope of "rationality" and "epistemology" are part of the confusion and apparent disagreement here. We could hash out different definitions, but I'm not sure how productive that would be.
Maybe as an alternative you could check out my follow-up post which is about a broader conception of rationality/epistemology, and see what you think.
Analogical reasoning seems to be implicit in the way many people write Bayes:
P(H | E & B) = [P(E | H & B) x P(H | B)] / P(E | B)
B is background knowledge, meaning that analogies/categories can enter through the background knowledge and so carry weight in the resulting probabilities.
Also, to me this seems to be waffling between how we *should* reason and how we *actually* reason. Most people intuitively reason using analogies, but if intuition were the be-all end-all of rationality, then we would have no use for other formal methods.
Also, analogical reasoning has its flaws, e.g. Zeno’s Paradoxes, fallacy of composition, etc. How do you determine what’s a valid analogy and what’s not? Electrons being thought of as little balls and light being thought of as a wave are themselves analogies but both are more wrong than the current mathematical representation of quantum behavior.
As far as people thinking in black and white goes, I think global warming is the perfect example.
If you follow the IPCC report, the correct belief in global warming is somewhere between p=0.9 and p=0.99.
At the same time we have people walking around claiming that the evidence for global warming is similar to the evidence we have for evolution.
Our evidence for evolution is better than p=0.99. The only way you can believe that the evidence for both claims is the same is through black and white thinking.
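A quick sketch of why collapsing those two numbers together loses real information (Python; 0.9 and 0.99 are the figures quoted above, and 0.999 is just an illustrative stand-in for "better than 0.99"):

```python
import math

# Converting probabilities to odds shows the gap between 0.9 and 0.99.
for p in [0.9, 0.99, 0.999]:
    odds = p / (1 - p)
    print(p, round(odds), round(math.log10(odds), 1))
# 0.9   ->   9:1 odds (~1 order of magnitude)
# 0.99  ->  99:1 odds (~2 orders of magnitude)
# 0.999 -> 999:1 odds (~3 orders of magnitude)
```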
I’ve now posted a follow-up piece, “How To Think Real Good.” It’s a stab at answering: “what is rationality if not Bayesianism?” Or more accurately, “what is rationality besides Bayesianism,” because there’s nothing wrong with Bayes as such.
The post is a very crude map of a broader world: how to think effectively and accurately. (“Rationality,” in other words.) Bayes is one island in that archipelago; the post marks the approximate locations, and vague shapes, of some others.
I hope this will be interesting and useful to the LW community. Unfortunately, I was writing in a hurry, to make a timely response to Scott's post here, so it's much less thought-through, and much longer, than I would like.
In any case, comments are very welcome!
“You can take any position in any argument and accuse the proponents of believing it fanatically. And then you’re done. There’s no good standard for fanaticism. Some people want to end the war in Afghanistan? Simply call them “anti-war fanatics”. You don’t have to prove anything, and even if the anti-war crowd object, they’re now stuck objecting to the “fanatic” label rather than giving arguments against the war.
(if a candidate is stuck arguing “I’m not a child molester”, then he has already lost the election, whether or not he manages to convince the electorate of his probable innocence)
And then when the war goes bad and hindsight bias tells us it was a terrible idea all along, you can just say “Yes, people like me were happy to acknowledge the excellent arguments about the war. It was just you guys being fanatics about it all the time which turned everyone else off.””
Scott, are you ever going to explain your Deep Insights into this sort of thing? You’ve been talking about the plight of us low-social-skills … people a lot recently. Any chance you’ll ever share these advanced social skills you learned during your Five Thousand Years?
Obviously, some of these techniques may be evil. But you have unparalleled access to the rationalist community, so I’m guessing the instrumental value could be high.
Oh, and I’m crazy curious, of course. Hmm, I think I’ll ask this on a few posts in the hope it’ll be seen.
I absolutely agree. If I have to add something to the discussion, people just don’t seem to discern some easy examples:
– Difference between belief based on evidence (e.g. the sun will rise tomorrow) and "belief" as "hope". There are people going around saying stuff like "I was so lucky today, I should buy some lotto tickets. I believe I will win." Or "Tim, I know you did not do well the whole year. But I believe that this time you will make it. Go for it, champ!"
– Difference between correctly predicting a single outcome and correctly predicting how likely an outcome is. I seriously saw the following argument by a person regarding sports results: "You predicted that team A is the underdog and will probably lose any of the next 6 matches. You therefore have to believe that they go 0:6."
I’m not sure I can rely on instinct for planning my career. I have so little feedback about the success/failure of career decisions. Can’t I play the career simulator 1000 times first?
I enjoyed hearing a defense of Bayesianism. I also appreciated your delineation of people’s surface cognizance of statistical beliefs without necessarily being able to incorporate them into a deep understanding of their own beliefs.
I find this fascinating and it brings several questions to mind.
1) How possible is it for us to grasp multiple, conflicting models? For a binary example: could a person both fully grasp and understand the universe from a religious perspective and from an atheist perspective? The impetus for this question is that if I ascribe x% probability to model A, and x is not 100, then I must, in order to truly conceive of probability x, be able to conceive of not only model A but at least one model in ~A. I believe this to be superior (in terms of being further along developmentally) to a surface concept of belief based on Bayesian updating. I say this because, while I find plenty of people expressing levels of certainty, their concepts and the structure of their beliefs appear completely dependent on the model to which they ascribe a >50% probability. (Though I recognize that this is anecdotal.)
2) Does Bayesianism provide a sufficiently robust epistemic model? While it allows us to ascribe probabilistic reasoning to a belief already conceived of, it seems to suffer still from overconfidence. We often fail to evaluate our meta-beliefs when assigning probabilities. For example: we will often preconceive a dichotomy, then ascribe either-or probability to the model (i.e. our constraints for ease of use lead to large fundamental gaps in our aggregated belief systems). To make that example more concrete, look back at the religion debate. By asking, “does God exist,” we are implicitly restricting the field of metaphysical possibilities: seemingly breaking it into two regions. However, given our epistemic limitations, does that not seem presumptuous? Can we ever reasonably do this?
3) As a segue, by robust, I meant is it an epistemic model that fits our needs. In that same vein, is a quantitative assessment of model adoption universally applicable? Can we decide which model to adopt simply through an assessment of how likely we think it is in all cases? Can we in any case?
In conclusion, I think that to serve as an epistemology Bayesianism may need to be fleshed out a bit.