Predictions For 2019

At the beginning of every year, I make predictions. At the end of every year, I score them. So here are a hundred more for 2019.

Rules: all predictions are about what will be true on January 1, 2020. Any that involve polling will be settled by the top poll or average of polls on Real Clear Politics on that day. Most predictions about my personal life, or that refer to the personal lives of other people, have been redacted to protect their privacy. I’m using the full 0 – 100 range in making predictions this year, but they’ll be flipped and judged as 50 – 100 in the rating stage, just like in previous years. I’ve tried to avoid doing specific research or looking at prediction markets when I made these, though for some of them I already knew what the markets said.

Feel free to get in a big fight over whether 50% predictions are meaningful.

US
1. Donald Trump remains President: 90%
2. Donald Trump is impeached by the House: 40%
3. Kamala Harris leads the Democratic field: 20%
4. Bernie Sanders leads the Democratic field: 20%
5. Joe Biden leads the Democratic field: 20%
6. Beto O’Rourke leads the Democratic field: 20%
7. Trump is still leading in prediction markets to be Republican nominee: 70%
8. Polls show more people support the leading Democrat than the leading Republican: 80%
9. Trump’s approval rating below 50: 90%
10. Trump’s approval rating below 40: 50%
11. Current government shutdown ends before Feb 1: 40%
12. Current government shutdown ends before Mar 1: 80%
13. Current government shutdown ends before Apr 1: 95%
14. Trump gets at least half the wall funding he wants from current shutdown: 20%
15. Ginsburg still alive: 50%

ECON AND TECH
16. Bitcoin above 1000: 90%
17. Bitcoin above 3000: 50%
18. Bitcoin above 5000: 20%
19. Bitcoin above Ethereum: 95%
20. Dow above current value of 25000: 80%
21. SpaceX successfully launches and returns crewed spacecraft: 90%
22. SpaceX Starship reaches orbit: 10%
23. No city where a member of the general public can ride self-driving car without attendant: 90%
24. I can buy an Impossible Burger at a grocery store within a 30 minute walk from my house: 70%
25. Pregabalin successfully goes generic and costs less than $100/month on GoodRx.com: 50%
26. No further CRISPR-edited babies born: 80%

WORLD
27. Britain out of EU: 60%
28. Britain holds second Brexit referendum: 20%
29. No other EU country announces plan to leave: 80%
30. China does not manage to avert economic crisis (subjective): 50%
31. Xi still in power: 95%
32. MbS still in power: 95%
33. May still in power: 70%
34. Nothing more embarrassing than Vigano memo happens to Pope Francis: 80%

SURVEY
35. …finds birth order effect is significantly affected by age gap: 40%
36. …finds fluoxetine has significantly less discontinuation issues than average: 60%
37. …finds STEM jobs do not have significantly more perceived gender bias than non-STEM: 60%
38. …finds gender-essentialism vs. food-essentialism correlation greater than 0.075: 30%

PERSONAL
39. SSC gets fewer hits than last year: 70%
40. I finish and post [redacted]: 90%
41. I finish and post [redacted 2]: 50%
42. I finish and post [redacted 3]: 50%
43. [redacted 1] post gets at least 40,000 hits: 40%
44. [redacted 2] post gets at least 40,000 hits: 20%
45. New co-blogger with more than 3 posts: 20%
46. Repeat adversarial collaboration contest with at least 5 entries: 60%
47. [redacted]: 90%
48. [redacted]: 70%
49. I start using Twitter again (5+ tweets in any month): 60%
50. I start using Facebook again (following at least 5 people): 30%
51. I get the blood tests I should be getting this year: 90%
52. I try one biohacking project per month x at least 10 months: 30%
53. I continue taking sceletium regularly: 70%
54. I switch from [redacted] for at least 3 months: 20%
55. I find at least one new supplement I take or expect to take regularly x 3 months: 20%
56. Minoxidil use produces obvious progress: 50%
57. I restart [redacted]: 20%
58. I spend one month at least substantially more vegetarian than my current compromise: 20%
59. I spend one month at least substantially less vegetarian than my current compromise: 30%
60. I weigh more than 195 lbs at year end: 80%
61. I meditate at least 30 minutes/day more than half of days this year: 30%
62. I use marijuana at least once this year: 20%
63. I finish at least 10% more of [redacted]: 20%
64. I completely finish [redacted]: 10%
65. I finish and post [redacted]: 5%
66. I write at least ten pages of something I intend to turn into a full-length book this year: 20%
67. I practice calligraphy at least seven days in the last quarter of 2019: 40%
68. I finish at least one page of the [redacted] calligraphy project this year: 30%
69. I finish the entire [redacted] calligraphy project this year: 10%
70. I finish some other at-least-one-page calligraphy project this year: 80%
71. I attend the APA Meeting: 80%
72. [redacted]: 50%
73. [redacted]: 40%
74. I still work in SF with no plans to leave it: 60%
75. I still only do telepsychiatry one day with no plans to increase it: 60%
76. I still work the current number of hours per week: 60%
77. I have not started (= formally see first patient) my own practice: 80%
78. I lease another version of the same car I have now: 90%
79. I still live in my current house with no specific plans to leave: 80%
80. I set up a decent home library: 60%
81. We have obtained a second trash can: 90%
82. The gate is fixed with no problems at all: 50%
83. The ugly paint spot on my wall gets fixed: 30%
84. There is some kind of nice garden: 60%
85. …and I am at least half responsible: 20%
86. I get my own washing machine: 20%
87. [redacted]: 60%
88. [redacted]: 70%
89. [redacted]: 80%
90. [redacted]: 80%
91. [redacted] is widely considered a success: 70%
92. …with plans (vague okay) to create a second [redacted]: 20%
93. I find a primary partner: 30%
94. I go on at least one date with someone who doesn’t already have a primary partner: 90%
95. I remake an account on OKCupid: 80%
96. [redacted]: 10%
97. [redacted]: 20%
98. [redacted]: 20%
99. [redacted]: 20%
100. [redacted]: 20%
101. [redacted]: 30%
102. [redacted]: 10%
103. [redacted]: 30%
104. [redacted]: 50%
105. [redacted]: 10%
106. [redacted]: 50%
107. I am still playing D&D: 60%
108. I go on a trip to Guatemala: 90%
109. I go on at least one other international trip: 30%
110. I go to at least one Solstice outside the Bay: 40%
111. I go to at least one city just for an SSC meetup: 30%
112. [redacted]: 40%
113. [redacted]: 50%
114. [redacted]: 20%
115. [redacted]: 80%
116. [redacted]: 60%
117. [redacted]: 60%
118. [redacted]: 80%


97 Responses to Predictions For 2019

  1. dividebyzeroes says:

    Scott, regarding prediction #56 regarding Minoxidil use: I hope it does work for you, although it did not for me. A quick heads-up about a possible side effect: when I was using it, I had frequent occurrences of rapid and sometimes irregular heartbeat as well as a pretty constant low-grade chest pain. I had not bothered to read about possible side effects, and did not realize at the time that there was any connection between my Minoxidil use and the cardiac side effects, which were frequent and pronounced enough to honestly scare the hell out of me. I ended up seeing a cardiologist multiple times, wearing a Holter monitor for a month, having a stress ECG test, etc. The cardiac symptoms all resolved soon after I stopped using Minoxidil (I didn’t make that connection at the time either), and it was not until later that I happened to read about angina and rapid / irregular heartbeats being known side effects. So, just wanted to let you know not to panic as I did if you experience these symptoms.

  2. Don_Flamingo says:

    @ScottAlexander
    re:61
    So, you’ve been reviewing all those meditation books and (unwillingly, I imagine) hosted a flamewar between two enlightened beings.
    Why not a more specific goal, like become enlightened by month x/y/z/not this year or reach stage a,b,c in TMI?
    What are you using for your personal meditation practice, if you don’t mind sharing?

  3. Don_Flamingo says:

    52. I try one biohacking project per month x at least 10 months: 30%

    I don’t know how to parse this statement. I’m gonna give it 15% (illegibility penalty).

  4. Don_Flamingo says:

    13.5:
    “Trump pretends to yield and lift the government shutdown on April First.
    Then he says, it was an April fools prank and the shutdown lasts at least another month.”
    0.25%

  5. 420BootyWizard says:

    Bold claim on [redacted].

  6. akarlin says:

    I got tired of doing predictions and stopped doing them (in this format) as of this year. However, thanks to Scott for introducing me to the format!

    Comments without having read any other replies:

    All pretty much agreed, except:
    Kamala Harris leads the Democratic field: 20%. I have a feeling she’ll take it (the nomination, and the Presidency). So I’ll make that 30%.
    Bitcoin above 5000: 20%. More like 40%-50%. BTC seems to be cyclical, surging about once every two years, as the last round’s hype is forgotten. I suspect it will start inclining upwards again towards the end of 2019.
    20. Dow above current value of 25000: 80%. Optimistic prediction, what can I say. I’d give it 30%.
    33. May still in power: 70%. More like 40%, at best.
    66. I write at least ten pages of something I intend to turn into a full-length book this year: 20%. Good luck!

  7. MartMart says:

    I think that 3-6 is hacking the system. Not the actual prediction (that 4 of the leading contenders have a 20% chance of securing the nomination) but the way they are stated.
    If I understand this exercise correctly, the idea is to make predictions with confidence levels, then at the end of a time period group those by confidence levels, and try to get a percentage of each group correct that is equal to the confidence level. In other words, ideally 20% of the 20% confidence group will be correct. The point of this exercise is to practice assigning proper confidence to predictions. Hope I’m right about all this.
    Because of this, I don’t think that 50% confidence level predictions are meaningless. If I make several along the lines of “the sky will be blue, the sun will rise tomorrow” and assign a confidence of 50%, I will end up with near 100% correct and get a terrible calibration score.
    However, if I make the same obvious predictions, and also a reverse for each of them (50% that the sky will be green, 50% that the sun will not rise, etc.) I will have a perfect 50% record (just as I was trying to get), and I’ll get it no matter how weird the world decides to get, assuming each pair covers all possibilities between them.
    I can do the same with other numbers lower than 50. If I assign 20% to the sky tomorrow being blue, green, purple, yellow in separate predictions, I’ll have a perfect record no matter what happens to the sky tomorrow.
    Since the number of predictions in each group is necessarily small, padding each group with these multiples will bring up my overall score.
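
The padding effect described in this comment can be checked with a few lines of Python. This is only an illustrative sketch: the function name `hit_rate` and the sample buckets are made up for the example and are not Scott's scoring method. For the 20% case it assumes five options that are mutually exclusive and exhaustive, so that exactly one comes true.

```python
from fractions import Fraction

def hit_rate(outcomes):
    """Fraction of predictions in one confidence bucket that came true."""
    return Fraction(sum(outcomes), len(outcomes))

# Hypothetical 50% bucket: three "obvious" predictions that all come true.
honest_bucket = [True, True, True]

# Same bucket padded with the complement of each prediction.
# Exactly one of each (prediction, complement) pair comes true.
padded_bucket = honest_bucket + [not outcome for outcome in honest_bucket]

print(hit_rate(honest_bucket))  # 1   -> badly miscalibrated for a 50% bucket
print(hit_rate(padded_bucket))  # 1/2 -> looks perfect whatever happens

# Five mutually exclusive, exhaustive 20% predictions: exactly one is true,
# so the bucket scores 1/5 no matter which option wins.
for winner in range(5):
    bucket = [i == winner for i in range(5)]
    assert hit_rate(bucket) == Fraction(1, 5)
```

The padded bucket scores the same for every possible outcome, which is why it carries no information about the forecaster's calibration.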

  8. monistowl says:

    [DATA EXPUNGED]

  9. matthewravery says:

    As long as we’re still talking about forecasting, I went back and analyzed Scott’s performance over the past five years using Brier scores, which include a useful decomposition of prediction accuracy into (1) the total underlying uncertainty in the events, (2) the overall calibration of the forecaster, and (3) how good the forecaster was at assigning high probabilities to events that happened.

    This seems to me to bridge some of the disagreements in discussions on this board about whether predictions should just get the overall probabilities correct or focus on getting things “correct” in some sense. The gist of my results is:

    YMMV, but my reading of this is that 2018 was an odd year for Scott. He chose a set of events that had substantially more variation than in the past, and while he did well at discriminating, he was also more miscalibrated than he’s been in the past. This led to a much higher Brier score than he usually sees. Prior to that, his aggregate Brier score had been stable despite quite a bit of variation in the components.

    I can’t include pictures in the comments, but if you want to see more, here’s a link.
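
For readers who want to try a three-part split like the one this comment describes, here is a minimal sketch of the standard Murphy decomposition (Brier = reliability − resolution + uncertainty). Whether this is the exact form used in the analysis above is an assumption, and the function name and the sample forecasts below are invented for illustration; they are not Scott's data.

```python
from collections import defaultdict

def brier_decomposition(forecasts, outcomes):
    """Murphy decomposition: Brier = reliability - resolution + uncertainty.

    forecasts: stated probabilities in [0, 1]; outcomes: 0/1 results.
    Forecasts are grouped by their exact stated probability.
    """
    n = len(forecasts)
    base_rate = sum(outcomes) / n
    groups = defaultdict(list)
    for p, o in zip(forecasts, outcomes):
        groups[p].append(o)

    # Reliability: how far each bucket's hit rate is from its stated probability.
    reliability = sum(len(os) * (p - sum(os) / len(os)) ** 2
                      for p, os in groups.items()) / n
    # Resolution: how far each bucket's hit rate is from the overall base rate.
    resolution = sum(len(os) * (sum(os) / len(os) - base_rate) ** 2
                     for os in groups.values()) / n
    # Uncertainty: variance of the outcomes themselves.
    uncertainty = base_rate * (1 - base_rate)
    brier = sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / n
    return brier, reliability, resolution, uncertainty

# Made-up example data, purely for illustration.
forecasts = [0.9] * 4 + [0.6] * 4 + [0.2] * 2
outcomes = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0]
brier, rel, res, unc = brier_decomposition(forecasts, outcomes)
print(brier, rel, res, unc)
```

Lower reliability means better calibration, higher resolution means better discrimination, and uncertainty depends only on the events chosen, not on the forecaster.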

  10. joncb says:

    Just to confirm something… by “impeached” you refer to “commencing proceedings” not “removal from office” right?

    Otherwise the first two probabilities don’t make sense to me. (Maybe I’m looking at it wrong?)

    • dodrian says:

      The House votes on whether or not to impeach the President. Then the Senate votes on whether or not to remove the President from office.

      President Bill Clinton was impeached by the House, but he was not removed from office by the Senate. It seems likely a similar situation may occur.

  11. fivemack says:

    Why Guatemala?

    • inedibill says:

      I can’t speak for Scott, but speaking for myself, I’ve traveled to (almost) every Central American country, and Guatemala has been my favorite. Lake Atitlan, in particular, is unlike any sight I’ve seen. I took a friend with me to Atitlan last year, and he declared it “the most beautiful view he had seen,” which implied that (in his mind, at that moment) it had beat out all sights from the ~20 European countries he had visited previously.

      Also: in my experience the locals are still happy to meet tourists, which also separates it from CA countries like Nicaragua. Also: you can rent a nice room on Lake Atitlan for $50/night. There’s a lot to like.

      • fivemack says:

        I’ve been there too, the volcanoes are stunning, Atitlan is really very amazing, and Tikal is astounding even by comparison to Rome or Pompeii … but I was wondering whether there might also be some association with the charter-city experiment one country over in Honduras.

  12. carsonmcneil says:

    Something a little squirrelly about that chain of mutually-exclusive 20% predictions. Forcing them to all be in the same bucket means that their errors will amortize each other out even if they are miscalibrated. I believe the probabilities you assigned are your genuine estimates, just an interesting thing to note about the calibration process.

  13. shakeddown says:

    Annoying nitpick (because we wouldn’t be rationalists without them): When you say a 30-minute walk from your house, do you mean your current house, or whatever house you live in when the year ends (assuming you move)?

  14. John Schilling says:

    If #15 refers to Ruth Bader Ginsburg, a 50% survival rate would be well under the norm for an 85-year old woman (85% or better), and her documented health problems seem to me to be things that would have killed her years ago if they were going to kill her at all (the 2009 pancreatic cancer) and things that are <50% likely to kill her in the next year (last month's lung cancer & treatment).

    Is the 50% survival prediction based on a more sophisticated evaluation of her known medical condition, on the stress of her job causing a threefold increase in general mortality, on the GOP hiring a hit team to secure another SCOTUS pick while they have the chance, or something else?

    • onyomi says:

      What’s the survival rate for 85-year old women who have been treated for cancer in three different parts of the body and had a serious fall+surgery in the past year?

      Scott’s odds on her survival seem optimistic to me.

      • John Schilling says:

        Ginsburg has spent five years as an octogenarian who has been treated for cancer in two different parts of her body and survived all five; she’s spent two years as a multiple cancer survivor who had surgery in the past year and survived both; I think that alone would make the Bayesian estimate of surviving one more year at about 75%. Then we throw in the fact that she’s a fitness freak under the care of first-rate and very attentive physicians.

        I think you are overweighing the fact that the media is basically on permanent deathwatch for Ginsburg on the grounds that “RBG sneezed; enslavement of all women imminent” is so clickworthy even if the actual threat to her health is small. Most 85-year-old women have a list of health problems and still make it to 86, most old women who fall and break bones or otherwise undergo surgery last at least another year.

  15. tmk says:

    We may already have some open to interpretation.

    11. Current government shutdown ends before Feb 1: 40%
    12. Current government shutdown ends before Mar 1: 80%
    13. Current government shutdown ends before Apr 1: 95%

    Trump Bows to Democrats, Temporarily Ends Shutdown Without Wall
    https://www.bloomberg.com/news/articles/2019-01-25/trump-congress-said-to-near-three-week-deal-to-end-shutdown?srnd=premium

  16. John Schilling says:

    So, last year IIRC one prediction was settled within 24 hours of posting. Now we’re at two and counting.

    ETA: 15, 26, 29, and 34 are still within reach, along with some of the ones within Scott’s personal control. If anyone has any ideas about boosting the score, let’s just buy Scott a washing machine rather than rushing out to off RBG.

    • shakeddown says:

      Which one aside from the shutdown?

      • John Schilling says:

        Shutdown ending, and Trump getting wall money out of the (explicitly current) shutdown. And really, all three of the “shutdown ends” predictions are now closed, but I’m counting daisy-chained predictions like that as a single prediction of a PDF.

  17. For 26, you should put it as “No further CRISPR-edited babies are publicly known to have been born.”

    You should post a hash of the full list without redactions. That way, at the end of the year, you can post the full list and prove it is what you had at the beginning of the year.

    • grendelkhan says:

      I don’t think Scott un-redacts things publicly; they’re either personal things he doesn’t want to share, or that he doesn’t have the right to share. So we’ll find out that [redacted] happened or didn’t happen, but not what [redacted] was.

      (Well, maybe with post titles, but not with the personal stuff.)

      • CheshireCat says:

        Yeah, hashing wouldn’t work: since the list would still be redacted, there’s no way to compute the hash a second time to verify.

        • Hoopyfreud says:

          He could post a hash of the unredacted list and let us hash the unredacted one when released to verify. But since he doesn’t unredact everything it doesn’t matter.

      • bullseye says:

        He unredacted his prediction of his breakup.

    • Paul Brinkley says:

      Good idea, but it would probably have to be the full list as he would publish it in 2020. Some of those items might remain redacted.

      Ideally, post a hash of each individual prediction, since each one might be published depending on what happens.

      • Lambert says:

        And I doubt he knows exactly what will get unredacted.
        Hashing each individual prediction sounds brute-forceable.

        • Tatterdemalion says:

          Not if you artificially add entropy before hashing. “Alice will get divorced” can be guessed, “Alice will get divorced 1848575703285” is not.

          • Lambert says:

            But then how do we know Scott didn’t choose the text post-facto and find random digits to make the hash correct?

          • g says:

            Lambert: Because that’s really hard to do (meaning: no one knows how to do it with an actually-available amount of computer power), if you use a good hash function.

          • The Nybbler says:

            If he does that with a currently unbroken cryptographic hash function he’ll at least get a mention at computer security conferences, for what that’s worth.
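
The "add entropy before hashing" idea in this sub-thread is a salted commitment. A minimal sketch in Python follows, assuming SHA-256 and a random 128-bit salt; the function names `commit` and `verify` and the way the salt is appended are illustrative choices, not anything proposed in the thread.

```python
import hashlib
import secrets

def commit(prediction: str) -> tuple[str, str]:
    """Return (salt, digest). Publish the digest now; keep the salt private."""
    salt = secrets.token_hex(16)  # 128 bits of random entropy
    digest = hashlib.sha256(f"{prediction} {salt}".encode("utf-8")).hexdigest()
    return salt, digest

def verify(prediction: str, salt: str, digest: str) -> bool:
    """Check a revealed prediction and salt against the published digest."""
    recomputed = hashlib.sha256(f"{prediction} {salt}".encode("utf-8")).hexdigest()
    return secrets.compare_digest(recomputed, digest)

salt, digest = commit("Alice will get divorced")
print(verify("Alice will get divorced", salt, digest))      # True
print(verify("Alice will not get divorced", salt, digest))  # False
```

Committing to each prediction separately lets any single one be revealed later without exposing the rest, and the salt is what stops readers from brute-forcing short, guessable predictions.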

    • John Schilling says:

      That way, at the end of the year, you can post the full list and prove it is what you had at the beginning of the year.

      At least some of the redacted predictions will probably still have to remain redacted in order to maintain the trust of Scott’s true friends. And I’m going to guess that “proving” anything to the subset of us who wouldn’t just take his word for it, is a much much lower priority than protecting the confidences of his true friends. Or, for that matter, spending an extra fifteen minutes with his true friends instead of fiddling with hash tables so he can prove things to people who don’t trust him.

  18. Fakjbf says:

    Wow, from 80% certainty that at least one city would allow a self-driving car without a back-up driver to now 90% that no city would allow it. Was there any specific reason for such a hard reversal, or was it just seeing the lack of progress in the past year and figuring that the tech is still too new?

    • Douglas Knight says:

      One big change is Google’s public pronouncements. In 11/2017, they announced that their early rider program (400 non-employees, but under non-disclosure agreements) would be testing with back-up drivers not in the driver’s seat, in preparation for rolling out to the general public in late 2018. In 11/2018 they canceled this testing, before rolling out the program to the general public (ie, with people in the driver’s seat). Because of the NDA, it’s also not clear how much of this testing really happened.

      Here is a twitter thread ending:

      Waymo One is a non-story.
      The suspension of true level 4 testing should have been a bigger story.
      Waymo is still far ahead of anyone else, but still a while away from introducing a real level 4 service. A long time away from doing so at scale.

      I think it’s too far to call Waymo One a non-story. It may not be progress, but I think that it is a big deal because it commits to a certain level of transparency.

    • John Schilling says:

      There’s also the Arizona Uber crash. The first unambiguous self-driving car fatality had to come sometime; it came in 2018 and it came in a way that was almost maximally bad for Uber – and for self-driving cars generally except that it happened to Uber rather than Waymo.

      But if the self-driving car industry can come through this relatively unscathed because everyone sort of knows that Uber is the reckless one and Waymo is the competent professional one we trust that Waymo or someone like them can implement safely, then that still leaves the “…by the end of this year” predictions dominated in 2018 by the odds of Uber pushing out something that isn’t ready for prime time and in 2019 by the surviving players being extra cautious.

  19. negativez says:

    3. Kamala Harris leads the Democratic field: 20%
    4. Bernie Sanders leads the Democratic field: 20%
    5. Joe Biden leads the Democratic field: 20%
    6. Beto O’Rourke leads the Democratic field: 20%

    I thought we were going to avoid correlated predictions this year. Saying they’re all equally likely (and mutually exclusive) will pad your accuracy in the 20% band. Make this “One of [set] will lead the Democratic field: 80%”

  20. Slice says:

    Why not put the non-redacted (and most of the personal) predictions into a Google poll and see the wisdom of the crowd? Like
    Donald Trump remains President:
    -True
    -False

  21. michaelkeenan0 says:

    I mildly grumble about making a prediction about the Dow rather than a well-designed stock index such as the S&P 500. As Eliezer noted, the Dow is crazy, and the only reason people ever talk about it is because other people talk about it. We should not grant it more power, and let obscurity destroy it.

    • Douglas Knight says:

      Yeah, and it would be totally reasonable for Scott to change this prediction after posting, since it’s defined by reference to the current level and is, apparently, intended as a generic index.

  22. Bacon_Wrapped says:

    I would eat something called a “[redacted] Burger”

  23. Douglas Knight says:

    35. …finds birth order effect is significantly affected by age gap: 40%

    Do you just mean statistical significance? With your survey size, everything is statistically significantly different from zero. At least with the other three questions you have predicted a direction of effect.

  24. Douglas Knight says:

    26. No further CRISPR-edited babies born: 80%

    Are you aware that at the announcement, JK He said that he had implanted another CRISPR embryo and that it had registered as a chemical pregnancy?

  25. thepatternmorecomplicated says:

    Can you comment on the effects you get from sceletium? I don’t think you’ve mentioned it here before.

  26. Yaleocon says:

    @Scott: what kind of character(s) do you play in D&D? Or do you GM?

    Side question—is there a specifically rationalist D&D group? And if so is it purely SF based or could it be possible to join a (future) campaign through roll20?

    • Ventrue Capital says:

      1. Le Maistre Chat runs an online D&D game on Discord.

      2. RainyDayNinja has set up a Discord server for SSC members who are interested in roleplaying games. https://[ remove this if you’re not a spambot ]discord.gg/PpwJtQ (Link expires in 24 hours.)

      3. I run a roleplaying campaign (on Roll20 and also on Discord) set on a world called Terramar and I specifically want SSC members and/or libertarians as players.

      Draft campaign wiki at http://terramar.obsidianportal.com/

      Campaign also has a Discord server at https://[ remove this if you’re not a spambot ]discord.gg/SgZNhn. (Link expires in one day from posting. Contact me if you want in and the link has expired.)

  27. Rudi says:

    Interesting predictions, as always. However I always find it a bit odd to bet on people’s death/survival (Nr. 15).

  28. rlms says:

    Some predictions where I disagree reasonably confidently:
    3. Kamala Harris leads the Democratic field: 12%
    4. Bernie Sanders leads the Democratic field: 12%
    5. Joe Biden leads the Democratic field: 12%
    6. Beto O’Rourke leads the Democratic field: 12%
    7. Trump is still leading in prediction markets to be Republican nominee: 80%
    9. Trump’s approval rating below 50: 95%
    10. Trump’s approval rating below 40: 60%
    11. Current government shutdown ends before Feb 1: 20%
    12. Current government shutdown ends before Mar 1: 60%
    13. Current government shutdown ends before Apr 1: 85%
    17. Bitcoin above 3000: 60%
    20. Dow above current value of 25000: 70%
    27. Britain out of EU: 75%

    • shakeddown says:

      Re 3-6, who do you think makes up the slack?

      Re 10, any particular reason? 50% seems a good estimate based on the last two years.

      • Douglas Knight says:

        Here is a coherent list of odds that also assigns about 50% to the top 4 but goes on to include 20 more people.

        • shakeddown says:

          Interesting. I’m guessing the truth is somewhere in between – it’s worth giving more probability to a wide field, but there’s also a general bias in betting markets towards overvaluing very low-probability scenarios (anything below 15% tends to get rounded up a bit).

          • Douglas Knight says:

            Some payment structures encourage a long-shot bias, but does betfair have those failings? Another problem is reporting or emphasizing point estimates, rather than bid/ask, but my link does not have that problem.

            NYT lists 9 declared candidates, including only 1 of Scott’s candidates and 1 candidate not even on Betfair’s list. The next 6 includes Biden and Sanders, but not O’Rourke. Of course, many of them are not running to win, but for some kind of publicity.

          • shakeddown says:

            Some payment structures encourage a long-shot bias, but does betfair have those failings?

            I’m not sure what you mean (do you mean something game-theoretic? biased questions?)
            My point was that in general there’s a psychological bias, when guessing odds, to assign higher than deserved odds to long-shot guesses. I’m not quite sure how much this affects betting markets (I’d guess that low-stakes markets like betfair see it more than, say, the actual stock market).

      • rlms says:

        3-6: a combination of Elizabeth Warren and other known entities, and someone appearing out of nowhere (if you’d been making predictions on the Republican nominee at this point in the last cycle, Trump wouldn’t have been on your radar at all).

        10: I expect this year to be worse for Trump than the previous two.

  29. noahmotion says:

    40% [redacted] predictions seems at least as meaningless as predictions with 50% confidence.

  30. Noah Luxton says:

    It might be nice if you put all of the [redacted] predictions in each category together, since that’ll probably help readability.
    You have two posts in the works that you expect to get greater than 40k hits at like 50% confidence if you finish them? I’m excited, considering that from a quick look at the graph in the black swan book review, that’s a pretty high level of accomplishment for you.
    I’m surprised that you rated “government shutdown will end before Feb 1st” at 40%. Trump (from a glance) seems to still be in ‘not backing down’ mode, and a week probably won’t change a ton. I would think that the shutdown will end by stuff just piling on to the extent that it becomes infeasible / unpopular for him to keep dragging this out.
    I’m surprised that you didn’t include an (unredacted) editing Unsong prediction, since you’ve been making them for a while.

  31. gbdub says:

    Have you investigated how dependent predictions might warp your calibration results?

    For example, you assign 20% probability of being the party front runner to 4 Democratic candidates. (This leaves an implied 20% for “the field”). You will either get 1 out of 4 or zero out of 4. If you added the “field” option, you’d be guaranteed to be perfectly calibrated – exactly 1 out of 5 of your 20% predictions would hit. But even without the field option, you’ve guaranteed yourself 4 “reasonably calibrated” predictions at the 20% level when really you’ve only made 1 prediction – that there is a roughly even chance for your 4 selected candidates and the field.

    Isn’t that cheating?

  32. Silverlock says:

    What is food-essentialism?

  33. HaraldN says:

    I am a bit sad/worried about your prediction that SSC will get fewer hits this year.

    Does this mean you are intending to reduce the quality/quantity of your content, or just its virality?

    Or are you directing your work at another writing project (whole lot of redacteds in there) which will be hosted elsewhere?

    • albertborrow says:

      Scott’s most popular posts are posts that are typically controversial. He doesn’t like posting controversial things as much as he likes posting accurate, thorough, well-sourced things. It stands to reason that if he keeps posting essays like that, the hits on the site will decrease. I think the prediction is a third that, a third current general trends, and a third self-deprecation.

  34. J Mann says:

    I don’t get the argument for how 50% predictions could not be meaningful.

    Let’s say you’re going to flip 50 different coins and roll 50 different six sided dice. I predict that the coins will come up heads 50% of the time, and that the dice will come up as a 1 17% of the time. Aren’t those predictions both testable and potentially useful?

    • muskwalker says:

      It can be useful as a probability, and for the stated purpose of calibrating his probability estimates, but I can see the argument that it may not mean much as a prediction.

      Saying “heads is just as likely as tails” is one thing; saying “heads is just as likely as tails, and I predict it will be heads” is another: either your probability estimate is actually a little higher than 50%, or you are choosing to make the prediction in order to signal a preference/affiliation, or you have chosen a side at random (in which case it’s a stretch to call it your prediction).

    • Alex Zavoluk says:

      What would it mean to not be calibrated at the 50% level? What would that actually look like? The obvious answer is “you get many more/fewer than 50% of those predictions correct.” That’s what it usually means to be poorly calibrated. But this measure is entirely determined by how the predictions are phrased. What does it mean for a prediction to be “correct”? If I say “50% chance for the coin to come up heads,” then I’m “correct” on heads and “wrong” on tails. If I instead say “50% chance for tails”, the reverse is true, but I’ve made the same prediction.

      As someone else pointed out lower down, you can choose the wording at random, and this ensures your 50% predictions will be perfectly calibrated on average. So any imperfect calibration is due to bad luck or some bias in the phrasing, rather than under-confidence. What you *could* do is use a Brier or log score of the entire set of predictions which provides something of an incentive to make an estimate other than 50% whenever possible, but those can be hard to interpret without lots of people making the same predictions.
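
      A rough sketch of the Brier score idea, using made-up (probability, outcome) pairs rather than any real predictions. It illustrates two things: rephrasing a 50% prediction as its opposite leaves the score unchanged, and a forecaster who commits to accurate non-50% estimates scores better.

```python
def brier_score(forecasts):
    """Mean squared gap between stated probability and outcome (1 or 0).
    Lower is better."""
    return sum((p - y) ** 2 for p, y in forecasts) / len(forecasts)

# Hypothetical data: (stated probability, outcome) pairs.
hedged = [(0.5, 1), (0.5, 0), (0.5, 1), (0.5, 0)]
# The same 50% predictions, each phrased as its opposite.
flipped = [(0.5, 0), (0.5, 1), (0.5, 0), (0.5, 1)]
# The same events, with a forecaster who commits to 80% / 20%.
committed = [(0.8, 1), (0.2, 0), (0.8, 1), (0.2, 0)]

print(brier_score(hedged))
print(brier_score(flipped))
print(brier_score(committed))
```

      Unlike a hit count at the 50% level, the score here does not depend on which side of each question gets called the "prediction".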

      • hls2003 says:

        @J Mann:

        I will say that I have the same intuition as you – if Scott is regularly not close to 50% it suggests he’s bad at picking which wagers are 50-50. But working through it more, I think I’ve at least grasped the other perspective mathematically, even if I haven’t fully embraced it mentally. I also think there’s some confusion in the terminology.

        Let’s say that Scott predicts “I give a 50% chance that this 6-sided die rolls a 1.” Scott has made a bad prediction; there is only a 17%(ish) chance. I think the part where the “other side” didn’t make sense to me was that I was hearing “just flip the prediction and it evens out.” But of course flipping the prediction would simply make Scott right 83% of the time – and winning or losing too much is also evidence that it is not a true 50% choice. We know it is not a 50% choice, so this seems right to me.

        But expanding on Alex Zavoluk above, I think there’s a different way of hearing the “switch sides” position that does make sense to me. Imagine Scott has 200 six-sided dice in front of him. He intends to roll one after the other. Prior to each roll, he will make a prediction, and his prediction will be one of two things: either “This die will roll a 1” or “This die will not roll a 1.” He will use each phrasing 50% of the time (or, you could say, he will flip a coin to choose which). So “This die will roll a 1” should appear 100 times; “This die will not roll a 1” should appear 100 times. On average, we know his “will roll” predictions will be correct about 17 times. And on average, we know his “will not roll” predictions will be correct about 83 times. That means that over the 200 dice rolls, his predictions will be correct 100 times – 50%. That looks like calibration at 50% even though the event does not have a 50% probability.

        This is, I think, the point about phrasing. If phrasing is literally just “A” and “not A,” then randomizing the two equally will always give about 50%. If phrasing, however, is “A” or “B”, then it will not automatically default to 50%.
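
        The dice thought experiment can be checked with a quick simulation. This is only a sketch: it uses the coin-flip variant (so the two phrasings appear roughly, not exactly, 100 times each), and the trial count and seed are arbitrary choices.

```python
import random

def simulate(n_dice=200, trials=2000, seed=0):
    """Average fraction of correct predictions when the phrasing of each
    prediction ("rolls a 1" vs "does not roll a 1") is chosen by a fair
    coin flip."""
    rng = random.Random(seed)
    total = trials * n_dice
    correct = 0
    for _ in range(total):
        predict_one = rng.random() < 0.5      # coin flip picks the phrasing
        rolled_one = rng.randint(1, 6) == 1   # true chance of a 1 is 1/6
        if predict_one == rolled_one:
            correct += 1
    return correct / total

rate = simulate()
print(f"fraction of predictions correct: {rate:.3f}")
```

        The fraction lands very close to one half even though no single die has a 50% chance of showing a 1.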

        EDIT: Sorry, wrong level to reply.

    • gbdub says:

      So in sports betting, there’s the concept of the “Over/Under” bet. The book sets an O/U value on the final total score of a game (or other score-like event, e.g. “number of regular season wins”). The idea is that the final score is equally likely to be “over” as “under” (well, really the book tries to set it so that 50% of the total wager value comes down on each side, but let’s ignore that layer of abstraction for now). If a bookie is badly calibrated, e.g. their “over” bets consistently hit 60% of the time, someone will quickly notice this and bet the bookie out of business. Thus the calibration of a 50/50 prediction is definitely meaningful to the bookie.
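
      To put a number on "bet the bookie out of business", here is a small sketch. It assumes even-money payouts with no commission, which is a simplification and not how real books price these bets.

```python
def bettor_edge(hit_rate, payout=1.0):
    """Expected profit per unit staked on a side that wins with probability
    hit_rate. A win pays `payout` units of profit; a loss costs the stake.
    Even money (payout=1.0) is an illustrative assumption."""
    return hit_rate * payout - (1 - hit_rate)

print(bettor_edge(0.5))   # fair line: no edge
print(bettor_edge(0.6))   # "over" hits 60% of the time
```

      Under that assumption, a line whose "over" hits 60% of the time hands the bettor an expected profit of about 0.2 units per unit staked.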

      Scott’s 50% predictions could be reasonably meaningful if we can treat them as O/U lines. For example, rather than his current 3 predictions on the value of Bitcoin, he could collapse those to one prediction: “The over/under on the value of Bitcoin at EOY 2019 is $3000”. Over time, we could see if his O/Us are consistently biased one way or another (although it is tricky – we can see where “NFL scores” are a common set and thus your O/U accuracy there could be reasonably judged over time. It’s less clear that things like “value of Bitcoin”, “number of malaria deaths in Africa”, and “regular season wins by the New York Yankees” are similar enough that something like “Scott’s O/U predictions are biased toward the Under” would be a meaningful statement).

      Really Scott should phrase his predictions as “It is equally likely that [event happens] / [event does not happen]” and “The value of X is equally likely to be greater than / less than Y”. This would give the 50/50 predictions consistently definable directionality. From that we could, I suppose, parse out whether Scott has a systemic bias towards “event happens” or “value is high”. Although even this is tricky – values might have easily definable ordinality, but is “Britain remains in EU” an event happening or an event (Brexit) NOT happening? Perhaps he could define “not happening” as the status quo and we could fight about that “Britain is in the EU today – that’s the status quo!” “But Brexit occurs no later than March is ALSO the status quo!”

      What he must definitely not do is take the suggestion of randomly flipping half his 50/50 predictions, because that really will turn out 50/50 regardless of the underlying distribution.

      • hls2003 says:

        Yes, this is why I struggled so hard with the concept. I’m quite used to thinking about probabilities in terms of bets, where being improperly calibrated is very meaningful.

        Glad to see another degenerate gambler on the board!

    • tmk says:

      50% predictions are fine. It’s calculating the calibration at 50% that is meaningless. Then again, I don’t think calibration is a good way of evaluating the success of predictions in general.

  35. JPNunez says:

    No prediction for Elizabeth Warren? She just announced a very aggressive tax plan and social reforms.

    • Incandenza says:

      Yeah, the implied prediction that Warren + the whole rest of the field has the same probability of being the leader for the Dem nomination as Beto O’Rourke (or Joe Biden, for that matter) would be one of the two predictions I disagree with the most here.

      My other big disagreement would be the over-ghoulishness re: Ginsburg. An 86-year-old woman has a life expectancy of over 6 years and a death probability of only 8.4%.

      • hls2003 says:

        I mean, she did just have surgery for lung cancer. I wish her well, but she’s not a standard point on a life expectancy chart.

        • Incandenza says:

          Seems like the condition you call “lung cancer” was a pretty typical thing for an 85-year-old (she turns 86 in March). Most folks at that age have something or another breaking down at that point. I mean, a 6-year life expectancy isn’t great, after all. And plus she’s got access to about the best health care you can get, I’m sure. Also she only needs to last slightly more than 11 months at this point.

          I’d buy an argument that she has less than a 92% chance of making it to the end of the year. But 50%… that’s about the odds that a 108-year-old woman dies in a given year. I think she’s healthier than the average 108-year-old.

          • Ghillie Dhu says:

            Your average 108-year-old doesn’t have Russian billionaires making veiled threats on her life.

  36. Nancy Lebovitz says:

    I’m surprised there’s no mention of a plausible Democratic candidate appearing who isn’t currently famous.

    What kind of calligraphy are you doing?

  37. sty_silver says:

    On 50% predictions: Imagine I make twenty predictions as such:

    Trump leaves office at 2019/04/23
    Tesla stock is at 303$ 35ct at the EOY
    Five additional countries leave the EU

    and I give each a 50% probability and get exactly half of them right. I assume that everyone would be incredibly impressed (or more likely assume fraud) even though it’s just 50% predictions. How impressive or meaningful a prediction is depends on how far its probability is from common wisdom, which may or may not be far in the case of a 50% prediction.

    Interesting thought experiment: make any number of 50% predictions, then flip a randomly chosen half of them to predict the opposite. You’ll now always be perfectly calibrated on average, because the probability of any prediction coming true is 0.5*p + 0.5*(1-p), where p is the true probability of the original prediction. This term always equals 0.5, so it doesn’t depend on p. I think this corresponds to the fact that if I flip half of them, then my predictions are, on average, conforming with common knowledge exactly, so good calibration ceases to be impressive. But I think it’s clearly not inherently unimpressive.
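
    The algebra can be checked exactly with rational arithmetic. A small sketch, where the function name and the sample values of p are just illustrative:

```python
from fractions import Fraction

def hit_probability_after_random_flip(p):
    """Chance a prediction comes true when it is kept as stated or negated,
    each with probability 1/2. `p` is the true chance of the original
    statement."""
    half = Fraction(1, 2)
    return half * p + half * (1 - p)

samples = (Fraction(0), Fraction(1, 6), Fraction(1, 2), Fraction(9, 10), Fraction(1))
for p in samples:
    print(p, hit_probability_after_random_flip(p))
```

    Every line prints 1/2 in the second column, whatever p is.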

    Thoughts?

    • orin says:

      This is why the variance is the expectation of the squared deviation from the mean, and why you use the variance (i.e. chi squared), not the average of the deviation from the mean, when testing goodness of fit.

  38. Rachael says:

    70% on May remaining UK Prime Minister seems high. She’s had two no-confidence votes triggered against her recently, and my impression is that she only survived them because people recognised that ousting her would make the current chaos even worse.
    By the end of March, we will either a) leave the EU, b) withdraw article 50, or c) have negotiated a long extension period. I predict that shortly after the current uncertainty is resolved or delayed by settling on one of those options, she will cease to be PM, either by resigning, no-confidence, or general election.

    • Jack V says:

      That’s what I thought. She’s done an excellent job steering between tory remainers and tory brexiters, but only by postponing everything as long as possible, there’s no way she can get through March without totally alienating SOME of her support.

      But on the other hand, no-one competent wants the job and she has been successful so far, I expected her to be in an even worse position now, she’s survived several challenges, so maybe she will continue to do so.

    • Robert Jones says:

      I think there’s a significant possibility that by the end of March we will have (d) agreed a short extension period faute de mieux. I agree that by the end of May, one of (a)-(c) will have occurred. Conditional on (c) occurring, I think there’s a significant likelihood that May will remain PM. I expect her to go if (a) or (b) occurs.

    • Robert Jones says:

      It’s worth noting that 5/2 is currently available for May to remain as PM until 2020, which implies a probability of no more than 29%.
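
      For anyone unfamiliar with fractional odds, the conversion is stake / (stake + profit). A quick sketch that ignores the bookmaker's margin, which is why the result is an upper bound on the market's real estimate:

```python
from fractions import Fraction

def implied_probability(profit, stake):
    """Implied probability of fractional odds profit/stake (e.g. 5/2),
    ignoring any bookmaker margin."""
    return Fraction(stake, profit + stake)

p = implied_probability(5, 2)
print(p, round(float(p), 3))
```

      For 5/2 this gives 2/7, a little under 29%.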

    • Fitzroy says:

      Also worth noting that May won the Conservative leadership contest, which makes her immune from internal challenge for 12 months, so the only way she can go is resignation or if the Conservatives lose a general election.

      The next general election isn’t scheduled until 2022, I see no prospect of May calling an early one and, for the time being, it looks like any no confidence vote is doomed to fail (it is jokingly said that the only person MPs have less confidence in than May is Corbyn).

      As for the position by the end of March, a) or c) are possible. b) won’t happen without going back to the country which will necessitate c) in any event, and almost certainly take us beyond year end.

      Assuming we either leave with something like May’s deal, or continue while we await another referendum, May will stay – she’s either just achieved victory (option a) or there is still work to be done to do so (options b / c).

      There’s always the risks of death / ill-health but they are reasonably negligible, so to my mind we’re only arguing about the possibility of ‘no deal’ and May throwing her hands up and saying “b******s to the lot of you, I tried my best, you lot sort it out.”

  39. eterevsky says:

    Have you considered evaluating not just your calibration (i.e. how many predictions at a given level of confidence turn out to be true), but also cross-entropy? After all, what matters is making good predictions at as high a confidence level as possible. As everyone knows, it is possible to be perfectly calibrated by making all predictions at 50% confidence.
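
    A minimal sketch of the difference, using two hypothetical forecasters with invented track records. Both are perfectly calibrated, but cross-entropy (mean negative log-probability of what actually happened) separates them:

```python
import math

def cross_entropy(forecasts):
    """Mean negative log-probability assigned to what actually happened.
    Lower is better."""
    total = sum(math.log(p if happened else 1 - p) for p, happened in forecasts)
    return -total / len(forecasts)

def hit_rate(forecasts, level):
    """Fraction of predictions stated at `level` that came true."""
    outcomes = [happened for p, happened in forecasts if p == level]
    return sum(outcomes) / len(outcomes)

# Hypothetical track records: (stated probability, did it happen).
timid = [(0.5, True)] * 5 + [(0.5, False)] * 5       # everything at 50%
confident = [(0.9, True)] * 9 + [(0.9, False)] * 1   # everything at 90%

print(hit_rate(timid, 0.5), hit_rate(confident, 0.9))    # both calibrated
print(cross_entropy(timid), cross_entropy(confident))    # confident is lower
```

    The all-50% forecaster always scores ln 2 (about 0.69) per prediction, while the calibrated 90% forecaster scores roughly 0.33.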

  40. toastengineer says:

    I’m excited for [redacted], but [redacted] and [redacted] worry me. Overall, [data expunged]