Introducing Astral Codex Ten

Thanks for bearing with me the past few months. My new blog is at https://astralcodexten.substack.com/. I’ll try to have a less unwieldy domain name working soon.

Update On My Situation

It’s been two and a half months since I deleted the blog, so I owe all of you an update on recent events.

I haven’t heard anything from the New York Times one way or the other. Since nothing has been published, I’d assume they dropped the article, except that they approached an acquaintance for another interview last month. Overall I’m confused.

But they definitely haven’t given me any explicit reassurance that they won’t reveal my private information. And now that I’ve publicly admitted privacy is important to me – something I tried to avoid coming on too strong about before, for exactly this reason – some people have taken it upon themselves to post my real name all over Twitter in order to harass me. I probably inadvertently Streisand-Effect-ed myself with all this; I still think it was the right thing to do.

At this point I think maintaining anonymity is a losing battle. So I am gradually reworking my life to be compatible with the sort of publicity that circumstances seem to be forcing on me. I had a talk with my employer and we came to a mutual agreement that I would gradually transition away from working there. At some point, I may start my own private practice, where I’m my own boss and where I can focus on medication management – and not the kinds of psychotherapy that I’m most worried are ethically incompatible with being a public figure. I’m trying to do all of this maximally slowly and carefully and in a way that won’t cause undue burden to any of my patients, and it’s taking a long time to figure out.

I’m also talking to Substack about moving to their blogging platform. While part of me wants to jump right back into blogging here and pretend nothing ever happened, the Substack option has grown on me. I think I’d feel safer as part of a big group that specifically promises to defend their bloggers when needed. And also, I’d feel safer with a lot of diverse income streams, and Substack has made me an extremely generous offer. Many people gave me good advice about how I could monetize my blog without Substack – I took these suggestions very seriously, and without violating a confidentiality agreement, all I can say is that Substack’s offer was extremely generous.

When I originally asked readers about this possibility, they raised a lot of valid concerns: some of them were confused by Substack’s commenting system, others annoyed by its pop-up reminders to subscribe, others were concerned about being stuck outside a paywall. I’ve talked to Substack about this, and they’ve made some really impressive promises to address these things – they’re going to code a maximally-SSC-like commenting experience, they’re going to let me opt out of the subscription reminders, I won’t have to “paywall” anything besides some Hidden Open Threads. This isn’t the time for me to go over the dozens of examples of concerns I had that Substack went above and beyond to address, but assume I had most of the same ones you did and put a lot of work into addressing them.

(and if you’re worried about the Hidden Open Threads, check out Data Secrets Lox, a forum that has done a great job keeping the SSC Open Thread tradition going over the past few months.)

So that’s where I am right now – trying to wind things down at my day job, very preliminarily planning a private practice, and negotiating writing details with Substack. I’m also looking into some other things to protect my physical safety. When all of that is done, I’ll start blogging again. Right now I’m expecting that to be some time between October and January – and obviously when it happens I’ll let you know. I would appreciate if people continued to respect my preferences about anonymity until then. After that I’ll stop caring as much – though I’ll still go by “Scott Alexander” to keep the brand the same, and I’ll still do what I can to avoid publicity.

I might have hinted at this already, but I should say it explicitly – I’m really grateful for all the support I got throughout this whole incident. You people are all great. I’ll say so at more length later, and talk more about some specific examples, but for now just accept on faith that you’re all great.

I still plan to do the book review contest! I’ll do it sometime after I start the new blog! Those of you who sent me reviews didn’t waste your time! It’s going to happen! Pestilence may afflict every corner of the world, the skies may turn red as blood and the sun go dark at noon, the earth may shake and plagues of locusts cover the land, but never doubt that there will be a book review contest someday, in the golden future, when all of this is over.

As all the kids are saying these days, “thank you for your continuing support during these difficult times”.


NYT Is Threatening My Safety By Revealing My Real Name, So I Am Deleting The Blog

[EDIT 2/13/21: This post is originally from June 2020, but there’s been renewed interest in it because the NYT article involved just came out. This post says the NYT was going to write a positive article, which was the impression I got in June 2020. The actual article was very negative; I feel this was retaliation for writing this post, but I can’t prove it. I feel I was misrepresented by quotations sliced and diced in a way that made me sound like a far-right nutcase; I am actually a liberal Democrat who voted for Warren in the primary and Biden in the general, and I generally hold pretty standard center-left views in support of race and gender equality. You can read my full statement defending against the Times’ allegations here. To learn more about this blog and read older posts, go to the About page.]

So, I kind of deleted the blog. Sorry. Here’s my explanation.

Last week I talked to a New York Times technology reporter who was planning to write a story on Slate Star Codex. He told me it would be a mostly positive piece about how we were an interesting gathering place for people in tech, and how we were ahead of the curve on some aspects of the coronavirus situation. It probably would have been a very nice article.

Unfortunately, he told me he had discovered my real name and would reveal it in the article, ie doxx me. “Scott Alexander” is my real first and middle name, but I’ve tried to keep my last name secret. I haven’t always done great at this, but I’ve done better than “have it get printed in the New York Times”.

I have a lot of reasons for staying pseudonymous. First, I’m a psychiatrist, and psychiatrists are kind of obsessive about preventing their patients from knowing anything about who they are outside of work. You can read more about this in this Scientific American article – and remember that the last psychiatrist blogger to get doxxed abandoned his blog too. I am not one of the big sticklers on this, but I’m more of a stickler than “let the New York Times tell my patients where they can find my personal blog”. I think it’s plausible that if I became a national news figure under my real name, my patients – who run the gamut from far-left to far-right – wouldn’t be able to engage with me in a normal therapeutic way. I also worry that my clinic would decide I am more of a liability than an asset and let me go, which would leave hundreds of patients in a dangerous situation as we tried to transition their care.

The second reason is more prosaic: some people want to kill me or ruin my life, and I would prefer not to make it too easy. I’ve received various death threats. I had someone on an anti-psychiatry subreddit put out a bounty for any information that could take me down (the mods deleted the post quickly, which I am grateful for). I’ve had dissatisfied blog readers call my work pretending to be dissatisfied patients in order to get me fired. And I recently learned that someone on SSC got SWATted in a way they attribute to having used their real name on the blog. I live with ten housemates including a three-year-old and an infant, and I would prefer this not happen to me or to them. Although I realize I accept some risk of this just by writing a blog with imperfect anonymity, getting doxxed on national news would take it to another level.

When I expressed these fears to the reporter, he said that it was New York Times policy to include real names, and he couldn’t change that.

After considering my options, I decided on the one you see now. If there’s no blog, there’s no story. Or at least the story will have to include some discussion of NYT’s strategy of doxxing random bloggers for clicks.

I want to make it clear that I’m not saying I believe I’m above news coverage, or that people shouldn’t be allowed to express their opinion of my blog. If someone wants to write a hit piece about me, whatever, that’s life. If someone thinks I am so egregious that I don’t deserve the mask of anonymity, then I guess they have to name me, the same way they name criminals and terrorists. This wasn’t that. By all indications, this was just going to be a nice piece saying I got some things about coronavirus right early on. Getting punished for my crimes would at least be predictable, but I am not willing to be punished for my virtues.

I’m not sure what happens next. In my ideal world, the New York Times realizes they screwed up, promises not to use my real name in the article, and promises to rethink their strategy of doxxing random bloggers for clicks. Then I put the blog back up (of course I backed it up! I’m not a monster!) and we forget this ever happened.

Otherwise, I’m going to lie low for a while and see what happens. Maybe all my fears are totally overblown and nothing happens and I feel dumb. Maybe I get fired and keeping my job stops mattering. I’m not sure. I’d feel stupid if I caused the amount of ruckus this will probably cause and then caved and reopened immediately. But I would also be surprised if I never came back. We’ll see.

I’ve gotten an amazing amount of support the past few days as this situation played out. You don’t need to send me more – message very much received. I love all of you so much. I realize I am making your lives harder by taking the blog down. At some point I’ll figure out a way to make it up to you.

In the meantime, you can still use the r/slatestarcodex subreddit for sober non-political discussion, the not-officially-affiliated-with-us r/themotte subreddit for crazy heated political debate, and the SSC Discord server for whatever it is people do on Discord. Also, my biggest regret is I won’t get to blog about Gwern’s work with GPT-3, so go over and check it out.

There’s a SUBSCRIBE BY EMAIL button on the right – put your name there if you want to know if the blog restarts or something else interesting happens. I’ll make sure all relevant updates make it onto the subreddit, so watch that space.

There is no comments section for this post. The appropriate comments section is the feedback page of the New York Times. You may also want to email the New York Times technology editor Pui-Wing Tam at pui-wing.tam@nytimes.com, contact her on Twitter at @puiwingtam, or phone the New York Times at 844-NYTNEWS. [EDIT: The time for doing this has passed, thanks to everyone who sent messages in]

(please be polite – I don’t know if Ms. Tam was personally involved in this decision, and whoever is stuck answering feedback forms definitely wasn’t. Remember that you are representing me and the SSC community, and I will be very sad if you are a jerk to anybody. Please just explain the situation and ask them to stop doxxing random bloggers for clicks. If you are some sort of important tech person who the New York Times technology section might want to maintain good relations with, mention that.)

If you are a journalist who is willing to respect my desire for pseudonymity, I’m interested in talking to you about this situation (though I prefer communicating through text, not phone). My email is scott@slatestarcodex.com. [EDIT: Now over capacity for interviews, sorry]


Slightly Skew Systems Of Government

[Related To: Legal Systems Very Different From Ours Because I Just Made Them Up, List Of Fictional Drugs Banned By The FDA]

I.

Clamzoria is an acausal democracy.

The problem with democracy is that elections happen before the winning candidate takes office. If somebody’s never been President, how are you supposed to judge how good a President they’d be? Clamzoria realized this was dumb, and moved elections to the last day of an official’s term.

When the outgoing President left office, the country would hold an election. It was run by approval voting: you could either approve or disapprove of the candidate who had just held power. The results were tabulated, announced, and then nobody ever thought about them again.

Clamzoria chose its officials through a prediction market. The Central Bank released bonds for each candidate, which paid out X dollars at term’s end, where X was the percent of voters who voted Approve. Traders could provisionally buy and sell these bonds. On the first day of the term, whichever candidate’s bonds were trading at the highest value was inaugurated as the new President; everyone else’s bonds were retroactively cancelled and their traders refunded. The President would spend a term in office, the election would be held, and the bondholders would be reimbursed the appropriate amount.

The Clamzorians argued this protected against demagoguery. It’s easy for a candidate to promise the sun and moon before an election, but by the end of their term, voters know if the country is doing well or not. Instead of running on a platform of popular (but doomed) ideas, candidates are encouraged to run on a platform of unpopular ideas, as long as those unpopular ideas will genuinely make the country richer, safer, stronger, and all the other things that lead people to approve of a President’s term after the fact. Of course, you’re still limited by bond traders’ ability to predict which policies will work, but bond traders are usually more sober than the general electorate.

This system worked wonderfully for several decades, until Lord Bloodholme’s administration. He ran for President on an unconventional platform: if elected, he would declare himself Dictator-For-Life, replace democracy with sham elections, and kill all who opposed him. Based on his personality, all the bond traders found this completely believable. But that meant that in the end-of-term election, he would get 100% approval. His bond shot up to be worth nearly $100, the highest any bond had ever gone, and he won in a landslide. Alas, Lord Bloodholme was as good as his word, and – after a single sham election to ensure the bondholders got what they were due – that was the end of Clamzoria’s acausal democracy.
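
If it helps to strip away the lore, the whole mechanism is just “inaugurate whichever candidate the bond market expects to get the highest end-of-term approval”. A toy sketch – candidates and prices entirely invented – shows both the intended behavior and the Bloodholme exploit:

```python
# Toy sketch of Clamzoria's acausal election. Candidates and prices are
# invented; a bond's market price is assumed to equal traders' expected
# end-of-term approval (the percent who will vote Approve).
def acausal_election(bond_prices):
    """Inaugurate whichever candidate's bond trades highest on inauguration day."""
    return max(bond_prices, key=bond_prices.get)

bond_prices = {
    "Sensible Reformer": 64.0,   # unpopular platform, but traders expect it to work
    "Charming Demagogue": 41.0,  # popular promises, expected to disappoint by term's end
    "Lord Bloodholme": 99.0,     # will rig the final election, so ~100% "approval"
}

print(acausal_election(bond_prices))  # -> Lord Bloodholme, which is exactly the problem
```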

II.

Cognito is a constitutional mobocracy.

It used to be a regular mobocracy. It had a weak central government, radicals would protest whenever they didn’t like its decisions, the protests would shut down major cities, and the government would cave. Then people on the other side would protest, and that would also shut down major cities, and the government would backtrack. Eventually they realized they needed a better way, made a virtue out of necessity, and wrote the whole system into their constitution.

The Executive Branch is a president elected by some voting system that basically ensures a bland moderate. They have limited power to make decrees that enforce the will of the legislature. The legislature is the mob. One proposes a bill by having a protest in favor of it. If the protest attracts enough people – the most recent number is 43,617, but it changes every year based on the population and a few other factors – then the bill is considered up for review. Anyone can propose amendments (by having a protest demanding amendments) or vote against it (by having a protest, larger than the original protest, demanding that the bill not be passed). After everyone has had a fair chance to protest, the text of the bill supported by the largest protest becomes law (unless the largest protest was against any change, in which case there is no change).

The Cognitans appreciate their system because protests are peaceful and nondisruptive. The government has a specific Protesting Square in every city with a nice grid that lets them count how many protesters there are, and all protests involve going into the Protesting Square, standing still for a few minutes to let neutral observers count people up, and then going home. It’s silly to protest beyond this; your protest wouldn’t be legally binding!

There’s been some concern recently that corporations pay protesters to protest for things they want. Several consumer watchdog organizations are trying to organize mobs in favor of a bill to stop this.

III.

Yyphrostikoth is a meta-republic.

Every form of government has its own advantages and disadvantages, and the goal is to create a system of checks and balances where each can watch over the others. The Yyphrostikoth Governing Council has twelve members:

The Representative For Monarchy is a hereditary position.

The Representative For Democracy is elected.

The Representative For Plutocracy is the richest person in the country.

The Representative For Technocracy is chosen by lot from among the country’s Nobel Prize winners.

The Representative For Meritocracy is whoever gets the highest score on a standardized test of general knowledge and reasoning ability.

The Representative For Military Dictatorship is the top general in the army.

The Representative For Communism is the leader of the largest labor union.

The Representative For Futarchy is whoever has the best record on the local version of Metaculus.

The Representative For Gerontocracy is supposedly the oldest person in the country who is medically fit and willing to serve, but this has been so hard to sort out that in practice they are selected by the national retirees’ special interest group from the pool of willing candidates above age 90.

The Representative For Minarchy is an honorary position usually bestowed upon a respected libertarian philosopher or activist. It doesn’t really matter who holds it, because their only job is to vote “no” on everything, except things that are sneakily phrased so that “no” means more government, in which case they can vote “yes”. If a Representative For Minarchy wants to vote their conscience, they may break this rule once, after which they must resign and be replaced by a new Representative.

The Representative For Republicanism is selected by the other eleven members of the council.

The Representative For Theocracy is the leader of the Governing Council, and gets not only her own vote but a special vote to break any ties. She is chosen at random from a lottery of all adult citizens, on the grounds that God may pick whoever He pleases to represent Himself.

Long ago, the twelfth Councilor was the Representative For Kratocracy (rule by the strongest). The Representative For Kratocracy was whoever was sitting in the Representative For Kratocracy’s chair when a vote took place. This usually involved a lot of firefights and hostage situations, which was fine in principle – that was the whole point – except that the rest of the Governing Council kept getting caught in the crossfire. During the Nehanian Restoration, the Representative For Kratocracy’s chair was moved to a remote uninhabited island, with the Representative permitted to vote by video-link, but environmentalist groups complained that the constant militia battles there were harming migratory birds. Finally, a petition was sent to the Oracle of Yaanek, asking what to do. The God recommended that the position be eliminated, and offered to decide who filled the newly vacated seat Himself; thus the beginning of the Representative For Theocracy.

The Constitution was never fully amended, so technically the position is still the Representative For Kratocracy, and technically anyone who kills the Representative For Theocracy can still take his seat and gain immense power. But for some reason everyone who tries this dies of completely natural causes just before their plan comes to fruition. Must be one of those coincidences.


Open Thread 156.25 + Signal Boost For Steve Hsu

[UPDATE: As of 6/19, Professor Hsu resigned as VP of Research. He still encourages interested people to sign the petition as a general gesture of support.]

Normally this would be a hidden thread, but I wanted to signal boost this request for help by Professor Steve Hsu, vice president of research at Michigan State University. Hsu is a friend of the blog and was a guest speaker at one of our recent online meetups – some of you might also have gotten a chance to meet him at a Berkeley meetup last year. He and his blog Information Processing have also been instrumental in helping me and thousands of other people better understand genetics and neuroscience. If you’ve met him, you know he is incredibly kind, patient, and willing to go to great lengths to help improve people’s scientific understanding.

Along with all the support he’s given me personally, he’s had an amazing career. He started as a theoretical physicist publishing work on black holes and quantum information. Then he transitioned into genetics, spent a while as scientific advisor to the Beijing Genomics Institute, and helped discover genetic prediction algorithms for gallstones, melanoma, heart attacks, and other conditions. Along with his academic work, he also sounded the alarm about the coronavirus early and has been helping shape the response.

This week, some students at Michigan State are trying to cancel him. They point to an interview he did on an alt-right podcast (he says he didn’t know it was alt-right), to his allowing MSU to conduct research on police shootings (which concluded, like most such research, that they are generally not racially motivated), and to his occasional discussion of the genetics of race (basically just repeating the same “variance between vs. within clusters” distinction everyone else does, see eg here). You can read the case being made against him here, although keep in mind a lot of it is distorted and taken out of context, and you can read his response here.

Professor Hsu will probably land on his feet whatever happens, but it would be a great loss for Michigan and its scientific community if he could no longer work with them; it would also have a chilling effect on other scientists who want to discuss controversial topics or engage with the public. If you support him, you can sign the petition to keep him on here. If you are a professor or other notable person, your voice could be especially helpful, but anyone is welcome to sign regardless of credentials or academic status. See here for more information. He says that time is of the essence since activists are pressuring the college to make a decision right away while everyone is still angry.

This was supposed to be a culture-war free open thread, but I guess the ship has sailed on that one, so, uh, just do your best, and I’ll delete anything that needs deleting.


The Vision Of Vilazodone And Vortioxetine

I.

One of psychiatry’s many embarrassments is how many of our drugs get discovered by accident. They come from random plants or shiny rocks or stuff Alexander Shulgin invented to get high.

But every so often, somebody tries to do things the proper way. Go over decades of research into what makes psychiatric drugs work and how they could work better. Figure out the hypothetical properties of the ideal psych drug. Figure out a molecule that matches those properties. Synthesize it and see what happens. This was the vision of vortioxetine and vilazodone, two antidepressants from the early 2010s. They were approved by the FDA, sent to market, and prescribed to millions of people. Now it’s been enough time to look back and give them a fair evaluation. And…

…and it’s been a good reminder of why we don’t usually do this.

Enough data has come in to be pretty sure that vortioxetine and vilazodone, while effective antidepressants, are no better than the earlier medications they sought to replace. I want to try going over the science that led pharmaceutical companies to think these two drugs might be revolutionary, and then speculate on why they weren’t. I’m limited in this by my total failure to understand several important pieces of the pathways involved, so I’ll explain the parts I get, and list the parts I don’t in the hopes that someone clears them up in the comments.

II.

SSRIs take about a month to work. This is surprising, because they start increasing serotonin immediately. If higher serotonin is associated with improved mood, why the delay?

Most research focuses on presynaptic 5-HT1A autoreceptors, which detect and adjust the amount of serotonin being released in order to maintain homeostasis. If that sounds clear as mud to you, well, it did to me too the first twenty times I heard it. Here’s the analogy which eventually worked for me:

Imagine you’re a salesperson who teleconferences with customers all day. You’re bad at projecting your voice – sometimes it’s too soft, sometimes too loud. Your boss tells you the right voice level for sales is 60 decibels. So you put an audiometer on your desk that measures how many decibels your voice is. It displays an up arrow, a down arrow, or a smiley face, telling you whether you’re too quiet, too loud, or just right.

The audiometer is presynaptic (it’s on your side of the teleconference, not your customer’s side). It’s an autoreceptor, not a heteroreceptor (you’re using it to measure yourself, not to measure anything else). And you’re using it to maintain homeostasis (to keep your voice at 60 dB).

Suppose you take a medication that stimulates your larynx and makes your voice naturally louder. As long as your audiometer’s working, that medication will have no effect. Your voice will naturally be louder, but that means you’ll see the down arrow on your audiometer more often, you’ll speak with less force, and the less force and louder voice will cancel out and keep you at 60 dB. No change.

This is what happens with antidepressants. The antidepressant increases serotonin levels (ie sends a louder signal). But if the presynaptic 5-HT1A autoreceptors are intact, they tell the cells that the signal is too loud and they should release less serotonin. So they do, and now we’re back where we started – ie depressed.

(Why don’t the autoreceptors notice the original problem – that you have too little serotonin and are depressed – and work their magic there? Not sure. Maybe depression affects whatever sets the autoreceptors, and causes them to be set too low? Maybe the problem is with reception, not transmission? Maybe you have the right amount of serotonin, but you want excessive amounts of serotonin because that would fix a problem somewhere else? Maybe you should have paid extra for the premium model of presynaptic autoreceptor?)

So how come antidepressants work after a month? The only explanation I’ve heard is that the autoreceptors get “saturated”, which is a pretty nonspecific term. I think it means that there’s a second negative feedback loop controlling the first negative feedback loop – the cell notices there is way more autoreceptor activity than expected and assumes it is producing too many autoreceptors. Over the course of a month, it stops producing more autoreceptors, the existing autoreceptors gradually degrade, and presynaptic autoreception stops being a problem. In our salesperson analogy, after a while you notice that your audiometer doesn’t match your own perception of how much effort you’re putting into speaking – no matter how quietly you feel like you’re whispering, the audiometer just keeps saying you’re yelling very loud. Eventually you declare it defective, throw it out, and just speak naturally. Now the larynx-stimulating medication can make your voice louder.

In the late ’90s, some scientists wondered – what happens if you just block the 5-HT1A autoreceptors directly? At the very least, seems like you could make antidepressants work a month faster. Best case scenario, you can make antidepressants work better. Maybe after a month, the cells have lost some confidence in the autoreceptors, but they’re still keeping serotonin somewhere below the natural amount based on the anomalous autoreceptor reading. So maybe blocking the autoreceptors would mean a faster, better, antidepressant.
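
To make the feedback logic concrete, here’s a deliberately crude toy model – every constant is invented, and none of this is real pharmacology – of why an SSRI does nothing while the autoreceptors are intact, starts working as they desensitize, and would work immediately if they were blocked:

```python
# Toy model of the salesperson/audiometer story. All constants invented;
# this is a cartoon of the feedback logic, not real pharmacokinetics.
def simulate(days, ssri_boost=2.0, autoreceptor_blocked=False):
    release = 1.0      # how hard the neuron "speaks"
    sensitivity = 1.0  # how much the cell still trusts its autoreceptors
    setpoint = 1.0     # the serotonin signal homeostasis aims for
    signal = []
    for _ in range(days):
        synaptic = release * ssri_boost              # the SSRI amplifies whatever is released
        feedback = 0.0
        if not autoreceptor_blocked:
            # Autoreceptor: the signal reads too loud, so release less.
            feedback = 0.5 * sensitivity * (synaptic - setpoint)
        release += 0.2 * (1.0 - release) - feedback  # drift back toward baseline, minus feedback
        sensitivity *= 0.9                           # chronic overstimulation slowly desensitizes
        signal.append(release * ssri_boost)
    return signal

plain_ssri = simulate(30)
ssri_plus_blocker = simulate(30, autoreceptor_blocked=True)
print(round(plain_ssri[0], 2), round(plain_ssri[-1], 2))  # ~1.0 on day 1, climbing toward ~1.8 by day 30
print(round(ssri_plus_blocker[0], 2))                     # ~2.0 immediately
```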

Luckily we already knew a chemical that could block these – pindolol. Pindolol is a blood-pressure-lowering medication, but by coincidence it also makes it into the brain and blocks this particular autoreceptor involved in serotonin homeostasis. So a few people started giving pindolol along with antidepressants. What happened? According to some small unconvincing systematic reviews, it did seem to kind of help make antidepressants work faster, but according to some other small unconvincing meta-analyses it probably didn’t make them work any better. It just made antidepressants go from taking about four weeks to work, to taking one or two weeks to work. It also gave patients dizziness, drowsiness, weakness, and all the other things you would expect from giving a blood-pressure-lowering medication to people with normal blood pressure. So people decided it probably wasn’t worth it.

III.

And then there’s buspirone.

Buspirone is generally considered a 5-HT1A agonist. The reality is a little more complicated; it’s a full agonist of presynaptic autoreceptors, and a partial agonist of postsynaptic 5-HT1A receptors (uh, imagine the customer on the other side of the teleconference also has an audiometer).

Buspirone has a weak anti-anxiety effect. It may also weakly increase sex drive, which is a nice contrast to SSRIs, which decrease (some would say “destroy”) sex drive. Flibanserin, a similar drug, is FDA-approved as a sex drive enhancer.

Buspirone stimulates presynaptic autoreceptors, which should cause cells to release less serotonin. Since high serotonin levels (eg with SSRIs) decrease sex drive, it makes sense that buspirone should increase sex drive. So far, so good.

But why is buspirone anxiolytic? I have just read a dozen papers purporting to address this question, and they might as well have been written in Chinese for all I was able to get from them. Some of them just say that decreasing serotonin levels decreases anxiety, which would probably come as a surprise to anyone on SSRIs, anyone who does tryptophan depletion studies, anyone who measures serotonin metabolites in the spinal fluid, etc. Obviously it must be more complicated than this. But how? I can’t find any explanation.

A few books and papers take a completely different tack and argue that buspirone just does the same thing SSRIs do – desensitize the presynaptic autoreceptors until the cells ignore them – and then has its antianxiety action through stimulating postsynaptic receptors. But if it’s doing the same thing as SSRIs, how come it has the opposite effect on sex drive? And how come there are some studies suggesting that it’s helpful to add buspirone onto SSRIs? Why isn’t that just doubling up on the same thing?

I am deeply grateful to SSC commenter Scchm for presenting an argument that this is all wrong, and buspirone acts on D4 receptors, both in its anxiolytic and pro-sexual effects. This would neatly resolve the issues above. But then how come nobody else mentions this? How come everyone else seems to think buspirone makes sense, and writes whole papers about it without using the sentence “what the hell all of this is crazy”?

And here’s one more mystery: after the pindolol studies, everyone just sort of started assuming buspirone would work the same way pindolol did. This doesn’t really make sense pharmacologically – pindolol is an antagonist at presynaptic 5-HT1A receptors; buspirone is an agonist of same. And it doesn’t actually work in real life – someone did a study, and the study found it didn’t work. Still, and I have no explanation for this, people got excited about this possibility. If buspirone could work like pindolol, then we would have a chemical that made antidepressants work faster, and treated anxiety, and reduced sexual dysfunction.

And here’s one more mystery – okay, you have unrealistically high expectations for buspirone, fine, give people buspirone along with their SSRI. Lots of psychiatrists do this, it’s not really my thing, but it’s not a bad idea. But instead, the people making this argument became obsessed with the idea of finding a single chemical that combined SSRI-like activity with buspirone-like activity. A dual serotonin-transporter-inhibitor and 5-HT1A partial agonist became a pharmacological holy grail.

IV.

After a lot of people in lab coats poured things from one test tube to another, Merck announced they had found such a chemical, which they called vilazodone (Viibryd®).

Vilazodone is an SSRI and 5-HT1A partial agonist. I can’t find how its exact partial agonist profile differs from buspirone, except that it’s more of a postsynaptic agonist, whereas buspirone is more of a postsynaptic antagonist. I don’t know if this makes a difference.

The FDA approved vilazodone to treat depression, so it “works” in that sense. But does its high-tech promise pan out? Is it really faster-acting, anxiety-busting, and less likely to cause sexual side effects?

On a chemical level, things look promising. This study finds that vilazodone elevates serotonin faster and higher than Prozac does in mice. And this is kind of grim, but toxicologists have noticed that vilazodone overdoses are much more likely to produce serotonin toxicity than Prozac overdoses, which fits what you would expect if vilazodone successfully breaks the negative feedback system that keeps serotonin in a normal range.

On a clinical level, maybe not. Proponents of vilazodone got excited about a study where vilazodone showed effects as early as week two. But “SSRIs take four weeks to work” is a rule of thumb, not a natural law. You always get a couple of people who get some effect early on, and if your study population is big enough, that’ll show up as a positive result. So you need to compare vilazodone to an SSRI directly. The only group I know who tried this, Matthews et al, found no difference – in fact, vilazodone was nonsignificantly slower (relevant figure). There’s no sign of vilazodone working any better either.

What about sexual side effects? Vilazodone does better than SSRIs in rats, but whatever. There’s supposedly a human study – from the same Matthews et al team as above – but sexual side effects were so rare in all groups that it’s hard to draw any conclusions. This is bizarre – they had a thousand patients, and only 15 reported decreased libido (and no more in the treatment groups than the placebo group). Maybe God just hates antidepressant studies and makes sure they never find anything, and this is just as true when you’re studying side effects as it is when you’re studying efficacy.

Clayton et al do their own study of vilazodone’s sexual side effects. They find that overall vilazodone improves sexual function over placebo, probably because they used a scale that was very sensitive to the kind of bad sexual function you get when you’re depressed, and not as sensitive to the kind you get on antidepressants. But they did measure how many people who didn’t start out with sexual dysfunction got it during the trial, and this number was 1% of the placebo group and 8% of the vilazodone group. How many people would have gotten dysfunction on an SSRI? We don’t know because they didn’t include an active comparator. Usually I expect about 30 – 50% of people to get sexual side effects on SSRIs, but that’s based on me asking them and not on whatever strict criteria they use for studies. Remember, the Matthews study was able to find only 1.5% of people getting sexual side effects! So we shouldn’t even try to estimate how this compares. All we can say is that vilazodone definitely doesn’t have no sexual side effects.

I can’t find any studies evaluating vilazodone vs. anything else for anxiety, but I also can’t find any patients saying vilazodone treated their anxiety especially well.

There’s really only one clear and undeniable difference between vilazodone and ordinary SSRIs, which is that vilazodone costs $290 a month, whereas other SSRIs cost somewhere in the single digits (Lexapro costs $7.31). If you’re paying for vilazodone, you can take comfort in knowing your money helped fund a pretty cool research program that had some interesting science behind it. But I’m not sure it actually panned out.

V.

Encouraged by Merck’s success…

(not necessarily clinical success, success at getting people to pay $290 a month for an antidepressant)

…Takeda and Lundbeck announced their own antidepressant with 5-HT1A partial agonist action, vortioxetine. They originally gave it the trade name Brintellix®, but upon its US release people kept confusing it with the unrelated medication Brilinta®, so Takeda/Lundbeck agreed to change the name to Trintellix® for the American market.

Vortioxetine claimed to have an advantage over its competitor vilazodone, in that it also antagonized 5-HT3 receptors. 5-HT3 receptors are weird. They’re the only ion channel based serotonin receptors, and they’re not especially involved in mood or anxiety. They do only one thing, and they do it well: they make you really nauseous. If you’ve ever felt nauseous on an SSRI, 5-HT3 agonism is why. And if you’ve ever taken Zofran (ondansetron) for nausea, you’ve benefitted from its 5-HT3 antagonism. Most antidepressants potentially cause nausea; since vortioxetine also treats nausea, presumably you break even and are no more nauseous than you were before taking it. Also, there are complicated theoretical reasons to believe maybe 5-HT3 antagonism is kind of like 5-HT1A antagonism in that it speeds the antidepressant response.

After this Takeda and Lundbeck kind of just went crazy, claiming effects on more and more serotonin receptors. It’s a 5-HT7 antagonist! (what is 5-HT7? No psychiatrist had ever given a second’s thought to this receptor before vortioxetine came out, but apparently it…exists to make your cognition worse, so that blocking it makes your cognition better again?) It’s a 5-HT1B partial agonist! (what is 5-HT1B? Apparently a useful potential depression target, according to a half-Japanese, half-Scandinavian team, who report no conflict of interest even though vortioxetine is being sold by a consortium of a Japanese pharma company and a Scandinavian pharma company). It’s a 5-HT1D antagonist! (really? There are four different kinds of 5-HT1 receptor? Are you sure you’re not just making things up now?)

If we take all of this seriously, vortioxetine is an SSRI with faster mechanism of action, fewer sexual side effects, additional anti-anxiety effect, additional anti-nausea effect, plus it gives you better cognition (technically “relieves the cognitive symptoms of depression”). Is any of this at all true?

A meta-analysis of 12 studies finds vortioxetine has a statistically significant but pathetic effect size of 0.2 against depression, which is about average for antidepressants. In a few head-to-head comparisons with SNRIs (similar to SSRIs), vortioxetine treats depression about equally well. Patients are more likely to stop the SNRIs because of side effects than to stop the vortioxetine, but SNRIs probably have more side effects than SSRIs, so unclear if vortioxetine is better than those. Wagner et al are able to find a study comparing vortioxetine to the SSRI Paxil; they work about equally well.
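
For a sense of what an effect size of 0.2 buys you, one standard translation is the “probability of superiority”: the chance that a randomly chosen treated patient improves more than a randomly chosen placebo patient, assuming roughly normal outcomes. This is just the textbook conversion, nothing specific to these studies:

```python
# Convert Cohen's d to the probability that a random treated patient
# improves more than a random placebo patient (assumes normal outcomes).
from math import erf, sqrt

def probability_of_superiority(d):
    z = d / sqrt(2)
    return 0.5 * (1 + erf(z / sqrt(2)))   # standard normal CDF evaluated at z

print(round(probability_of_superiority(0.2), 3))  # ~0.556 -- barely better than a coin flip
print(round(probability_of_superiority(0.8), 3))  # ~0.714 -- a conventionally "large" effect, for contrast
```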

What about the other claims? Weirdly, vortioxetine patients have more nausea and vomiting than venlafaxine patients, although it’s not significant. Other studies confirm nausea is a pretty serious vortioxetine side effect. I have no explanation for this. Antagonizing 5-HT3 receptors does one thing – treats nausea and vomiting! – and vortioxetine definitely does this. It must be hitting some other unknown receptor really hard, so hard that the 5-HT3 antagonism doesn’t counterbalance it. Either that, or it’s the thing where God hates antidepressants again.

What about sexual dysfunction? Jacobsen et al find that patients have slightly (but statistically significantly) less sexual dysfunction on vortioxetine than on escitalopram. But the study was done by Takeda, and the difference is so slight (a change of 8.8 points on a 60 point scale, vs. a change of 6.6 points) that it’s hard to take it very seriously.

What about cognition? I was sure this was fake, but it seems to have more evidence behind it than anything else. Carlat Report (paywalled) thinks it might be legit, based on a series of (Takeda-sponsored) studies of performance on the Digit Symbol Substitution Test. People on vortioxetine consistently did better on this test than people on placebo or duloxetine. And it wasn’t just that being not-depressed helps you try harder; they did some complicated statistics and found that vortioxetine’s test-score-improving effect was independent of its antidepressant effect (and seems to work at lower doses). What’s the catch? The improvement was pretty minimal, and only shows up on this one test – various other cognitive tests are unaffected. So it’s probably doing something measurable, but it’s not going to give you a leg up on the SAT. The FDA seriously considered approving it as indicated for helping cognition, but eventually decided against it on the grounds that if they approved it, people would think it was useful in real life, whereas all we know is that it’s useful on this one kind of hokey test. Still, that’s one hokey test more than vilazodone was ever able to show for itself.

In summary, vortioxetine probably treats depression about as well as any other antidepressant, but makes you slightly more nauseous, may (if you really trust pharma company studies) give you slightly fewer sexual side effects, and may improve your performance on the Digit Symbol Substitution Test. It also costs $375 a month (Lexapro still costs $7). If you want to pay $368 extra to be a little more nauseous and substitute digits for symbols a little faster, this is definitely the drug for you.

VI.

In conclusion, big pharma spent about ten years seeing if combining 5-HT1A partial agonism with SSRI antidepressants led to any benefits. In the end, it didn’t, unless you count benefits to big pharma’s bottom line.

I’m a little baffled, because pharma companies generally don’t waste money researching drugs unless they have very good theoretical reasons to think they’ll work. But I can’t make heads or tails of the theoretical case for 5-HT1A partial agonists for depression.

For one thing, there’s a pretty strong argument that buspirone exerts its effects via dopamine rather than serotonin – a case that it seems like nobody, including the pharma companies, is even slightly aware of. If this were true, the whole project would have been doomed from the beginning. What happened here?

For another, I still don’t get the supposed model for how buspirone even could exert its effects through 5-HT1A. Does it increase or decrease serotonergic transmission? Does it desensitize presynaptic autoreceptors the same way SSRIs do, or do something else? I can’t figure out a combination of answers to this question that are consistent with each other and with the known effects of these drugs. Is there one?

For another, the case seems to have been premised on the idea that buspirone (a presynaptic 5-HT1A agonist) would work the same as pindolol (a presynaptic 5-HT1A antagonist), even after studies showed that it didn’t. And then it combined that with an assumption that it was better to spend hundreds of millions of dollars discovering a drug that combined SSRI and buspirone-like effects, rather than just giving someone a pill of SSRI powder mixed with buspirone powder. Why?

I would be grateful if some friendly pharmacologist reading this were to comment with their take on these questions. This is supposed to be my area of expertise, and I have to admit I am stumped.


Open Thread 156

This is the biweekly visible open thread (there are also hidden open threads twice a week you can reach through the Open Thread tab on the top of the page). Post about anything you want, but please try to avoid hot-button political and social topics. You can also talk at the SSC subreddit – and also check out the SSC Podcast. Also:

1. Comment of the week: superkamiokande from the subreddit explains the structural and computational differences between Wernicke’s and Broca’s areas.

2. There’s another SSC virtual meetup next week, guest speaker Robin Hanson. More information here.

3. As many areas reopen, local groups will have to decide whether or not to restart in-person meetups. I can’t speak to other countries that may have things more under control, but in the US context, I am against this. Just because it’s legal to hold medium-sized gatherings now doesn’t mean it’s a good idea. I would feel really bad if anyone became sick or spread the pandemic because of my blog. I don’t control local groups, and they can do what they want, but I won’t be advertising meetups on the blogroll until I feel like they’re safe. Exceptions for East Asia, New Zealand, and anyone else who can convince me that their country is in the clear.

4. Some people have noticed that my toxoplasma post seems disconfirmed by recent protests, which reached national scale even though the incident was very clear-cut and uncontroversial. I agree this is some negative evidence. The toxoplasma model was meant to describe a tendency, not a 100% claim about how things always work. Certainly it is still mysterious in general why some outrageous incidents spark protests and other near-identical ones don’t. I think it’s relevant that everyone is in a bad place right now because of coronavirus (remember, just two months ago Marginal Revolution posted When Will The Riots Begin?), and that 2020 is the peak of Turchin’s fifty-year cycle of conflict.

5. Speaking of protests, the open threads have been getting pretty intense lately. I realize some awful stuff has been going on, and emotions are really high, but I want everyone to take a deep breath and try to calm down a little bit before saying anything you’ll regret later. I will be enforcing the usually-poorly-enforced ban on culture war topics in this thread with unrecorded deletions. I may or may not suspend the next one or two hidden threads to give everyone a chance to calm down. I hope everybody is staying safe and sane during these difficult times.

6. If you haven’t already taken last week’s nootropics survey, and you are an experienced user of nootropics, you can take it now.


Wordy Wernicke’s

There are two major brain areas involved in language. To oversimplify, Wernicke’s area in the superior temporal gyrus handles meaning; Broca’s area in the inferior frontal gyrus handles structure and flow.

If a stroke or other brain injury damages Broca’s area but leaves Wernicke’s area intact, you get language which is meaningful, but not very structured or fluid. You sound like a caveman: “Want food!”

If it damages Wernicke’s area but leaves Broca’s area intact, you get speech which has normal structure and flow, but is meaningless. I’d read about this pattern in books, but I still wasn’t prepared the first time I saw a video of a Wernicke’s aphasia patient (source).

During yesterday’s discussion of GPT-3, a commenter mentioned how alien it felt to watch something use language perfectly without quite making sense. I agree it’s eerie, but it isn’t some kind of inhuman robot weirdness. Any one of us is a railroad-spike-through-the-head away from doing the same.

Does this teach us anything useful about GPT-3 or neural networks? I lean towards no. GPT-3 already makes more sense than a Wernicke’s aphasiac. Whatever it’s doing is on a higher level than the Broca’s/Wernicke’s dichotomy. Still, it would be interesting to learn what kind of computational considerations caused the split, and whether there’s any microstructural difference in the areas that reflects it. I don’t know enough neuroscience to have an educated opinion on this.


The Obligatory GPT-3 Post

I.

I would be failing my brand if I didn’t write something about GPT-3, but I’m not an expert and discussion is still in its early stages. Consider this a summary of some of the interesting questions I’ve heard posed elsewhere, especially comments by gwern and nostalgebraist. Both of them are smart people who I broadly trust on AI issues, and both have done great work with GPT-2. Gwern has gotten it to write poetry, compose music, and even sort of play some chess; nostalgebraist has created nostalgebraist-autoresponder (a Tumblr written by GPT-2 trained on nostalgebraist’s own Tumblr output). Both of them disagree pretty strongly on the implications of GPT-3. I don’t know enough to resolve that disagreement, so this will be a kind of incoherent post, and hopefully stimulate some more productive comments. So:

OpenAI has released a new paper, Language Models Are Few-Shot Learners, introducing GPT-3, the successor to the wildly-successful language-processing AI GPT-2.

GPT-3 doesn’t have any revolutionary new advances over its predecessor. It’s just much bigger. GPT-2 had 1.5 billion parameters. GPT-3 has 175 billion. The researchers involved are very open about how it’s the same thing but bigger. Their research goal was to test how GPT-like neural networks scale.

Before we get into the weeds, let’s get a quick gestalt impression of how GPT-3 does compared to GPT-2.

Here’s a sample of GPT-2 trying to write an article:

PROMPT: MILEY CYRUS WAS CAUGHT SHOPLIFTING FROM ABERCROMBIE AND FITCH ON HOLLYWOOD BOULEVARD TODAY

The 19-year-old singer was caught on camera being escorted out of the store by security guards.

The singer was wearing a black hoodie with the label ‘Blurred Lines’ on the front and ‘Fashion Police’ on the back

The singer was also wearing a pair of black-rimmed glasses, a black jacket, black jeans and black sandals.

She was carrying a pair of black and white striped gloves and a small black bag.

And here’s GPT-3 at the same task:

PROMPT: UNITED METHODISTS AGREE TO HISTORIC SPLIT – THOSE WHO OPPOSE GAY MARRIAGE WILL FORM THEIR OWN DENOMINATION

After two days of intense debate, the United Methodist Church has agreed to a historic split – one that is expected to end in the creation of a new denomination, one that will be “theologically and socially conservative,” according to the Washington Post. The majority of delegates attending the church’s annual General Conference in May voted to strengthen a ban on the ordination of LGBTQ clergy and to write new rules that will “discipline” clergy who officiate at same-sex weddings. But those who opposed these measures have a new plan: They say they will form a separate denomination by 2020, calling their church the Christian Methodist denomination.

The Post notes that the denomination, which claims 12.5 million members, was in the early 20th century the “largest Protestant denomination in the US”, but that it has been shrinking in recent decades. The new split will be the second in the church’s history. The first occurred in 1968, when roughly 10% of the denomination left to form the Evangelical United Brethren Church. The Post notes that the proposed split “comes at a critical time for the church, which has been losing members for years,” which has been “pushed toward the brink of a schism over the role of LGBTQ people in the church.” Gay marriage is not the only issue that has divided the church. In 2016, the denomination was split over ordination of transgender clergy, with the North Pacific regional conference voting to ban them from serving as clergy, and the South Pacific regional conference voting to allow them.

In case you’re wondering, no, the Methodist Church did not really split from the Evangelical United Brethren Church in 1968, it merged with it in that year. GPT-3 is doing what I used to do on essay questions – throw out a bunch of half-remembered names and dates and hope nobody looks too closely at the exact relations.

These kinds of comparisons are inherently fuzzy, since both GPTs have produced a lot of material and it’s easy to cheat by taking better or worse samples of one or the other. You can see more GPT-2 samples here and more GPT-3 samples here.

Or compare some of GPT-2’s poetry:

The gods are they who came to earth
And set the seas ablaze with gold.
There is a breeze upon the sea,
A sea of summer in its folds,
A salt, enchanted breeze that mocks
The scents of life, from far away
Comes slumbrous, sad, and quaint, and quaint.
The mother of the gods, that day,
With mortal feet and sweet voice speaks,
And smiles, and speaks to men: “My Sweet,
I shall not weary of thy pain.”

…Let me drink of the wine of pain
And think upon the agonies of hope,
And of the blessed Giver of all good things;
For, man for man, mine is the deepest love
That sorrow takes upon the humblest soul;
But who hath learned how sorrow turns to gall
The places where my feet have trod before.

…to some of GPT-3’s:

I must have shadows on the way
If I am to walk I must have
Each step taken slowly and alone
To have it ready made

And I must think in lines of grey
Must have dim thoughts to be my guide
Must look on blue and green
And never let my eye forget
That color is my friend
And purple must surround me too

The yellow of the sun is more more
Intusive than the bluish snow
That falls on all of us. I must have
Grey thoughts and blue thoughts walk with me
If I am to go away at all.

Ignore the style differences – the GPT-2 instance was trained on 18th-century verse, GPT-3 on Wallace Stevens. I notice that the first poem, while flowery, is kind of incoherent. The second has more of a structure and a theme that it maintains throughout. I’m not sure it really symbolizes anything interesting, but I’ve never been sure of that with real Wallace Stevens either, so good work.

(you can read more GPT-2 poetry here. There’s not much more public GPT-3 poetry – you can read one or two more samples in the paper linked above).

Enough writing. What about math? Last year, I posted the following excerpt from GPT-2:

PROMPT: MY 10 FAVORITE ANIMALS ARE:

My ten favorite animals are:
1. Zebras with a white scar on the back
2. Insiduous spiders and octopus
3. Frog with large leaves, hopefully black
4. Cockatiel with scales
5. Razorbill with wings hanging about 4 inches from one’s face and a heart tattoo on a frog
3. Cockatric interlocking tetrabods that can be blind, cut, and eaten raw:
4. Black and white desert crocodiles living in sunlight
5. Zebra and many other pea bugs

Despite the obvious flaws in this piece, I was impressed. GPT-2 was clearly trying to make a numbered list, and almost kind of getting it right! It counted to 4 successfully! Remember, this is a text prediction engine that didn’t necessarily need to have any concept of numbers. But it still kind of counted to 4! I wrote:

Imagine you prompted the model with “What is one plus one?” I actually don’t know how it would do on this problem. I’m guessing it would answer “two”, just because the question probably appeared a bunch of times in its training data.

Now imagine you prompted it with “What is four thousand and eight plus two thousand and six?” or some other long problem that probably didn’t occur exactly in its training data. I predict it would fail, because this model can’t count past five without making mistakes. But I imagine a very similar program, given a thousand times more training data and computational resources, would succeed. It would notice a pattern in sentences including the word “plus” or otherwise describing sums of numbers, it would figure out that pattern, and it would end up able to do simple math. I don’t think this is too much of a stretch given that GPT-2 learned to count to five and acronymize words and so on.

I said “a very similar program, given a thousand times more training data and computational resources, would succeed [at adding four digit numbers]”. Well, GPT-3 is a very similar program with a hundred times more computational resources, and…it can add four-digit numbers! At least sometimes, which is better than GPT-2’s “none of the time”.

II.

In fact, let’s take a closer look at GPT-3’s math performance.

The 1.3 billion parameter model, equivalent to GPT-2, could get two-digit addition problems right less than 5% of the time – little better than chance. But for whatever reason, once the model hit 13 billion parameters, its addition abilities improved to 60% – the equivalent of a D student. At 175 billion parameters, it gets an A+.
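
To make those numbers concrete, here is a minimal sketch of how you might measure this sort of accuracy yourself; complete() is a hypothetical stand-in for whatever call returns the model’s continuation, not a real API function.

import random

def complete(prompt: str) -> str:
    # Hypothetical hook: plug in whatever returns the model's continuation here.
    raise NotImplementedError("connect this to a language model of your choice")

def addition_accuracy(digits: int = 2, trials: int = 100) -> float:
    # Ask `trials` random addition problems and score exact matches.
    lo, hi = 10 ** (digits - 1), 10 ** digits - 1
    correct = 0
    for _ in range(trials):
        a, b = random.randint(lo, hi), random.randint(lo, hi)
        reply = complete(f"{a} + {b} = ").strip().split()
        if reply and reply[0].rstrip(".,") == str(a + b):
            correct += 1
    return correct / trials

# If the paper's numbers hold, addition_accuracy(digits=2) would come out around
# 0.6 for the 13-billion-parameter model and close to 1.0 for the full model.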

What does it mean for an AI to be able to do addition, but only inconsistently? For four digit numbers, but not five digit numbers? Doesn’t it either understand addition, or not?

Maybe it’s cheating? Maybe there were so many addition problems in its dataset that it just memorized all of them? I don’t think this is the answer. There are nearly a hundred million possible four-digit addition problems; it seems unlikely that GPT-3 saw that many of them. Also, if it were memorizing its training data, it should have gotten all of the roughly eight thousand possible two-digit multiplication problems right, but it only has about a 25% success rate on those. So it can’t be using a lookup table.
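
For what it’s worth, the counting behind that argument is just this (ordered pairs of strictly four-digit and strictly two-digit numbers):

four_digit_numbers = 9000   # 1000 through 9999
two_digit_numbers = 90      # 10 through 99

addition_problems = four_digit_numbers ** 2        # 81,000,000 -- "nearly a hundred million"
multiplication_problems = two_digit_numbers ** 2   # 8,100 -- small enough to memorize,
                                                   # yet GPT-3 only gets ~25% of them right
print(addition_problems, multiplication_problems)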

Maybe it’s having trouble locating addition rather than doing addition? (thanks to nostalgebraist for this framing). This sort of seems like the lesson of Table 3.9:

“Zero-shot” means you just type in “20 + 20 = ?”. “One-shot” means you give it an example first: “10 + 10 = 20. 20 + 20 = ?”. “Few-shot” means you give it as many examples as it can take. Even the largest and best model does only a mediocre job on the zero-shot task, but it does better one-shot and best of all few-shot. So it seems like if you remind it what addition is a couple of times before giving it an addition problem, it does better. This suggests that there is a working model of addition somewhere within the bowels of this 175 billion parameter monster, but it has a hard time drawing that model out for any particular task. You need to tell it “addition”, “we’re doing addition”, “come on now, do some addition!” up to fifty times before it will actually deploy its addition model on these problems, instead of some other model. Maybe if you did this five hundred or five thousand times, it would excel at the problems it can’t do now, like adding five-digit numbers. But why should this be so hard? The plus sign almost always means addition. “20 + 20 = ?” is not some inscrutable hieroglyphic text. It basically always means the same thing. Shouldn’t this be easy?
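
To make the three settings concrete, here is roughly what the prompts look like – a sketch of the format the paper describes, not the paper’s exact strings:

zero_shot = "20 + 20 = ?"

one_shot = (
    "10 + 10 = 20\n"
    "20 + 20 = ?"
)

def few_shot(examples, query):
    # Prepend worked examples so the model can "locate" the addition task.
    lines = [f"{a} + {b} = {a + b}" for a, b in examples]
    lines.append(f"{query[0]} + {query[1]} = ?")
    return "\n".join(lines)

print(few_shot([(10, 10), (25, 17), (48, 52)], (20, 20)))
# 10 + 10 = 20
# 25 + 17 = 42
# 48 + 52 = 100
# 20 + 20 = ?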

When I prompt GPT-2 with addition problems, the most common failure mode is getting an answer that isn’t a number. Often it’s a few paragraphs of text that look like they came from a math textbook. It feels like it’s been able to locate the problem as far as “you want the kind of thing in math textbooks”, but not as far as “you want the answer to the exact math problem you are giving me”. This is a surprising issue to have, but so far AIs have been nothing if not surprising. Imagine telling Marvin Minsky or someone that an AI smart enough to write decent poetry would not necessarily be smart enough to know that, when asked “325 + 504”, we wanted a numerical response!

Or maybe that’s not it. Maybe it has trouble getting math problems right consistently for the same reason I have trouble with this. In fact, GPT-3’s performance is very similar to mine. I can also add two-digit numbers in my head with near-100% accuracy, get worse as we go to three-digit numbers, and make no guarantees at all about four-digit ones. I also find multiplying two-digit numbers in my head much harder than adding those same numbers. What’s my excuse? Do I understand addition, or not? I used to assume my problems came from limited short-term memory, or from neural noise. But GPT-3 shouldn’t have either of those issues. Should I feel a deep kinship with GPT-3? Are we both minds heavily optimized for writing, forced by a cruel world to sometimes do math problems? I don’t know.

[EDIT: an alert reader points out that when GPT-3 fails at addition problems, it fails in human-like ways – for example, forgetting to carry a 1.]
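
For illustration (my own toy example, not a quoted GPT-3 output): column-by-column addition that drops every carry turns 48 + 17 into 55 instead of 65. A sketch of that failure mode:

def add_without_carrying(a: int, b: int) -> int:
    # Add two numbers digit by digit, discarding every carry --
    # the human-style mistake described above.
    result, place = 0, 1
    while a or b:
        digit = (a % 10 + b % 10) % 10   # keep the ones digit, drop the carry
        result += digit * place
        a, b, place = a // 10, b // 10, place * 10
    return result

print(add_without_carrying(48, 17))     # 55, instead of the correct 65
print(add_without_carrying(1234, 777))  # 1901, instead of the correct 2011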

III.

GPT-3 is, fundamentally, an attempt to investigate scaling laws in neural networks. That is, if you start with a good neural network, and make it ten times bigger, does it get smarter? How much smarter? Ten times smarter? Can you keep doing this forever until it’s infinitely smart or you run out of computers, whichever comes first?

So far the scaling looks logarithmic – each consistent multiplication of the parameter count produces a roughly consistent gain on the benchmarks.
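
If that trend really is log-linear, extrapolating it is just drawing a straight line in log-space. The constants below are made up purely for illustration – they are not fitted to anything in the paper:

import math

def benchmark_score(params: float, a: float = 10.0, b: float = 7.0) -> float:
    # Toy log-linear scaling curve: every 10x in parameters adds `b` points.
    # `a` and `b` are illustrative constants, not numbers from the GPT-3 paper.
    return a + b * math.log10(params)

for params in [1.3e9, 13e9, 175e9, 1.75e12]:
    print(f"{params:.2e} params -> {benchmark_score(params):.1f} (toy units)")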

Does that mean it really is all about model size? Should something even bigger than GPT-3 be better still, until eventually we have things that can do all of this stuff arbitrarily well without any new advances?

This is where my sources diverge. Gwern says yes, probably, and points to years of falsified predictions where people said that scaling might have worked so far, but definitely wouldn’t work past this point. Nostalgebraist says maybe not, and points to decreasing returns of GPT-3’s extra power on certain benchmarks (see Appendix H) and to this OpenAI paper, which he interprets as showing that scaling should break down somewhere around or just slightly past where GPT-3 is. If he’s right, GPT-3 might be around the best that you can do just by making GPT-like things bigger and bigger. He also points out that although GPT-3 is impressive as a general-purpose reasoner that has taught itself things without being specifically optimized to learn them, it’s often worse than task-specifically-trained AIs at various specific language tasks, so we shouldn’t get too excited about it being close to superintelligence or anything. I guess in retrospect this is obvious – it’s cool that it learned how to add four-digit numbers, but calculators have been around a long time and can add much longer numbers than that.

If the scaling laws don’t break down, what then?

GPT-3 is very big, but it’s not pushing the limits of how big an AI it’s possible to make. If someone rich and important like Google wanted to make a much bigger GPT, they could do it. I find that thought kind of terrifying.

Does “terrifying” sound weirdly alarmist here? I think the argument is something like this. In February, we watched as the number of US coronavirus cases went from 10ish to 50ish to 100ish over the space of a few weeks. We didn’t panic, because 100ish was still a very low number of coronavirus cases. In retrospect, we should have panicked, because the number was constantly increasing, showed no signs of stopping, and simple linear extrapolation suggested it would be somewhere scary very soon. After the number of coronavirus cases crossed 100,000 and 1,000,000 at exactly the time we could have predicted from the original curves, we all told ourselves we definitely wouldn’t be making that exact same mistake again.

It’s always possible that the next AI will be the one where the scaling curves break and it stops being easy to make AIs smarter just by giving them more computers. But unless something surprising like that saves us, we should assume GPT-like things will become much more powerful very quickly.

What would much more powerful GPT-like things look like? They can already write some forms of text at near-human level (in the paper linked above, the researchers asked humans to identify whether a given news article had been written by a human reporter or by GPT-3; the humans got it right 52% of the time, barely better than chance).

So one very conservative assumption would be that a smarter GPT would do better at various arcane language benchmarks, but otherwise not be much more interesting – once it can write text at a human level, that’s it.

Could it do more radical things like write proofs or generate scientific advances? After all, if you feed it thousands of proofs, and then prompt it with a theorem to be proven, that’s a text prediction task. If you feed it physics textbooks, and prompt it with “and the Theory of Everything is…”, that’s also a text prediction task. I realize these are wild conjectures, but the last time I made a wild conjecture, it was “maybe you can learn addition, because that’s a text prediction task” and that one came true within two years. But my guess is still that this won’t happen in a meaningful way anytime soon. GPT-3 is much better at writing coherent-sounding text than it is at any kind of logical reasoning; remember it still can’t add 5-digit numbers very well, get its Methodist history right, or consistently figure out that a plus sign means “add things”. Yes, it can do simple addition, but it has to use supercomputer-level resources to do so – it’s so inefficient that it’s hard to imagine even very large scaling getting it anywhere useful. At most, maybe a high-level GPT could write a plausible-sounding Theory Of Everything that uses physics terms in a vaguely coherent way, but that falls apart when a real physicist examines it.

Probably we can be pretty sure it won’t take over the world? I have a hard time figuring out how to turn world conquest into a text prediction task. It could probably imitate a human writing a plausible-sounding plan to take over the world, but it couldn’t implement such a plan (and would have no desire to do so).

For me the scary part isn’t the much larger GPT we’ll probably have in a few years. It’s the discovery that even very complicated AIs get smarter as they get bigger. If someone ever invented an AI that did do more than text prediction, it would have a pretty fast takeoff, going from toy to superintelligence in just a few years.

Speaking of which – can anything based on GPT-like principles ever produce superintelligent output? How would this happen? If it’s trying to mimic what a human can write, then no matter how intelligent it is “under the hood”, all that intelligence will only get applied to becoming better and better at predicting what kind of dumb stuff a normal-intelligence human would say. In a sense, solving the Theory of Everything would be a failure at its primary task. No human writer would end the sentence “the Theory of Everything is…” with anything other than “currently unknown and very hard to figure out”.

But if our own brains are also prediction engines, how do we ever create things smarter and better than the ones we grew up with? I can imagine scientific theories being part of our predictive model rather than an output of it – we use the theory of gravity to predict how things will fall. But what about new forms of art? What about thoughts that have never been thought before?

And how many parameters does the adult human brain have? The responsible answer is that brain function doesn’t map perfectly to neural net function, and even if it did we would have no idea how to even begin to make this calculation. The irresponsible answer is a hundred trillion. That’s a big number. But at the current rate of GPT progress, a GPT will have that same number of parameters somewhere between GPT-4 and GPT-5. Given the speed at which OpenAI works, that should happen about two years from now.
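
The back-of-the-envelope version, assuming the roughly-hundredfold jump from GPT-2 to GPT-3 keeps repeating (a big assumption):

gpt2_params = 1.5e9
gpt3_params = 175e9
growth_per_generation = gpt3_params / gpt2_params   # about 117x

gpt4_guess = gpt3_params * growth_per_generation    # ~2e13 parameters
gpt5_guess = gpt4_guess * growth_per_generation     # ~2.4e15 parameters

human_brain_parameters = 1e14   # the "irresponsible" hundred-trillion figure

print(gpt4_guess < human_brain_parameters < gpt5_guess)   # True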

I am definitely not predicting that a GPT with enough parameters will be able to do everything a human does. But I’m really interested to see what it can do. And we’ll find out soon.


Take The New Nootropics Survey

A few years ago I surveyed nootropics users about their experiences with different substances and posted the results here. Since then lots of new nootropics have come out, so I’m doing it again. If you have nootropics experience, please take The 2020 SSC Nootropics Survey. Expected completion time is ~15 minutes.

Thanks!
