Anxiety Sampler Kits

The best thing about personalized medicine is that it’s obviously right. The worst thing is we mostly have no idea how to do it. We know that different people respond to different treatments. But outside a few special cases like cancer, we don’t know how to predict which treatment will work for which person. Some psychiatric researchers claim they can do this at a high level; I think they’re wrong. For most treatments and most conditions, there’s no way to figure out whether a given sometimes-effective treatment will work on a given individual besides trying it and seeing.

This suggests that some chronic conditions might do best with a model centered around a controlled process of guess-and-check. When it’s safe and possible, we should be maximizing throughput – finding out how to test as many medications as we can in the short time before we exhaust our patients’ patience, and how to best assess the effects of each. The process of treating each individual should mirror the process of medicine in general, balancing the need to run controlled trials and gather more evidence with the need to move quickly.

I don’t know how seriously to take this idea, but I would like to try it.

Some friends and I made thirty of these Anxiety Sampler Kits, containing six common supplements with some level of scientific and anecdotal evidence for treating anxiety (thanks to Patreon donors for helping fund this). Each kit has 21 boxes: three nonconsecutive boxes of each supplement, plus three boxes of placebos. They’re randomly arranged and designed so that you can’t tell which ones are which – I even put some of the supplements into different colored capsules, so you can’t even be sure that two capsules that look different aren’t the same thing.
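
For anyone curious how a layout like this could be produced, here is a minimal sketch, assuming a simple shuffle-and-retry approach: shuffle the 21 labels and try again until no two neighboring boxes hold the same thing. The function name `arrange_kit` and the fixed seed are made up for illustration; this is not a description of how the real kits were arranged.

```python
import random

SUPPLEMENTS = ["magnesium", "5-HTP", "GABA", "Zembrin", "lemon balm", "l-theanine"]

def arrange_kit(rng, copies=3):
    """Return a shuffled list of box labels with no two identical neighbors.

    Hypothetical helper: reshuffles until the 'nonconsecutive' rule holds.
    """
    labels = [name for name in SUPPLEMENTS + ["placebo"] for _ in range(copies)]
    while True:
        rng.shuffle(labels)
        # Accept the layout only if every adjacent pair differs.
        if all(a != b for a, b in zip(labels, labels[1:])):
            return list(labels)

# A fixed seed only makes the example repeatable; a real kit would not use one.
kit = arrange_kit(random.Random(0))
```

Retrying on failure is wasteful in principle, but with only 21 boxes an acceptable layout turns up after a handful of shuffles.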

Each box contains enough supplement for one dose, and all supplements are supposed to work within an hour or so. Whenever you feel anxious, you try the first non-empty box remaining. Afterwards, you rate how you felt on the attached log (not pictured). When you’ve finished all twenty-one boxes, you fill out a form (link is on the attached paperwork) and figure out whether there was any supplement you consistently rated higher than the others, or whether any of them were better than placebo. If your three highest ratings all went to boxes which turned out to contain the same supplement, and it did much better than placebo, then you have a strong argument that this is the best anti-anxiety supplement for you.
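
The check described above can be sketched in a few lines. This assumes ratings where higher means better, and it assumes a cutoff (`margin`) for "much better than placebo", which the paperwork does not define as a number; the function name and the sample log are invented for illustration.

```python
from statistics import mean

def best_supplement(ratings, contents, margin=1.0):
    """ratings[i] is the score logged for box i; contents[i] is what box i held.

    Returns a supplement name if the three highest-rated boxes all held that
    supplement and its mean rating beats the placebo mean by at least `margin`
    (an assumed threshold); otherwise returns None.
    """
    ranked = sorted(range(len(ratings)), key=lambda i: ratings[i], reverse=True)
    top = {contents[i] for i in ranked[:3]}
    if len(top) != 1:
        return None
    winner = top.pop()
    if winner == "placebo":
        return None

    def avg(name):
        return mean(r for r, c in zip(ratings, contents) if c == name)

    return winner if avg(winner) - avg("placebo") >= margin else None

# Made-up, shortened log (a real kit has 21 boxes).
contents = ["GABA", "placebo", "magnesium", "GABA", "placebo", "GABA"]
ratings = [8, 3, 5, 9, 4, 7]
result = best_supplement(ratings, contents)
```

Ties among ratings are broken by box order here, which is arbitrary; a stricter version might refuse to name a winner when the third and fourth ratings are equal.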

(this setup isn’t quite as irresponsible as it sounds. The six supplements I’m using are all considered very safe. I’m not concealing which six supplements are in it – it’s magnesium, 5-HTP, GABA, Zembrin, lemon balm, and l-theanine – so you can check if you have allergies to any of them. And there’s a spoilers page available if you have a bad reaction and need to tell your doctor what caused it)

Also on the form is a link to send me your data, which I’m asking you to do as a condition for using the kits. I’ll add everything up and this will double as an n = 30 placebo-controlled trial of six different supplements. I don’t think n = 30 is enough to impress anybody, but it might be enough to get some informal hunches about what works and be able to give people better advice. And if the experiment goes well, I can always make more kits.
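
As a rough sketch of what "adding everything up" might look like, one simple option is to compute, for each returned kit, how far each supplement's average rating sits above that person's own placebo average, and then average those differences across people. This is only one assumed way to pool the data, and the function name and input format are hypothetical.

```python
from collections import defaultdict
from statistics import mean

def pooled_effects(participants):
    """participants: list of (ratings, contents) pairs, one per returned kit.

    Returns {supplement: average over participants of
             (that person's supplement mean - that person's placebo mean)}.
    """
    diffs = defaultdict(list)
    for ratings, contents in participants:
        by_name = defaultdict(list)
        for rating, name in zip(ratings, contents):
            by_name[name].append(rating)
        if "placebo" not in by_name:
            continue  # no baseline for this person, so skip them
        baseline = mean(by_name["placebo"])
        for name, values in by_name.items():
            if name != "placebo":
                diffs[name].append(mean(values) - baseline)
    return {name: mean(values) for name, values in diffs.items()}

# Two made-up, shortened logs.
effects = pooled_effects([
    ([6, 2, 4], ["GABA", "placebo", "magnesium"]),
    ([5, 3, 3], ["GABA", "placebo", "magnesium"]),
])
```

Comparing each person against their own placebo boxes keeps someone who rates everything high from swamping someone who rates everything low.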

If you live in the Bay Area, have enough anxiety that you expect to use a sample at least two days a week, and are okay with self-experimentation, these kits might be for you. Starting tonight I’m leaving a box full of them at the Rationality & Effective Altruism Community Hub, on the ground floor of 3045 Shattuck, Berkeley. REACH is usually open (or contains people who will open it if you knock) at all reasonable hours, and the caretaker there is aware that people might be coming in to get these kits. If you notice the box is out of kits, please comment here telling me so and I’ll add an update so people don’t waste their time. [EDIT: All out of kits, sorry! Once I have gotten results I might make a new batch.]

Remember that by taking a kit, you’re saying you expect to have anxiety that you’d be willing to experiment on at least twice a week (it’s okay if it doesn’t work out this way exactly) and you’re committing to – if you’re able to finish the test – sending me a form with your results. People who are pregnant or nursing, who have relevant preexisting medical conditions, or who are already taking potentially-interacting medications should talk to their doctor before trying these kits. I will not give you medical advice about whether these kits are safe for your specific situation, so please don’t ask. If you would be comfortable taking a random supplement you got off the shelf at Whole Foods, you should feel comfortable with everything in here.

I might take this idea further, but I’m going to wait until the first set of results comes in. If you are interested in taking this idea further, send me an email and let me know your thoughts.

  1. correlatedresiduals says:

    Any chance you’d consider distributing or shipping these outside the Bay Area? Or would you mind sharing the dosages of the supplements you cite?

    • Guy says:

      Yes please. Good DIY instructions before November 1st ==> 30% chance I will make at least one for my wife before 2019, 20% I will make more and make them available somehow in Seattle area.

    • Aevylmar says:

      I want to participate, but Berkeley is far. I strongly approve of this request, and wish to add that I would be willing to pay money to participate.

      (And, also, in general, really approve of this idea of Scott’s and hope it works.)

    • Scott Alexander says:

      This is Phase 1. If it’s very popular and nobody else steals the idea first, I could probably end up turning this into something more company-like, at which point I would work on being able to ship and distribute them.

      But if you don’t want to wait, the doses are (three numbers because sometimes the three samples included have different doses):

      5-HTP: 100, 100, 200
      GABA: 500, 750, 750
      L-theanine: 200, 400, 400
      Lemon balm: 600, 600, 1200
      Magnesium glycinate: 200, 450, 450
      Zembrin: 25, 25, 25

      Please contact me if you end up doing this so I can make sure we’re getting comparable data.

      • Aevylmar says:

        Any suggestions for how to blind yourself? I wouldn’t be inclined to think it would be easy, even if possible.

        • Error says:

          Gwern has done this. Google for “gwern blinding yourself” and check the first result. I tried to link the relevant section but the spam trap ate it.

          • Douglas Knight says:


            But how to blind myself? I used my pill maker to make 9 OO pills of piracetam mix, and then 9 OO pills of piracetam mix+the Adderall, then I put them in a baggy. The idea is that I can blind myself as to what pill I am taking that day since at the end of the day, I can just look in the baggy and see whether a placebo or Adderall pill is missing: the big capsules are transparent so I can see whether there is a crushed-up blue Adderall in the end or not. If there are fewer Adderall than placebo, I took an Adderall, and vice-versa. Now, since I am checking at the end of each day, I also need to remove or add the opposite pill to maintain the ratio and make it easy to check the next day; more importantly I need to replace or remove a pill, because otherwise the odds will be skewed and I will know how they are skewed. (Imagine I started with 4 Adderalls and 4 placebos, and then 3 days in a row I draw placebos but I don’t add or remove any pills; the next day, because most of the placebos have been used up, there’s only a small chance I will get a placebo…)
            This is only one of many ways to blind myself; for example, instead of using one bag, one could use two bags and instead blindly pick a bag to take a pill out of, balancing contents as before. (See also my Vitamin D and day modafinil trials.)
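
The single-bag procedure quoted above can be simulated as a minimal sketch. It models only the "remove the opposite pill" variant, and the function name is made up for illustration.

```python
import random

def run_blinded_trial(n_each, rng):
    """Simulate the one-bag method: equal numbers of active and placebo pills."""
    bag = ["active"] * n_each + ["placebo"] * n_each
    log = []
    while bag:
        # Blind draw: any pill left in the bag is equally likely.
        pill = bag.pop(rng.randrange(len(bag)))
        log.append(pill)
        # End of day: after checking which kind is missing, remove one pill
        # of the opposite kind so tomorrow's draw is 50/50 again.
        other = "placebo" if pill == "active" else "active"
        bag.remove(other)
    return log

log = run_blinded_trial(9, random.Random(1))
```

Because the bag is rebalanced every evening, each day's draw is an even coin flip, and a bag that starts with nine of each lasts nine days.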

        • Scott Alexander says:

          Get a 21 pill box and a friend. Ask the friend to look at this picture (WARNING: spoiler for the kit, please do not click here unless you are absolutely sure you will never use this kit or any future ones derived from this) and arrange it according to those boxes without telling you.

          The actual kit puts a little work into putting some pills into different capsules or getting different-looking types of the same pill so you can’t recognize them by shape, but if you just avoid memorizing what pills look like you should be mostly okay without that.

      • For an easier solution, could someone coming to our next meetup bring one or more kits to hand out to people at the meetup who are not near Berkeley?

  2. Peter Gerdes says:

    Whoa, I’m both so glad you did this and super surprised. Aren’t you terrified that the medical establishment will view your involvement as part of the practice of medicine, thus putting you on the hook for all the medical trial approvals that are needed? I mean, I’d have assumed that, even though we should prefer that trained medical professionals do this kind of thing, they’d likely be the ones who could face consequences from disciplinary boards.

    Initially I was going to suggest you ask for volunteers to both purchase and randomize supplements so you can increase n… but then it occurred to me that it would be easier for you just to accept pre-registration of anyone claiming to replicate this (e.g. let people email you once they are ready to start handing supplements out).

    • Scott Alexander says:

      These are supplements. You can buy pretty much all of them at Whole Foods.

      • Peter Gerdes says:

        I know they are and it SHOULD be fine but just because they are OTC doesn’t mean suggesting someone take them isn’t the practice of medicine no? I mean when you recommend a patient drink more water in your role as a doctor you are practicing medicine.

        Anyway I’m sure you know more about this than I do so I assume you know what you are doing.

        I guess I just assumed someone would have added some unreasonably broad ethics rules into the AMA bylaws (or whatever the appropriate organization is) governing any kind of human subject testing. Guess I may be too pessimistic.

        • Murphy says:

          “I mean when you recommend a patient drink more water in your role as a doctor you are practicing medicine.”

          You would be. In fact if you had an anorexic patient who was water loading and they took your advice to heart and died you could get in a great deal of trouble for advising “drink more water” as a doctor.

  3. Peter Gerdes says:

    Seems to me someone should be able to turn this into a business. You ship n different kinds of products to their address and ask them to strip identifying information and place m_1 mg, m_2 mg, … m_k mg of those products into some permutation of boxes numbered 1..k which they then ship to you. The website could even register descriptions of your experiences as having been submitted before you viewed the key.

    • Scott Alexander says:

      Or you just ship them the kits.

      • Peter Gerdes says:

        If you just wanted to do one kind of kit yes. I was thinking something that would accommodate any kind of blind testing of legal substances. Whether it is wine taste testing or crackers or whatever. This wasn’t a suggestion for you personally just a vague thought that it could be neat.

        I mean there are a lot of blind tests I’d like to do that involve other products but it’s a pretty big pain to get a friend to do it without accidentally letting on and that doesn’t let you expand it to more people easily. A website that did this generally could just aggregate all the people who bought randomization packet X and reported their results before requesting their key.

        I mean I’d like to see little Twitter memes buzzing around: ‘You think you know wine/cheese/etc., but can you pass the X blind taste test challenge?’

        • Erusian says:

          Services already exist that do this as a form of consulting. Many companies have internal divisions too.

        • The test I want to do is to see if I really prefer Coke Zero to Diet Coke to Diet Pepsi or only think I do. But I could set that up pretty easily myself if I really wanted to.

          Should probably include the caffeine free versions as well.

          • Lambert says:

            I think that kind of thing is highly contextual, in a way that is not easy to experimentally study.
            Like how Pepsi does better on blind taste tests of a 100ml sample, but worse for drinking a whole cup or two.

  4. baconbits9 says:

    I think this is just a really neat idea, and sounds like a solid implementation. Kudos.

  5. tcheasdfjkl says:

    this sounds really, really cool. I currently have a chill enough schedule that I’m not sure I’ll have anxiety twice a week in the next few weeks, so unfortunately I’m probably not the right person to test these, but if this thing works well, I’ll be very interested in trying it when I go back to work full-time (which is an environment I get a lot more anxiety in).

  6. Axiomatic Doubts says:

    How to increase your sample size: Be an empty individualist

    On a more serious note, that is a very great idea and I will very likely participate. I also have ideas on how to take it further (specifically the data collection aspect of it) which I may email you about.

  7. kaakitwitaasota says:

    I have anxiety that’s been acting up a bit lately (though a recent infusion of wheat bran into my diet seems to have cooled it down a bit, presumably due to the magnesium), but I’m in Mongolia (moved from Chengdu). I’d love to try, but I’m too far away.

  8. 天可汗 says:

    Are the brands known to be good? There was a pretty big fake supplement scandal a while back. (What brands are known to be good / have been widely tested and found to be selling what’s on the label instead of powdered rice?)

    • Douglas Knight says:

      That scandal was about herbal supplements, wasn’t it? With herbal supplements, it’s not even clear what’s supposed to be in the supplement, making testing difficult. I haven’t heard of such a scandal for specific chemicals, where testing should be easy. Of the 6 supplements Scott used, 4 are specific small molecules. Of the 2 herbal supplements, one is a specific brand, Zembrin, leaving only lemon balm of concern.

    • Scott Alexander says:

      I’m using Nootropics Depot, which I 100% trust, and NOW Foods, which I think is the biggest supplement brand. Both of them have been tested by and found to be legit.

      I am pretty skeptical of that big fake supplement scandal. I thought I heard that later some people checked and found that the test they did to find everything was fake constantly threw up false positives of things being fake and was totally inappropriate. Also, tests all the major supplement brands, and although occasionally one of them will get the dose off by 20% or something, not one of them has ever ended out completely fake in the way the scandal suggested that many or most supplements might be.

      I’m still not sure what to think of this, but pretty confident the brands I’m using are for real.

  9. b_jonas says:

    This is a really good idea. Obviously I won’t personally participate in it. But if a doctor offered me the chance to participate in an experiment organized as well as this one, but for a type of medication that I need, I’d gladly take the offer, and pay for the costs.

    I’d recommend one change. Could you consider printing the spoiler page, putting it into a sealed envelope, and giving it to those and only those patients who are certain they won’t open it unless it’s necessary? If I participated in that experiment, I would try to print the spoiler page and put it in a sealed envelope like that, because not having the spoiler with me would increase my anxiety. But if I printed it myself, I couldn’t avoid accidentally reading some of the spoilers. If instead you printed the spoilers, I’d gladly pay for your time and costs.


    RANT follows about my personal medical issues.

    On the other hand, the medications I’d currently need are slowly acting ones, for conditions where it’s much more difficult to evaluate which medication is effective, so it’s borderline impossible to make a proper experiment for it.

    I take allergy medication. I am allergic to mosquito bites, some unknown seasonal pollens (mostly late in the summer), and possibly also some unknown food. I’ve had three medical tests to try to directly figure out what I am allergic to, but besides mosquito bites and possibly wasp bites, there were no positive results. How severe my allergy symptoms are depends not only on the allergens I’m exposed to, but also on the general state of my health and immune system. Thus, even though I have consistently taken the same allergy medication for years (except during the winter), the strength of my allergy symptoms varies a lot from day to day, month to month, and even year to year. I’d pay for the expenses of a really well-designed experiment to find the best allergy medication for me, but such an experiment would have to span two to five years, and I’d only do it if there was some really simple way to evaluate my allergy symptoms that doesn’t take too much time or effort for me to consistently do for a long time.

    I also take hypertension medication. For this one, it’s completely impossible to figure out a good experiment. Obviously, just like my allergies, my blood pressure varies from hour to hour, day to day, month to month, and year to year. Unlike allergy, these changes are not connected to obvious external factors like the weather. Instead, they’re factors I’m deliberately trying to improve, as in, reducing my body weight, reducing my stress level in various ways, doing more sports and spending more time walking, eating more healthily. I clearly wouldn’t want to try the type of controlled self-experiment where I deliberately try to keep my hypertension at the same baseline, because that would mean keeping my general health worse. Improving my general health to reduce hypertension (and other health problems) is currently much more important in the long term than finding the right medication. Further, I also wouldn’t want to measure my blood pressure every day. I’d be very hard to convince to even measure it every week, even though I have a perfectly working automatic blood pressure meter at home. But without that, it’s impossible to properly evaluate how well a hypertension medication is working. So currently I’ll just go with whatever my GP doctor’s medical intuition tells him from the little data that’s available to him.

    One case where a fast-acting medication is necessary to me and there’s a choice of more than one is fever. I have a high fever once or twice each year, spanning a few days each time, so I take between eight and sixteen pills for fever each year. There are four types of fever medication available to me: Algopyrin, paracetamol, ibuprofen, and aspirin. Fever pills are fast-acting: I am allowed to take at most one pill each time, with a spacing of at least four hours between any two pills. I can also choose to take a cooling shower or not to reduce fever. Fever symptoms are easy to measure: I measure my body temperature with a mercury thermometer in my armpit immediately before taking the pill and again two hours after taking it, take the pill only if my body temperature is over 37.5 °C, and evaluate how the pill and shower reduced my body temperature. When I do have fever, I am willing to take pills and measure my body temperature often. Thus, there’s enough data available to measure how effective each type of fever pill is for me. I am measuring these symptoms. In fact, since I had fever significantly more often when I was a child, I already have consistent data about Algopyrin, paracetamol, and aspirin. The result is that Algopyrin is consistently more effective for reducing my fever than the other two types of pills, paracetamol is still more effective than aspirin or nothing, and aspirin is useless. I did not start considering ibuprofen as a fever medicine until a few years ago, so I have less data available, but I already know that it’s more effective than placebo. This data is useful, because now I choose to take Algopyrin when my fever is very high, because that’s when an effective fever pill matters more for my health than obtaining more data.

    But the problem is, except for the cases when my fever is very high, I don’t want to choose the fever medication that’s the most effective. Algopyrin, ibuprofen, and paracetamol are effective enough for those cases, and the small difference in their effect is not important. What does matter is the short and long term side effects of the pills, and how they interact with other medications. I have gathered a lot of non-personalized information about those side effects and interactions from doctors. There’s a very ugly tradeoff there. Paracetamol is said to be the most dangerous in long term side effects, because there’s a small risk that it damages my liver in a way that is impossible to detect in advance, but can cause my liver to suddenly stop working, which has a high chance of fatality. The other three medications won’t cause any such unexpected fatality, but they can accumulate some slight damage in my liver or kidney or stomach in the long term, which is the main reason to only take the medications when I actually have fever. Paracetamol does not interact at all with any other medications I take, as long as I keep the rule that I don’t take more than one fever pill or other NSAIDs within four hours. The other three medications, however, can weaken the effect of hypertension pills, and have various strange effects on other medications. Thus, if taking other medication is important, or I have already taken some, then I have an incentive to take paracetamol over other fever pills. As long as I’m not taking other medications, algopyrin has no other short-term side effects, but some of the others can upset my stomach in the short term, which is unfortunate because I often already have stomach problems when I have fever. My current strategy is to take paracetamol if I must because of other important medications, otherwise algopyrin if my fever is very high, otherwise randomly ibuprofen or algopyrin. This lets me gather more data about ibuprofen while avoiding other risks. This strategy may change in the future, especially if I find that my kidney or stomach already has too much long term damage, or other health problems.

  10. mondsemmel says:

    Re: Your Patreon: I’m not a patron, but IIRC you charged per post at some point but now charge monthly. However, your Patreon intro still contains this line: “I usually write about ten blog posts a month, but I won’t charge you for open threads and meetup announcements and such.”

  11. Murphy says:

    A few issues:

    1: are people supposed to take them in order each day? that design of pill box is easy to get out of order. maybe number the boxes 1-21.

    2: probably doesn’t distinguish well between individual things being effective and pairs being effective since some are likely still in your system from one day to the next.

    I get that you’ve had a bad experience with trying to do proper publishable research in the past but I’m inclined to repeat the comment I made about that re: if you want the results to actually be useful to anyone else.

    I don’t know if Scott or others will be interested in this but there’s someone I work with whose job is managing human research such as large drug trials worth tens to hundreds of millions of dollars. One of the things she does is go in and rescue large trials which are on the rocks. So figuring out the sources of problems and getting things back on track.

    I got chatting to her over a few drinks about regulation of research in different countries since she regularly manages international projects that have to deal with regulations in the US, Europe, UK and elsewhere.

    I brought up the outline of Scott’s old post “My IRB Nightmare” to get her opinion on it and she had some bullet points a bit different to most of the issues raised on here back when that topic was new.

    Mostly I’m paraphrasing but there’s a few direct quotes in there. (though those are also from memory so are only close-enough)

    1: Yes the US way of organizing IRBs is flawed and has been fixed in the UK. Having a local IRB whose members are themselves sometimes extremely inexperienced and rarely have to deal with any proposals leads to problems because the IRBs in places where little research is done often barely know what they’re doing or are supposed to be doing. It’s dramatically improved by having IRBs deal with a steady stream of proposals from outside their institution so that the board members are more experienced… and so that they don’t feel they should spend all day thinking of “helpful” suggestions for this one proposal because it’s the only one they have in front of them for the entire meeting.

    Her other points were less charitable.

    2: I remember that in one of Scott’s old posts he talked about “types” he encountered in his job constantly. Like grandmothers whose children won’t let them see their grandchildren, in order to extort money.

    Clinicians who sort of want to “dabble” in research, want to do it all themselves and who get overwhelmed and can’t push a project through are a *Type* she encounters constantly and mostly their research is not terribly well designed or useful.

    When I outlined Scott’s research as described in the post her opinion of it was “a fuckin waste of everyone’s time resources and money. he should have done it properly or not at all”

    To flesh that out, had Scott got his final results how useful would it have been to other clinicians?

    not very.

    Had Scott actually got the final results, they wouldn’t have been particularly generalisable. They would have told you that the results of Dr. W’s assessments did not line up with the results of this questionnaire… but not much else.

    Perhaps he has some systematic pattern in how he diagnoses people. Who knows. But by one way of looking at it your sample size could be described as “n=1. Dr W” because he was doing all the assessments.

    or one interjection of hers “contrary to the beliefs of many clinicians, their consultant is not God”

    3: almost every problem or delay Scott encountered had almost no relation to regulation. Almost every single thing was hurdles put in place by Scott’s own employer, the company that ran the hospital chain. In her opinion this may have been partly by design, because they don’t really want their doctors spending work hours and resources on research that doesn’t yield much profit for the organization. They may talk about wanting people to do research but what they actually mean is they want people to bring in grant income. If you’re not bringing in grant income they probably don’t really want you doing research.

    4: Her point was that in reality the actual real regulations that are actually in place are much much more minimalist and reasonable than most people believe.

    When she’s called in over a large failing project one of the conditions she sets is that the most senior person linked to the project must personally read the actual laws/regulations about doing the kind of clinical trial they’re doing. Not a summary, not someone’s opinion piece on it. The actual root text.

    They can take a couple days to do it but it’s entirely tractable to do so in “a couple of evenings”

    Because a common problem she encounters is Chinese whispers. People who are sure that the regulations say X must be done when in reality the real regs say that you should consider doing X if there is reason to do so, and that gets filtered through layers of people and you end up with someone insisting that there’s no choice, that X must be done because the rules say so.

    Often the fix for a failing project will be to force people to actually read the rules themselves, not delegate or hope someone else will do the work, then to submit a reasonably sane fix to the project to be approved that’s based on the real rules.

    Scott’s employers were apparently falling into this hole hard.

    Also, a suggested rule of thumb, if there’s something like that that people don’t seem to want to do within your organization, try to find someone who’s tried before and buy them lunch in exchange for asking them what screwed up their own past attempts.

    5: Scott talked about a single paper and doctors he worked with using the questionnaire clinically but did Scott ever do any kind of systematic literature review to show that it was being used generally, not just in a couple of hospitals?

    If it was being used regularly and Scott could prove that, did Scott do any kind of systematic review to show that there definitely wasn’t any more existing research actually showing the questionnaire to work?

    6: If Scott has in fact done both of those then, to quote, “why the fuck didn’t he contact a real academic in an institution specialized in research to get a real grant and a proper multi-center trial done so that any results would actually be useful to anyone”

    Also, some states have stricter rules for doing active-intervention human-subject research of any kind.

    That would likely include OTC supplements.

    You might want to make sure none of your participants are in any of those states or you could run afoul of the law.

    • baconbits9 says:

      This is a common problem with centralization: the experts at the center understand (to be charitable, and manipulate to be uncharitable) the system well enough to know the ins and outs and get to define the outputs. It is like the designer of a maze laughing at people who miss the trapdoor in the 2nd room that takes them down through an underground passage so they can skip half of the rooms. “If you had just spent a few hours reading the manual you would have noticed that I specifically said ‘it is not necessarily necessary to go through every room (although it might be) there are situations in which a careful consideration of all objects in a room may be fruitful’.”

      • Murphy says:

        You know how IT people get frustrated when people fail to follow The Algorithm?

        Bonus: when people blame “everything” being “broken” on some kind of conspiracy of people intentionally breaking their computer.

        People who have actually bothered to read the regs on doing drug trials tend to feel the same way.

        It’s literally a case of RTFM where TFM isn’t even terribly big. People will blunder around for many many many times longer than it would have taken to RTFM trying to avoid reading TFM and waste time, resources and money.

        If you spend a year lost and dying of thirst in a 20-yard-wide hedge maze when you were offered a map of the maze and books on effective maze-navigation algorithms and had the option to RTFM the whole time (but didn’t ever avail of it) and the option of going in with a guide… that doesn’t imply an evil conspiracy of guides.

        It makes you that old guy who types with 1 finger, blames whoever last touched his computer for “breaking it” whenever anything he doesn’t expect happens and calls IT to rant that “it’s broken again” when the screen is turned off.

        The individual I talked to has never worked for the regulator, they’re not deeply connected, on paper at least they even started off deeply unqualified for their job, they’re just the kind of person who religiously followed the creed of RTFM their whole life and applied it when they went into human research.

        But as in almost every job, reading the manual rather than just kinda hoping that someone else will do it for you is apparently some kind of superpower.

        • baconbits9 says:

          It is interesting that you use the term religious to describe this person, while also noting that things are guidelines and not hard rules. The original post basically took Scott (or his organization) to task for treating the guidelines too religiously. If 10% of people totally get the guidelines and 90% of people flounder, and the guidelines are written for the 90% of people, then the default is that the guidelines are the issue. When a small percentage of people claim that a map is super easy to follow they are almost always like minding.

          • Murphy says:

            90% of the people in an organization can be people who type with 1 finger. That doesn’t mean that when they download and run any email attachment that promises to let them see a celebrity’s tits… that IT is in the wrong because so many of their users ignored the weekly pleas saying “please don’t download and run email attachments, they’re probably viruses”.

          • baconbits9 says:

            Since when did anyone wanting to do research become part of an overreaching organization?

          • Gazeboist says:

            Would you prefer “callers to help centers”? “People who joined the IRC/listserve/etc to ask exactly one question”?

    • Douglas Knight says:

      In her opinion this may have been partly by design, because they don’t really want their doctors spending work hours and resources on research that doesn’t yield much profit for the organization. They may talk about wanting people to do research but what they actually mean is they want people to bring in grant income. If you’re not bringing in grant income they probably don’t really want you doing research.

      Yes, that’s the key, although people at research universities often have bad stories about IRB, though less bad ones.

      But that’s a pretty narrow conception of “actually be useful to anyone else.”

    • Scott Alexander says:

      > “Probably doesn’t distinguish well between individual things being effective and pairs being effective since some are likely still in your system from one day to the next.”

      I agree this could happen in practice, but I think it’s so rare for medications to significantly interact at low doses (like the one that would be in your body after a day) that it won’t matter in real life.

      Most of your friend’s argument seems to reduce down to “Well, if he spent a really long time learning all the rules, and got a grant, and did a much larger multi-center trial, and worked together with a bunch of compliance officers, things probably would have been fine”. I agree that given infinite work and energy things could have been fine. My complaint isn’t that literally no human being has ever successfully navigated an IRB, it’s that IRBs lock out small individual amateur researchers who have a hunch and want to check it out, leaving only professionals backed by giant institutions who want to make a career of it. I was required by my residency program to do research, with zero institutional support, and in addition to a fifty-hour workweek of clinical duties. “You should just hire a team, apply for a grant, and run a multi-center trial” isn’t really a solution to my problem.

      • Murphy says:

        You’ve got the wrong end of the stick.

        Put on the same hat you wear when thinking about effective altruism.

        Ultimately, what was your goal doing research?

        Was it to get a little feather in your own cap reading “has done research! Gold star!” or was it to generate utilions in the form of showing that a piece of standard practice is ineffective/misleading/harmful in an effective way such that other doctors will notice, change their practice such that patients benefit in the form of more accurate diagnoses and less wasted resources overall?

        This is as important a question as “what is your goal giving to charity”. If your goal is to help people vs your goal is to generate warm fuzzies for yourself that dictates different paths.

        It’s the science equivalent of someone deciding they’re gonna [Do Charity] to feed starving orphans then going off and registering a charity and collecting money and then carefully selecting fresh organic vegetables at their local store…. then being surprised that shipping them to Africa is actually pretty slow and by the time they get there all that’s left is rotten mush.

        The same applies when deciding how to approach research.

        There’s literally millions of crappy little research papers that generate exactly zero utilions for anyone. When you take a look at them, within a few minutes you find that their findings are useless.

        Bob had a hypothesis but Bob designed his study such that nobody reading it can really trust the results. Perhaps it was an intervention and he split the trial and control arms based on gut feeling rather than doing it as an RCT, perhaps everything hinged on some judgement call and Bob was the only one making that judgement call every time and amazingly, Bob, who was already of the belief that XYZ didn’t work… by his judgement call showed that XYZ didn’t work. etc etc etc.

        On net some of those papers may make the field worse because many have crappy methods that may mislead all 3 eventual readers and even the neutral ones just increase the noise level and make finding the informative papers harder.

        No doctor is changing their practice based on them, no patient is benefitting. Those papers did nothing but generate warm fuzzies for the writer.

        Best case scenario, had you powered through and completed your paper: what do you believe the effect would have been? Would doctors have seen it and said “I am doing this wrong!”, would anyone writing guidelines have looked at it and said “hmm, we should make sure to take this into account”

        No. Because N was effectively 1.

        Sometimes people take the position that it doesn’t matter if *this* paper is crap, it’s raising awareness or generating conversation or … you get the idea. Hoping that someone else will notice it and be inspired to do the hard work of doing it properly is the “when I’m a big movie star” or “when my app makes me a million on the app store” or “like and share” of research.

        There are pilot studies but the worthwhile ones are still done in methodologically sound ways and often with a big study planned right from the start depending on results of the pilot.

        You need to do it well if you want to generate utilions.

        There’s millions of ways to make a research paper into toilet paper, “researcher” is a profession for a reason and involves learning some of the pitfalls.

        While I agree that the world would be a better place if people, in general, took a more scientific view of the world…. that doesn’t automatically grant all research bonus utilions.

        Yes, it’s not reasonable to expect you to run such a trial as a student. That’s why you bring it to a specialist. And if you can’t get them interested in your research idea then joining someone else’s big project as Assessor 14 in center 8 for a big well-run trial that’s recruiting and being a small cog in a machine that actually matters can generate a lot more utilions than adding one more paper to the kilometer tall pile that already get auto-filtered out in the first round by those services which highlight important research papers to doctors that are worth the minutes it takes to read them.

        • Scott Alexander says:

          You’re acting as if we were arguing over whether bad research should be done, which is a totally different question from whether research should be done by people with few resources. You haven’t made an argument that being able to fill out two thousand pointless forms makes you a better researcher; almost none of the training or the forms are about how to do a study well, and almost all of them are about whether the study is “ethical”.

          I don’t especially want to spend time defending my study (though I want to register that I do have some defenses and disagree with your view of it), because even if my study were bad, that’s my fault personally and not very correlated with the fact that I’m a small researcher rather than a big-time one. Many small researchers do great work (including, as I mentioned in the post, me winning prizes for later research that did not involve an IRB). Many big-time researchers do terrible work. If small researchers are locked out of the scientific enterprise because of very high transaction costs, that hurts the good ones and bad ones alike.

          I actually think it’s much worse than this, because the researchers who are able to overcome the transaction costs are the ones with a big corporation behind them (eg a drug company) and researchers who can’t are more likely to be small individuals who notice that the drug company research is wrong or misguided and want to fight back against it. The IRB filter doesn’t lock out people who are going to do a bad job, it locks out people who can’t raise lots of money and get an institution behind them.

          I think it also contributes to all the problems you’re talking about. Make recruiting each subject into a nightmarish swamp, and you’ll use fewer subjects. Make adding each investigator require a few weeks worth of bureaucracy, and you won’t consult other people who could help you. Force people to use weird, low-tech recording methods, and you’ll record fewer variables. It’s been a long time and I don’t remember the exact situation, but plausibly the reason we didn’t have more doctors doing the diagnosing was because of the trouble convincing them to go through all the bureaucracy of being official study members.

          Your argument sounds kind of like “having a business-unfriendly climate doesn’t matter, because lots of businesses fail”. If you make people wait a year and pay tens of thousands of dollars to get a business license, you won’t prevent businesses from going bankrupt, you’ll just remove dynamism from the system and protect existing interests.

          • Murphy says:

            As mentioned in my top post: most of that insane bureaucracy wasn’t anything to do with the actual regs, most of it was your own employer’s Chinese whispers.

            Lots of small researchers have an easy time because they start off around people who know how to do things in an institution where it’s done regularly.

            How many days/weeks/months did you spend wrestling with it to avoid a couple of evenings reading the actual rules yourself?

            There was a guy who turned up on reddit-bitcoin a while back, he’d got a windfall suddenly from bitcoins and had decided to set up a company selling stuff.

            A year in he suddenly realised he needed to pay taxes and he’d neglected to track any transactions for sales tax or keep any receipts. He had no actual accounts kept. Was “bureaucracy” the cause of his failure? Or was it a failure to spend a few days before he started finding out the rules and what he needed to do?

            It ended much worse for him than your study ended for you.

          • Murphy says:

            Thinking on this some more: I need to re-stress. You had no actual contact with any government regulator.

            Imagine you were part of a weird little pseudo-Luddite community co-op type organization. Something like the Amish.

            Now imagine that you’d decided you wanted to sell some jam from fruit in your garden.

            You have no experience with running a small business or the rules and start off thinking that you’ll just keep track of your sales in an excel sheet and pay any taxes due at the end of the year. But you don’t know the actual law, you have no idea of the actual rules.

            The community/co-op is clear that since you’re part of the organization they don’t want any tax-fraud etc getting linked to their co-op so they assign Methuselah from the co-op to make sure you’re compliant with the law.

            Methuselah unfortunately also doesn’t know the law but in 1954 he was involved in a business with poor records that was audited and he distinctly remembers that the auditor at the time commented that there needed to be accounts with all transactions written down.

            So he rejects your excel spreadsheet. They’re not written like the auditor said they needed to be.

            You go away and quickly re-write it all.

            Methuselah unfortunately distinctly remembers that the auditor in 1954 complained about some of the records that were there not being fully legible.

            So Methuselah rejects your hand written accounts. He says the penmanship isn’t good enough. Methuselah happens to have a thing for calligraphy.

            You go away and re-write your years accounts, this time in beautiful calligraphy.


            Now Methuselah finally deigns to look at the actual content of your accounts.

            He notices you missed a receipt from February.

            Methuselah unfortunately distinctly remembers that the auditor in 1954 said it was good practice to have sales recorded clearly in order of date.

            He makes you re-write all accounts with new totals from February on.

            This goes on, back and forth and back and forth until you give up and decide to never again try selling jam and just post all your jam-money to the IRS with a note reading “just take it all, I’ve spent more hours than it’s worth trying to comply with tax law on this”. Because the evil regulations and unreasonable requirements of the auditors made it too hard to set up business. Really you think: the government is destroying business in this country.

            You go to a bar (your sect of pseudo-Amish frown on alcohol but the ordeal has driven you to drink) and gripe about what a pain it is complying with all the unreasonable demands from the government for selling goods.

            Other people at the bar who sell goods generally mumble in general agreement. Some because they had a run-in with Methuselah, some because they just don’t very much like keeping basic accounts.

            Meanwhile when an actual IRS inspector comes to audit some business that’s felt Methuselah’s touch they just roll their eyes and wonder why these weird pseudo-Amish people insist on keeping all accounts on paper in longhand calligraphy when a nice printoff from an excel sheet would be much easier to audit.

          • Nancy Lebovitz says:

            This situation of guessing at the law (and some of the guesses being made by people who are wrong but sure they’re right and have local power to enforce) is partly the result of big scary organizations with rules that are hard to learn.

            The rules are publicly available, but they’re buried in a lot of material that’s hard to focus on.

          • Murphy says:

            @Nancy Lebovitz

            Sure, I agree. I suspect that if there was more research there would be better basic knowledge of the real requirements just like how millions of corner shops manage to keep reasonably in line with basic tax and business law despite tax law being famously vastly more complex.

          • Scott Alexander says:

            I can’t remember what percent of my institution’s rules I read, but I think it was a pretty high percent. I don’t know why you keep assuming I didn’t read them and that all of my problems would have been solved if I had.

            I don’t know why you keep stressing that my IRB wasn’t the government. IRBs are never the government. They’re always specific institutions. Many people responding to my post said my IRB was typical of IRBs they’ve worked with as well, so it’s not like my institution was just some freak outlier in a world of perfectly reasonable IRBs.

            I think a better situation to compare it to would be this: imagine your local homeowners association doesn’t let you spray WD-40 on your creaky hinges without a $1000 impact assessment from a professional assessment company. You complain about this online, and everyone else says their homeowners association is also like this. Then someone comments “Well, that’s not an official US law, that’s just what some bad homeowners associations are like. Also, I bet you never read any of the relevant documentation. Lots of professional government contractors do a great job repairing things in compliance with all the relevant laws, because unlike you they read documentation. You’re probably crappy at repairing houses anyway. And also, have I mentioned a homeowners association is different from the federal government?”

          • Lambert says:

            >Before the law sits a gatekeeper. To this gatekeeper comes a man from the country who asks to gain entry into the law. But the gatekeeper says that he cannot grant him entry at the moment…

          • brmic says:

            I can’t speak to the amount of telephone involved, but I wholly agree with Murphy that the IRB nightmare study was terrible and think that 90% of the obstacles are perfectly legitimate from my POV. I certainly think that the ‘nightmare’ is 90% down to poor preparation, poor planning and poor understanding of the process. The other 10% is bureaucracy being unpleasant.

            I have difficulty remaining charitable when an unqualified goof complains about having to demonstrate knowledge of research ethics or having to have a PI or having to fill out 20 pages. Likewise, claims about the horrible burden of having to say no, we don’t remove organs on a form that is obviously intended to cover a wide range of study types strike me as overblown.
            Likewise, the study name needs to be on the consent form so that it’s clear and unambiguous later who signed what and neither patient nor researcher gets to claim they actually signed something different. (I’ve seen both happening.) Risks need to be on the form so that any unforeseen adverse event which occurs later can be clearly categorized as ‘warned about’ and ‘not warned about’ for legal reasons. Yes, this is superfluous for pen and paper, but as soon as e.g. you move to a computer based test, warning of the risk of epileptic seizures can be relevant. (And I have zero sympathy for the people who don’t warn about this because it might lower participation and at the same time refuse to do a thorough evaluation of the risk, essentially offloading the risk on to the participants/community.)
            Pen is preferred to pencil because the latter can more easily smudge or be erased, so it’s perfectly fine for the IRB to ask why an inferior method of record keeping should be preferred. I think the answer is also perfectly valid and the IRB’s subsequent refusal to accept that is wrong.

            Next, data storage. First up, understand that from the POV of the hospital, these data are useless and strictly a liability. The other records are relevant for treatment, billing or for malpractice suits, so have to be created. Study data doesn’t. The complaint about the locked room is too silly to bother with. The encryption is because in some cases researchers are ethically required to hand over the data upon publication and request for the data. If you have anonymized data, you can do that. You can also delegate the statistical analysis etc. You need to keep the table with real names and IDs around so if someone questions whether you dutifully removed the patient data of someone who changed their mind about wanting to be in the study, you can show patient John Smith had ID 034 and that ID is no longer in your data set. Finally, consider Nov 2016 “Others [newbies] would have written the patient name down on the Results Log instead of the Secret Code Log right next to it.”
            Unsurprisingly, institutions are aware that tasks frequently end up delegated to poorly instructed, poorly motivated underlings and part of the point of the procedures is to ensure the system continues to work even then. Another part is to anticipate problems for the inexperienced by making them consider the potential problems in advance.

            Jan 2015: _Of course_ a new co-author needs to be qualified. The problem is that you intended to lie about their co-authorship and their role in the study.
            July 2015: See above for data storage, consent etc. Also, step back for a moment and consider: You can do your study outside the hospital without the IRB, just the laws and your own money. Instead, you want to take advantage of the hospital’s prestige, of patients’ trust in the doctors they’re already dealing with and moan whenever the institution puts some preconditions on that. You know what, get your own (grant) money, advertise for participants in the newspaper and you don’t need to deal with an IRB.

            July 2016

            The various newbies whom we had strategically enlisted had either forgotten about it, half-heartedly screened one or two patients before getting bored, or else mixed up the growing pile of consent forms and releases and logs so thoroughly that we would have to throw out all their work.

            You mean, you were unable to clearly communicate to someone ostensibly above average smart a series of simple steps? Were you unable or unwilling to break this down into a series of steps and a flowchart? A checklist? How dare you complain the IRB expressed reservations that you might be lazy, poorly organized or incompetent? This is like the fifth or tenth example of you not taking this seriously and me wishing the IRB had shut you down earlier and harder.

            Nov 2016
            Throwing away data records because they might cause problems with the audit. You know who else threw away records when the Allies came to audit? Further proof the IRB was not strict enough.

            Dec 2016
            Yeah, the first quote is silly, I agree.
            The numbers not so much. (1.) ‘violent’ leaves it open whether calling someone an asshole qualifies. Which potentially might affect results, if one newbie does it one way, the other does it another way. (2.) The point is you need to explicitly reassure patients that refusal to participate will not affect their treatment, their relationship to their doctor, the resources spent on them etc. Because ‘demand characteristics’.
            (3.) has been covered above. (4.) is inexplicable to me, because I can’t fathom how this would cause you to submit a change to your study.

            So by my tally I get around 13 valid IRB objections to 3 invalid ones. Which, granted, is 3 too many and also means my estimate was off: it’s closer to 80% legitimate obstacles and 20% illegitimate ones.
            But it’s before we even get to the point that the study is essentially N = 1 and thus mostly worthless.

    • FeepingCreature says:

      The core concept that weirds me out about this post is the assumption that science stops being valuable if it doesn’t generalize to the entire population or at least a sizeable fraction. It seems like a world in which you can’t do research on dogs because nobody is interested in biology that doesn’t at least generalize to every mammal – people would be looking down their nose on dog research because it’s “n=1”. But science doesn’t stop working just because you’re only looking at one questionnaire in one hospital.

      • Murphy says:

        If I argued against the effective altruism movement, because “charity is still valuable” when someone manages to piss away the equivalent of a few tens of thousands of a charity’s budget saving an unexceptional $2 house plant.

        “Value” is not a categorical variable. Something can “have value” while that value remains minuscule.

  12. Deiseach says:

    Sounds interesting. I’m taking magnesium supplements regularly because I find them very helpful with muscle cramps but I have no idea if they do anything for anxiety. Ditto with beta-blockers, I’m on one for reducing blood pressure and again, I have no idea if it helps with anxiety.

    I suppose the problem is that I’m not regularly anxious (very much) but I do get occasional fits of anxiety (e.g. the past couple of days I’ve had the usual ‘crying bouts for no reason, fits of anxiety/nervousness out of the blue’) that don’t seem to be triggered by anything I can pin down (so no handy explanation like “Ah, you were stressed at work and then an hour later this anxiety attack happened”) apart from imagining that maybe it’s something vaguely to do with hormones (or neurotransmitters or who the heck knows what, maybe it’s Mercury being retrograde in my Ascendant sign this week).

    So I’d be no good for an experiment of this kind (can’t forecast reliably when I’m going to have the anxiety attack) but I would be very interested in the results. Like I said, I’m on magnesium and a beta-blocker and still getting occasional anxiety attacks, but would I be even worse if I weren’t taking these? Could lemon balm be a wonder cure? I hope you get takers for this experiment!

  13. Joseph Greenwood says:

    If I lived out there I would happily claim one for my wife.

  14. Hesperos says:

    Very interesting idea as it would be useful to get some real head to head comparisons on these supplements. I am curious what the targeted population is as individuals with anxiety disorders are likely to be on medications some of which are prescribed on an as needed basis. Is the study excluding these people and therefore focusing more on day to day stress as opposed to anxiety disorders? This is not a critical problem as most of the consumers of these supplements have either no or relatively mild and untreated anxiety disorders. It does seem that there will be a problem generalizing to psychiatric patients, a group to whom these supplements are often marketed as alternatives to Rx.

  15. Douglas Knight says:

    Have you done a power analysis? What model of efficacy makes this a good experiment? That there are a few super-responders to (some of) these drugs? I find it hard to imagine a regime in which the effect is strong enough to be detected, yet blinding is needed. Blinding is always valuable, but it sounds like it cost you a lot of time and effort to achieve it.

    I imagine the main topics being studied are compliance and reporting, so maybe it’s too early to talk about power.

    • Scott Alexander says:

      The model where blinding is needed is the one where the first person I had test this kit told me “None of these worked for me except number X, that one was great and really took my anxiety away” and he was super-disappointed to learn that X was one of the placebos.

      • Douglas Knight says:

        That’s not a model. You can tell it’s not a model, because it doesn’t let you do a power analysis.

        This shows that there is a lot of noise, but that doesn’t mean it was a placebo effect, which is what blinding protects you from. And if the noise is symmetric, you’ll get lots of false negatives, too.
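        One way to make the power question concrete is a quick simulation. The sketch below is an illustration only, not anything the kit's authors ran. It assumes ratings are independent standard-normal noise, that exactly one supplement shifts the rating by `effect` standard deviations while everything else (placebo included) does nothing, and it uses the post's decision rule that the three highest ratings all land on the same supplement. The names `top_three_match` and `estimate_power` are made up for this example.

```python
import random

N_CONDITIONS = 7   # six supplements plus placebo, as in the kit
DOSES_EACH = 3     # three boxes per condition, 21 boxes in total


def top_three_match(effect, rng):
    """Simulate one kit under the toy model.

    Condition 0 is the (hypothetical) supplement that really works; its
    ratings are shifted by `effect` standard deviations. Returns True if
    the three highest ratings all belong to condition 0.
    """
    ratings = []
    for condition in range(N_CONDITIONS):
        shift = effect if condition == 0 else 0.0
        for _ in range(DOSES_EACH):
            ratings.append((rng.gauss(shift, 1.0), condition))
    ratings.sort(reverse=True)
    return all(condition == 0 for _, condition in ratings[:3])


def estimate_power(effect, trials=20000, seed=0):
    """Monte Carlo estimate of how often the top-three rule detects the effect."""
    rng = random.Random(seed)
    hits = sum(top_three_match(effect, rng) for _ in range(trials))
    return hits / trials


if __name__ == "__main__":
    for effect in (0.0, 1.0, 2.0, 3.0, 5.0):
        print(effect, estimate_power(effect))
```

        Under this toy model with no effect at all, the chance that the top three boxes all belong to one particular condition is (3/21)(2/20)(1/19) = 1/1330, so the rule rarely fires by accident. Running the loop shows how large an effect has to be before it fires reliably, which is the false-negative side of the noise problem.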

  16. ksvanhorn says:

    Oh, anxiety treatment.

    From the headline, I thought this was my opportunity to try out some neuroses I had not yet experienced.

  17. angularangel says:

    “The best thing about is that it’s obviously right. The worst thing is we mostly have no idea how to do it.”

    Mmmm. Words to live by. -_-

  18. David Condon says:

    So there’s a lot I like here, but I still think you’ve got some issues. I would recommend getting a textbook on within-subjects design. Your second sentence is wrong. We already know a lot about how to personalize medicine. Doctors don’t know how to do this, but that’s because doctors generally aren’t trained in within-subjects design. A good person to talk to would be a biologist, neuroscientist, or experimental psychologist; many of whom are trained in this area. My biggest concern is that you probably don’t have enough samples of each treatment condition to make a within-subject comparison. Three samples per condition is an often recommended minimum for simple studies of this type involving few comparisons (I’ve also seen five samples per condition as a recommended minimum). However you’ve got six treatment conditions plus the control to consider, which runs into the multiple comparisons problem. One simple way to improve the quality of this pilot is to drop placebo testing entirely. Simply send out six doses of each of the six supplements. Use this to determine which supplement is most likely to be effective for each individual. Then give them a follow-up package based on their results which is half placebo and half supplement. I would also recommend you consider including a no placebo condition in your data collection procedures to get another estimate of variability and efficacy over time. Yes, there are other issues as well, but I figure you already know about most of them, and plan to correct them during follow-up testing.
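    To put a rough number on the multiple comparisons problem mentioned above: the sketch below assumes six supplement-vs-placebo tests that are independent and each run at the same per-test `alpha`. Independence is a simplification, since all six comparisons share one placebo arm, so treat the result as a ballpark. The function names are made up for this example.

```python
def familywise_error(alpha, comparisons):
    """Chance of at least one false positive across independent tests."""
    return 1.0 - (1.0 - alpha) ** comparisons


def bonferroni_threshold(alpha, comparisons):
    """Per-test threshold that keeps the family-wise rate at or below alpha."""
    return alpha / comparisons


if __name__ == "__main__":
    # Six supplements, each compared against placebo at the usual 0.05 level.
    print(familywise_error(0.05, 6))
    print(bonferroni_threshold(0.05, 6))
```

    With six tests at 0.05 each, the chance of at least one spurious "winner" is roughly 26%, and a Bonferroni correction would ask each individual test to clear about 0.008 instead.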

  19. Radu Floricica says:

    If this works, please please consider doing the same for quality of sleep. Or just the instructions.

  20. FeepingCreature says:

    For responsibility’s sake, please put a sticker saying “Randomized Controlled Trial, contents: ” and then a link to the spoiler page on each package?

  21. n8chz says:

    I guess my main concern is that when personalized medicine gets practiced on me it will be based on patented genetic tests or proprietary algorithms and therefore will be mind-bogglingly expensive. But then again, what isn’t in the US healthcare sector? I also believe in Newton’s Third Law, and that where there’s smoke there’s fire, and if there weren’t some kind of downside, there wouldn’t be so much public relations in support of the idea. If the idea is so obviously right, I’m guessing that the PR is to soften the public for the reality that it won’t be open source hackers working out of their garages, or academics who are into things like publishing their findings (at least back like in the Jonas Salk days), who figure it out.

    So I suppose it’s like GMO’s. No scientific basis for distrusting it, but the IP model is more than greedy enough for me to feel right about “non-GMO” being a selling point for me.

    • But then again, what isn’t [mind-bogglingly expensive] in the US healthcare sector?

      Aspirin. Band-aids. Antiseptic. Blood pressure monitors. Antifungal liquid for toenail fungus. The Mayo clinic web page. Cough medicine. …

  22. borda says:

    Would there be any added value in having fraternal twins both participate?

    • entognatha says:

      No, in fact it’s potentially a confounder, although probably fine. Typically sets of fraternal twins are used as controls for sets of identical twins and would be useful if we were trying to see if there was a genetic vs. shared-parenting cause for a positive response to some supplements, but that’s not what’s being looked at here.

  23. CheshireCat says:

    Fantastic idea. Count me as another of the commenters hoping for an eventually shippable version. I would be happy to provide whatever data I can gather.

  24. stardust says:

    Note: There is a single kit left at REACH. If you’d been considering stopping by but haven’t yet, you may want to check with me or someone else at REACH ( is a good way to get ahold of someone), especially if you don’t live close by. I’m guessing it will be gone by the end of the day since there is an event here tonight.

  25. roastingcanopus says:

    Do you think it would be worth trying a similar experiment with nootropics? I haven’t seen a similar randomized kit available for nootropics, and it seems like that would add a lot of much-needed rigor to the movement. Ideally the kits would be re-randomized with each purchase and people who bought them would have to upload their notes/scores before learning the actual drugs they consumed on each day.

    That being said, I think this experiment would be harder with nootropics than with anxiety meds. An anxiety drug that makes you feel less anxious works (for you) by definition, whereas, jeez, measuring (tiny) cognitive gains is much, much harder. You need to spend like twenty hours taking IQ tests in order to keep the practice effect from corrupting your measurements.
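    Re-randomizing each kit at purchase time, as suggested above, is simple to sketch. The version below assumes the layout described in the post (three boxes of each condition, with no condition in two consecutive boxes) and enforces the nonconsecutive rule for the placebo too, which the post does not specify. It just reshuffles until the constraint holds. `randomize_kit` and the condition labels are hypothetical.

```python
import random


def randomize_kit(conditions, doses_each=3, seed=None, max_tries=10000):
    """Return a random box order with no condition in two consecutive boxes.

    Uses rejection sampling: shuffle, check the constraint, retry on failure.
    """
    rng = random.Random(seed)
    boxes = [c for c in conditions for _ in range(doses_each)]
    for _ in range(max_tries):
        rng.shuffle(boxes)
        if all(a != b for a, b in zip(boxes, boxes[1:])):
            return list(boxes)
    raise RuntimeError("no valid arrangement found within max_tries")


if __name__ == "__main__":
    # Placeholder labels; the real key would stay hidden until scores are uploaded.
    labels = ["supplement_%d" % i for i in range(1, 7)] + ["placebo"]
    for box_number, label in enumerate(randomize_kit(labels, seed=42), start=1):
        print(box_number, label)
```

    Storing the seed (or the resulting order) per purchase and only revealing it after the notes are uploaded would give the "scores first, unblinding second" property described above.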

