
Should AI Be Open?

I.

H.G. Wells’ 1914 sci-fi book The World Set Free did a pretty good job predicting nuclear weapons:

They did not see it until the atomic bombs burst in their fumbling hands…before the last war began it was a matter of common knowledge that a man could carry about in a handbag an amount of latent energy sufficient to wreck half a city

Wells believed the coming atomic bombs would be so deadly that we would inevitably create a utopian one-world government to prevent them from ever being used. Sorry, Wells. It was a nice thought.

But imagine that in the 1910s and 1920s, some elites had started thinking really seriously along Wellsian lines. They would worry about what might happen when the first nation – let’s say America – got the Bomb. It would be unstoppable in battle and might rule the world with an iron fist. Such a situation would be the end of human freedom and progress.

So in 1920, these elites pooled their resources and made their own Manhattan Project. Their efforts bore fruit, and they learned a lot about nuclear fission; in particular, they learned that uranium was a necessary raw material. The world’s uranium sources were few enough that a single nation or coalition could get a monopoly upon them; the specter of atomic despotism seemed more worrying than ever.

They got their physicists working overtime and discovered a new type of nuke that required no uranium at all. In fact, once you understood the principles you could build one out of parts from a Model T engine. The only downside was that if you didn’t build it exactly right, its usual failure mode was to detonate on the workbench in an uncontrolled hyper-reaction that would blow the entire hemisphere to smithereens.

And so the intellectual and financial elites declared victory – no one country could monopolize atomic weapons now – and sent step-by-step guides to building a Model T nuke to every household in the world. Within a week, both hemispheres were blown to very predictable smithereens.

II.

Some of the top names in Silicon Valley have just announced a new organization, OpenAI, dedicated to “advancing digital intelligence in the way that is most likely to benefit humanity as a whole…as broadly and evenly distributed as possible.” Co-chairs Elon Musk and Sam Altman talk to Steven Levy:

Levy: How did this come about? […]

Musk: Philosophically there’s an important element here: we want AI to be widespread. There’s two schools of thought – do you want many AIs, or a small number of AIs? We think probably many is good. And to the degree that you can tie it to an extension of individual human will, that is also good. […]

Altman: We think the best way AI can develop is if it’s about individual empowerment and making humans better, and made freely available to everyone, not a single entity that is a million times more powerful than any human. Because we are not a for-profit company, like a Google, we can focus not on trying to enrich our shareholders, but what we believe is the actual best thing for the future of humanity.

Levy: Couldn’t your stuff in OpenAI surpass human intelligence?

Altman: I expect that it will, but it will just be open source and useable by everyone instead of useable by, say, just Google. Anything the group develops will be available to everyone. If you take it and repurpose it you don’t have to share that. But any of the work that we do will be available to everyone.

Levy: If I’m Dr. Evil and I use it, won’t you be empowering me?

Musk: I think that’s an excellent question and it’s something that we debated quite a bit.

Altman: There are a few different thoughts about this. Just like humans protect against Dr. Evil by the fact that most humans are good, and the collective force of humanity can contain the bad elements, we think it’s far more likely that many, many AIs, will work to stop the occasional bad actors than the idea that there is a single AI a billion times more powerful than anything else. If that one thing goes off the rails or if Dr. Evil gets that one thing and there is nothing to counteract it, then we’re really in a bad place.

Both sides here keep talking about who is going to “use” the superhuman intelligence a billion times more powerful than humanity, as if it were a microwave or something. Far be it from me to claim to know more than Musk or Altman about anything, but I propose that the correct answer to “what would you do if Dr. Evil used superintelligent AI?” is “cry tears of joy and declare victory”, because anybody at all having a usable level of control over the first superintelligence is so much more than we have any right to expect that I’m prepared to accept the presence of a medical degree and ominous surname.

A more Bostromian view would forget about Dr. Evil, and model AI progress as a race between Dr. Good and Dr. Amoral. Dr. Good is anyone who understands that improperly-designed AI could get out of control and destroy the human race – and who is willing to test and fine-tune his AI however long it takes to be truly confident in its safety. Dr. Amoral is anybody who doesn’t worry about that and who just wants to go forward as quickly as possible in order to be the first one with a finished project. If Dr. Good finishes an AI first, we get a good AI which protects human values. If Dr. Amoral finishes an AI first, we get an AI with no concern for humans that will probably cut short our future.

Dr. Amoral has a clear advantage in this race: building an AI without worrying about its behavior beforehand is faster and easier than building an AI and spending years testing it and making sure its behavior is stable and beneficial. He will win any fair fight. The hope has always been that the fight won’t be fair, because all the smartest AI researchers will realize the stakes and join Dr. Good’s team.

Open-source AI crushes that hope. Suppose Dr. Good and his team discover all the basic principles of AI but wisely hold off on actually instantiating a superintelligence until they can do the necessary testing and safety work. But suppose they also release what they’ve got on the Internet. Dr. Amoral downloads the plans, sticks them in his supercomputer, flips the switch, and then – as Dr. Good himself put it back in 1963 – “the human race has become redundant.”

The decision to make AI findings open source is a tradeoff between risks and benefits. The risk is letting the most careless person in the world determine the speed of AI research – because everyone will always have the option to exploit the full power of existing AI designs, and the most careless person in the world will always be the first one to take it. The benefit is that in a world where intelligence progresses very slowly and AIs are easily controlled, nobody can use their sole possession of the only existing AI to garner too much power.

But what if we don’t live in a world where progress is slow and control is easy?

III.

If AI saunters lazily from infrahuman to human to superhuman, then we’ll probably end up with a lot of more-or-less equally advanced AIs that we can tweak and fine-tune until they cooperate well with us. In this situation, we have to worry about who controls those AIs, and it is here that OpenAI’s model makes the most sense.

But Bostrom et al worry that AI won’t work like this at all. Instead there could be a “hard takeoff”, a subjective discontinuity in the function mapping AI research progress to intelligence as measured in ability-to-get-things-done. If on January 1 you have a toy AI as smart as a cow, and on February 1 it’s proved the Riemann hypothesis and started building a ring around the sun, that was a hard takeoff.

(I won’t have enough space here to really do these arguments justice, so I once again suggest reading Bostrom’s Superintelligence if you haven’t already. For more on what AI researchers themselves think of these ideas, see AI Researchers On AI Risk.)

Why should we expect a hard takeoff? First, it’s happened before. It took evolution twenty million years to go from cows with sharp horns to hominids with sharp spears; it took only a few tens of thousands of years to go from hominids with sharp spears to moderns with nuclear weapons. Almost all of the practically interesting differences in intelligence occur within a tiny window that you could blink and miss.

If you were to invent a sort of objective zoological IQ based on amount of evolutionary work required to reach a certain level, complexity of brain structures, etc, you might put nematodes at 1, cows at 90, chimps at 99, homo erectus at 99.9, and modern humans at 100. The difference between 99.9 and 100 is the difference between “frequently eaten by lions” and “has to pass anti-poaching laws to prevent all lions from being wiped out”.

Worse, the reasons we humans aren’t more intelligent are really stupid. Even people who find the idea abhorrent agree that selectively breeding humans for intelligence would work in some limited sense. Find all the smartest people, make them marry each other for a couple of generations, and you’d get some really smart great-grandchildren. But think about how weird this is! Breeding smart people isn’t doing work, per se. It’s not inventing complex new brain lobes. If you want to get all anthropomorphic about it, you’re just “telling” evolution that intelligence is something it should be selecting for. Heck, that’s all that the African savannah was doing too – the difference between chimps and humans isn’t some brilliant new molecular mechanism, it’s just sticking chimps in an environment where intelligence was selected for so that evolution was incentivized to pull out a few stupid hacks. The hacks seem to be things like “bigger brain size” (did you know that both among species and among individual humans, brain size correlates pretty robustly with intelligence, and that one reason we’re not smarter may be that it’s too hard to squeeze a bigger brain through the birth canal?) If you believe in Greg Cochran’s Ashkenazi IQ hypothesis, just having a culture that valued intelligence on the marriage market was enough to boost IQ 15 points in a couple of centuries, and this is exactly the sort of thing you should expect in a world like ours where intelligence increases are stupidly easy to come by.

I think there’s a certain level of hard engineering/design work that needs to be done for intelligence, a level way below humans, and after that the limits on intelligence are less about novel discoveries and more about tradeoffs like “how much brain can you cram into a head big enough to fit out a birth canal?” or “wouldn’t having faster-growing neurons increase your cancer risk?” Computers are not known for having to fit through birth canals or getting cancer, so it may be that AI researchers only have to develop a few basic principles – let’s say enough to make cow-level intelligence – and after that the road to human intelligence runs through adding the line NumberOfNeuronsSimulated = 100000000000 to the code, and the road to superintelligence runs through adding another zero after that.

(Remember, it took all of human history from Mesopotamia to 19th-century Britain to invent a vehicle that could go as fast as a human. But after that it only took another four years to build one that could go twice as fast as a human.)

If there’s a hard takeoff, OpenAI’s strategy stops being useful. There’s no point in ensuring that everyone has their own AIs, because there’s not much time between the first useful AI and the point at which things get too confusing to model and nobody “has” the AIs at all.

IV.

OpenAI’s strategy also skips over a second aspect of AI risk: the control problem.

All of this talk of “will big corporations use AI?” or “will Dr. Evil use AI?” or “Will AI be used for the good of all?” presupposes that you can use an AI. You can certainly use an AI like the ones in chess-playing computers, but nobody’s very scared of the AIs in chess-playing computers either. What about AIs powerful enough to be scary?

Remember the classic programmers’ complaint: computers always do what you tell them to do instead of what you meant for them to do. Computer programs rarely do what you want the first time you test them. Google Maps has a relatively simple task (plot routes between Point A and Point B), has been perfected over the course of years by the finest engineers at Google, has been ‘playtested’ by tens of millions of people day after day, and still occasionally does awful things like suggest you drive over the edge of a deadly cliff, or tell you to walk across an ocean and back for no reason on your way to the corner store.

Humans have a robust neural architecture, to the point where you can logically prove that what they’re doing is suboptimal and they’ll shrug and say they’re going to do it anyway. Computers aren’t like this unless we make them so, itself a hard task. They are naturally fragile and oriented toward specific goals. An AI that ended up with a drive as perverse as Google Maps’ occasional tendency to hurl you off cliffs would not necessarily be self-correcting. A smart AI might be able to figure out that humans didn’t mean for it to have the drive it did. But that wouldn’t cause it to change its drive, any more than you can convert a gay person to heterosexuality by patiently explaining to them that evolution probably didn’t mean for them to be gay. Your drives are your drives, whether they are intentional or not.

When Google Maps tells people to drive off cliffs, Google quietly patches the program. AIs that are more powerful than us may not need to accept our patches, and may actively take action to prevent us from patching them. If an alien species showed up in their UFOs, said that they’d created us but made a mistake and actually we were supposed to eat our children, and asked us to line up so they could insert the functioning child-eating gene in us, we would probably go all Independence Day on them; computers with more goal-directed architecture would if anything be even more willing to fight such changes.

If it really is a quick path from cow-level AI to superhuman-level AI, it would be really hard to test the cow-level AI for stability and expect it to stay stable all the way up to superhuman-level – superhumans have a lot more ways to cause trouble than cows do. That means a serious risk of superhuman AIs that want to do the equivalent of hurl us off cliffs, and which are very resistant to us removing that desire from them. We may be able to prevent this, but it would require a lot of deep thought and a lot of careful testing and prodding at the cow-level AIs to make sure they are as prepared as possible for the transition to superhumanity.

And we lose that option by making the AI open source. Make such a program universally available, and while Dr. Good is busy testing and prodding, Dr. Amoral has already downloaded the program, flipped the switch, and away we go.

V.

Once again: The decision to make AI findings open source is a tradeoff between risks and benefits. The risk is that in a world with hard takeoffs and difficult control problems, you get superhuman AIs that hurl everybody off cliffs. The benefit is that in a world with slow takeoffs and no control problems, nobody will be able to use their sole possession of the only existing AI to garner too much power.

But the benefits just aren’t clear enough to justify that level of risk. I’m still not even sure exactly how the OpenAI founders visualize the future they’re trying to prevent. Are AIs fast and dangerous? Are they slow and easily-controlled? Does just one company have them? Several companies? All rich people? Are they a moderate advantage? A huge advantage? None of those possibilities seem dire enough to justify OpenAI’s tradeoff against safety.

Are we worried that AI will be dominated by one company despite becoming necessary for almost every computing application? Microsoft Windows is dominated by one company and became necessary for almost every computing application. For a while people were genuinely terrified that Microsoft would exploit its advantage to become a monopolistic giant that took over the Internet and something something something. Instead, they were caught flat-footed and outcompeted by Apple and Google, plus if you really want you can use something open-source like Linux instead. And new versions of Windows inevitably end up hacked and up on The Pirate Bay anyway.

Or are we worried that AIs will somehow help the rich get richer and the poor get poorer? This is a weird concern to have about a piece of software which can be replicated pretty much for free. Windows and Google Search are both fantastically complex products of millions of man-hours of research; Google is free and Windows comes bundled with your computer. In fact, people have gone through the trouble of creating fantastically complex competitors to both and providing those free of charge, to the point where multiple groups are competing to offer people fantastically complex software for free. While it’s possible that rich people will be able to afford premium AIs, it is hard for me to weigh “rich people get premium versions of things” on the same scale as “human race likely destroyed”. Like, imagine the sort of dystopian world where rich people had nicer things than the rest of us. It’s too horrifying even to contemplate.

Or are we worried that AI will progress really quickly and allow someone to have completely ridiculous amounts of power? But remember, there’s still a government and it tends to look askance at other people becoming powerful enough to compete with it. If some company is monopolizing AI and getting too big, the government will break it up, the same way they kept threatening to break up Microsoft when it was getting too big. If someone tries to use AI to exploit others, the government can pass a complicated regulation against that. You can say a lot of things about the United States government, but you can’t say that they never pass complicated regulations forbidding people from doing things.

Or are we worried that AI will be so powerful that someone armed with AI is stronger than the government? Think about this scenario for a moment. If the government notices someone getting, say, a quarter as powerful as it is, it’ll probably take action. So an AI user isn’t likely to overpower the government unless their AI can become powerful enough to defeat the US military too quickly for the government to notice or respond to. But if AIs can do that, we’re back in the intelligence explosion/fast takeoff world where OpenAI’s assumptions break down. If AIs can go from zero to more-powerful-than-the-US-military in a very short amount of time while still remaining well-behaved, then we actually do have to worry about Dr. Evil and we shouldn’t be giving him all our research.

Or are we worried that some big corporation will make an AI more powerful than the US government in secret? I guess this is sort of scary, but it’s hard to get too excited about. So Google takes over the world? Fine. Do you think Larry Page would be a better or worse ruler than one of these people? What if he had a superintelligent AI helping him, and also everything was post-scarcity? Yeah, I guess all in all I’d prefer constitutional limited government, but this is another supposed horror scenario which doesn’t even weigh on the same scale as “human race likely destroyed”.

If OpenAI wants to trade off the safety of the human race from rogue AIs in order to get better safety against people trying to exploit control over AIs, they need to make a much stronger case than anything I’ve seen so far for why the latter is such a terrible risk.

There was a time when the United States was the only country with nukes. Aside from poor Hiroshima and Nagasaki, it mostly failed to press its advantage, bumbled its way into letting the Russians steal the schematics, and now everyone from Israel to North Korea has nuclear weapons and things are pretty okay. If we’d been so afraid of letting the US government have its brief tactical advantage that we’d given the plans for extremely unstable super-nukes to every library in the country, we probably wouldn’t even be around to regret our skewed priorities.

Elon Musk famously said that AIs are “potentially more dangerous than nukes”. He’s right – so AI probably shouldn’t be open source any more than nukes should.

VI.

And yet Elon Musk is involved in this project. So are Sam Altman and Peter Thiel. So are a bunch of other people who have read Bostrom, who are deeply concerned about AI risk, and who are pretty clued-in.

My biggest hope is that as usual they are smarter than I am and know something I don’t. My second biggest hope is that they are making a simple and uncharacteristic error, because these people don’t let errors go uncorrected for long and if it’s just an error they can change their minds.

But I worry it’s worse than either of those two things. I got a chance to talk to some people involved in the field, and the impression I got was one of a competition that was heating up. Various teams led by various Dr. Amorals are rushing forward more quickly and determinedly than anyone expected at this stage, so much so that it’s unclear how any Dr. Good could expect both to match their pace and to remain as careful as the situation demands. There was always a lurking fear that this would happen. I guess I hoped that everyone involved was smart enough to be good cooperators. I guess I was wrong. Instead we’ve reverted to type and ended up in the classic situation of such intense competition for speed that we need to throw every other value under the bus just to avoid being overtaken.

In this context, the OpenAI project seems more like an act of desperation. Like Dr. Good needing some kind of high-risk, high-reward strategy to push himself ahead and allow at least some amount of safety research to take place. Maybe getting the cooperation of the academic and open-source community will do that. I won’t question the decisions of people smarter and better informed than I am if that’s how their strategy talks worked out. I guess I just have to hope that the OpenAI leaders know what they’re doing, don’t skimp on safety research, and have a process for deciding which results not to share too quickly.

But I am scared that it’s come to this. It suggests that we really and truly do not have what it takes, that we’re just going to blunder our way into extinction because cooperation problems are too hard for us.

I am reminded of what Malcolm Muggeridge wrote as he watched World War II begin:

All this likewise indubitably belonged to history, and would have to be historically assessed; like the Murder of the Innocents, or the Black Death, or the Battle of Passchendaele. But there was something else; a monumental death-wish, an immense destructive force loosed in the world which was going to sweep over everything and everyone, laying them flat, burning, killing, obliterating, until nothing was left…Nor have I from that time ever had the faintest expectation that, in earthly terms, anything could be salvaged; that any earthly battle could be won or earthly solution found. It has all just been sleep-walking to the end of the night.


798 Responses to Should AI Be Open?

  1. In his recently rediscovered book Paris in the 20th Century, Jules Verne predicts weapons so strong that both sides will be too scared to use them.

  2. Simon says:

    Spears are over a million years old.

  3. Peter Scott says:

    There’s a Reddit AMA on /r/MachineLearning with the OpenAI guys, and the topic of this essay came up. It prompted this reply from Greg Brockman, which is worth just quoting in full:

    Good questions and thought process. The one goal we consider immutable is our mission to advance digital intelligence in the way that is most likely to benefit humanity as a whole. Everything else is a tactic that helps us achieve that goal.

    Today the best impact comes from being quite open: publishing, open-sourcing code, working with universities and with companies to deploy AI systems, etc. But even today, we could imagine some cases where positive impact comes at the expense of openness: for example, where an important collaboration requires us to produce proprietary code for a company. We’ll be willing to do these, though only as very rare exceptions and to effect exceptional benefit outside of that company.

    In the future, it’s very hard to predict what might result in the most benefit for everyone. But we’ll constantly change our tactics to match whatever approaches seem most promising, and be open and transparent about any changes in approach (unless doing so seems itself unsafe!). So, we’ll prioritize safety given an irreconcilable conflict.

    (Incidentally, I was the person who both originally added and removed the “safely” in the sentence your blog post references. I removed it because we thought it sounded like we were trying to weasel out of fully distributing the benefits of AI. But as I said above, we do consider everything subject to our mission, and thus if something seems unsafe we will not do it.)

    In another comment, Ilya Sutskever says similar things and mentions that the AI value alignment problem has not escaped their notice, although it’s not on their immediate research agenda:

    First, per our blog post, our goal is to advance digital intelligence in the way that is most likely to benefit humanity as a whole. We’ll constantly re-evaluate the best strategy. Today that’s publishing papers, releasing code, and perhaps even helping people deploy our work. But if we, for example, one day make a discovery that will enhance the capabilities of algorithms so it’s easy to build something malicious, we’ll be extremely thoughtful about how to distribute the result. More succinctly: the “Open” in “OpenAI” means we want everyone to benefit from the fruits of AI as much as possible.

    We acknowledge that the AI control problem will be important to solve at some point on the path to very capable AI. To see why, consider for instance a capable robot whose reward function itself is a large neural network. It may be difficult to predict what such a robot will want to do. While such systems cannot be built today, it is conceivable that they may be built in the future.

  4. Mark says:

    I do not own a gun nor have any desire to do so. If guns and other weapons magically ceased to exist, I believe the world would be a better place.

    By contrast, most members of my extended family do own guns, including my 29-year-old niece, who lives alone and brings a handgun to the door if someone knocks late at night.

    I thus believe I see both sides of the issue. For me, a helpful analogy is alcohol. Ignoring the deaths due to health issues (akin to suicide by gun), alcohol kills about 10,000 people a year due to drunk driving alone. That’s remarkably close to the number of gun-related homicides per year. This presents an awkward question for those (like me, on occasion) who advocate strict gun control: Why not ban alcohol? Alcohol, like guns, leads to countless deaths per year, both self-inflicted and external. Moreover, there is no obvious up-side to alcohol consumption.

    Of course, the U.S. did attempt to ban alcohol nearly 100 years ago. It is safe to say few people view it as a success. A follow-up question: Why would banning guns go better?

  5. Spirit says:

    >Elon Musk famously said that “AIs are more dangerous than nukes”.

    He said “potentially” 🙂

    >My biggest hope is that as usual they are smarter than I am and know something I don’t. My second biggest hope is that they are making a simple and uncharacteristic error, because these people don’t let errors go uncorrected for long and if it’s just an error they can change their minds.
    >I admit that it is very weird that Musk is doing this despite being as clued-in as he is. I don’t have a good model for it and I hope I am wrong and/or not understanding exactly how vigilant he plans to be. Still, that interview did nothing to increase my confidence.

    My thoughts exactly. All this time since Musk mentioned “Superintelligence” I was thinking that he sees AI in a “Bostromian” way. During a Q&A in Germany he answered a question about AGI and mentioned an “optimization function”, so clearly he doesn’t see it in the “sci-fi” way where AI is often strongly anthropomorphised (if you are curious: https://youtu.be/YLJO2E0vCwk question 28:40, answer 34:00-36:45). Now, after the announcement of OpenAI, I am confused.

    PS. Very well-written text.

  6. Brian says:

    The control problem is a major one, but what about the anomie problem? That is, how do you keep a brain-in-a-jar AI from going insane or committing suicide? We know what happens to humans under sensory deprivation, and that’s with people who can at least remember what it felt like to have sensation. How can a brain without a body even operate at a human level? It reads a text about a young man having ‘butterflies in his stomach’ when falling in love, or a bad mood driven by fatigue or hunger, or the pleasantness of seeing a sunset on the beach, etc, and has no experience with these things, and can’t even imagine what they feel like, much less predict stock market trends made by humans operating under all these influences. A human-plus intelligence without a body to feel and act seems doomed…

    • I’m not sure what makes you think an AI necessarily won’t have some equivalent to sensory input, possibly a very close equivalent. But I also don’t see any obvious reason to think that a fundamentally non-human intelligence will necessarily need it.

  7. Jeremy Jaffe says:

    Let’s think about how an amoral AI would do bad things.
    Here are some scenarios. For each scenario let’s see if open AI would make the problem worse or better.

    1. Google programs an AI to make as much money as possible for Google. The AI then convinces people to give money to Google even though it’s not in their interest to do so. This is definitely doable – if L. Ron Hubbard can do it, then certainly a super intelligent AI can – and the AI can probably do it better.

    In this scenario – open AI would help since if hundreds of AIs were convincing you to give money to hundreds of different places you would eventually get better at determining what is and what isn’t in your interests – and the better AIs got the better humans would get at discerning the truth. Also if AI was open it would be easier to see what the Google AI was doing which would make you less likely to fall for it.

    2. A paperclip company programs a computer to make as many paperclips as possible for them. They destroy the world making paperclips.
    Here I could see open AI helping or hurting.
    It could hurt since if the “make lots of x” software is freely available then more companies will use it – meaning more opportunities to destroy the world.

    However, it could also help since if the software is open source then it will have fewer bugs, and will be better – i.e. it will actually do what it’s supposed to do – which is make paperclips in a way that does not have terrible consequences.
    I think this is true both because open source code in general has fewer bugs, and also in this case in particular the open source community would not want to turn into paperclips so they would have a big incentive to find any bugs in the software.

  8. Akim Akimov says:

    looks like you solved Fermi Paradox 😉

  9. weaselword says:

    Here are some positive reasons for having an open-source AI be provided by an organization whose purpose is to reduce the risk of an artificial general intelligence ending humanity:

    A. Suppose that, to construct a seed AI able to “take off” (in Bostrom’s sense of the term), it helps to have lots of useful modules, and some of those modules are crucial in determining whether the AI will go rogue. Dr. Amoral doesn’t bother testing all xir modules. But xe will gladly use the ones provided for free by the OpenAI project, and would not at all object that they have tested them.

    B. A slightly more sinister conjecture: every module provided by OpenAI includes backdoors and tripwires for any potential super-intelligent AI. Will they work? Who knows. But they may slow it down.

    C. Less sinister conjecture: the more tools are available for free to develop AI, the more AI software and AI-enabled hardware we will have around, and thus the legal and ethical questions about AI would become both more important and more prominent. This would make it politically more feasible to pass regulations that would seek to prevent catastrophic super-intelligent AGI.

  10. et.cetera says:

    They’re wolves. You’re a rabbit. Smarts has nothing to do with it.

  11. > Remember the classic programmers’ complaint: computers always do what you tell them to do instead of what you meant for them to do. The average computer program doesn’t do what you meant for it to do the first time you test it

    That’s true, in a sense, of AI, but in a sense that is misleading. AI software is distinguished from non-AI software by an extra layer of indirection. The AI developer designs a system which can learn or be trained to do X, rather than programming in X directly. The ultimate behaviour of the system is then the outcome of both its coding and its learning or training. The coding only sets the parameters of what can be learnt: a system that is designed to learn how to recognise images is unlikely to be good at processing language and vice-versa. But an AI is not limited to precise literal interpretation of its input: AI software can use pattern-matching, fuzzy logic and inference from incomplete information. Precise literality only applies to the lowest level of operation, not the indirect or inferred levels.

    The informal phrase “computers always do what you tell them instead of what you mean for them to do” sounds like it could mean that a computer will follow high-level instructions, such as “make humans happy” literally, but that does not in fact follow.
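
    The "extra layer of indirection" described above can be illustrated with a toy sketch. Everything here is hypothetical and invented for illustration (the function names, the temperature task, the data); it only contrasts a rule written directly by the developer with a rule derived from examples by a learning procedure the developer wrote.

```python
# Hypothetical toy example; none of these names or numbers come from the comment.

def is_hot_programmed(temp_c: float) -> bool:
    """Direct programming: the developer writes the rule (threshold 30) by hand."""
    return temp_c > 30.0


def learn_threshold(examples: list[tuple[float, bool]]) -> float:
    """Indirection: the developer writes a learning procedure, not the rule.

    Returns the midpoint between the hottest example labelled False and
    the coldest example labelled True.
    """
    hottest_negative = max(t for t, label in examples if not label)
    coldest_positive = min(t for t, label in examples if label)
    return (hottest_negative + coldest_positive) / 2.0


# Assumed training data: (temperature, "a person called this hot").
training_data = [(10.0, False), (18.0, False), (24.0, True), (35.0, True)]
learned = learn_threshold(training_data)


def is_hot_learned(temp_c: float) -> bool:
    """The behaviour depends on both the code and the data it was trained on."""
    return temp_c > learned
```

    With this assumed data the two functions disagree about 25 degrees: the hand-written rule says "not hot", while the learned one says "hot", because its threshold came from the examples rather than from the code.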

  12. Fj says:

    I read/skimmed through that Dr. Good’s paper and it bore three more beautiful fruits, in addition to the “[after that] the human race has become redundant.”

    First of all, Dr. Good cites Lord Brain on the connection between theories of communication and meaning (page 38). I’ll leave this one without comment, this is just too perfect for words.

    Second, he mentions an effect of priming with known text on speech recognition (see (or rather hear) this, for example: https://www.youtube.com/watch?feature=player_detailpage&v=ndSykfT4Vrk#t=154) and refers to it as “a well known method of deception when trying to sell a speech-synthesis system” (page 39). Humans are such humans, lol.

    And an actually pretty interesting one, the idea of the “anti-Ockham principle” (page 69). As in, stuff could work according to this theoretical mechanism, or that theoretical mechanism, or most likely some mixture of the two depending on the situation if we use the anti-Ockham principle for very complex systems.

    By the way, praise Google Books and its OCR AI that allowed me to find those quotes.

    ———-

    As for the contents of your post itself, first of all, don’t you think that all this is simply based on the idea that AGI (even that of a cow) is fifty years in the future anyways, so OpenAI is intended to just disseminate the benefits of non-General AI to the people until then? And, like, when we get closer to that river we would have to figure how best to cross it, but until then it’s REALLY REALLY NOT GOOD to effectively suppress public AI research (which is what you’re effectively arguing for)?

    Like, what if the OpenAI initiative gives us non-volitional AIs that can be used to upload people, as black boxes themselves, long before we figure out how to construct self-improving AIs? Not to mention resource allocation AIs, self-driving car AIs, robotic assembly worker AIs… being against that is just evil.

    Second, lining up with the common critique of MIRI, that they are trying to figure how to control an AGI long before we have any clue how it would work, having a publicly available body of knowledge on how sub-cow AIs work would be a pretty useful thing even if at some point we decide that moving forward from that point on requires a cabal of AI researchers sworn to secrecy. I mean, compared to the alternative where we have all those Dr. In-Pay-Of-A-Multinational-Corporation’s advancing their individual states of the art all by themselves, where nobody has a clue where they are at, least of all the mysterious Dr. Good (which is MIRI, right?).

    Finally and following from that, I realized that I don’t understand how MIRI is supposed to work according to Eliezer’s vision of a sworn to secrecy cabal of responsible AI researchers. Like, people give them money and they don’t publish any of their actually relevant results, nor even hint that they think they have those? That’s gonna work very well, lol.

  13. MattS says:

    My understanding is that OpenAI plans to focus mainly on deep-learning – a technique that uses multi-layered neural networks for classification problems. These systems have been very successful at image and speech recognition, but there’s a few features of deep learning that make me not so concerned about the open source goals of OpenAI.

    The main feature is that neural networks lack introspection – they don’t improve their output using self-conscious design choices. Each neuron will try to minimize its error by following a gradient on what you could call its utility function. You design the network architecture and then you start feeding it your training data and then it chugs away trying to optimize its output. The neurons can’t see the network any more than you can see your own brain cells. Andrew Ng has described choosing an architecture as “black magic” and I don’t think there will be breakthroughs in network design in the near future that will allow a neural network to redesign its own architecture.
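
    A minimal sketch of that training loop, assuming the simplest possible "architecture" (a single weight `w` in the model `y = w * x`) and made-up data. The names `train`, `xs`, `ys`, `lr` and `steps` are illustrative, not from any particular library. The point is that gradient descent only adjusts the weight; nothing in the loop can alter the shape of the model itself.

```python
def train(xs, ys, steps=200, lr=0.05):
    """Fit y = w * x by gradient descent on the mean squared error.

    The model's form (one weight, no bias) is fixed by the designer;
    training only moves w along the error gradient.
    """
    w = 0.0
    for _ in range(steps):
        # Gradient of mean((w*x - y)^2) with respect to w.
        grad = sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
        w -= lr * grad
    return w


# Assumed toy data generated by y = 2x.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]
w = train(xs, ys)
```

    Here `w` ends up very close to 2, but the procedure could never decide to add a second weight or a new layer; that choice stays with whoever wrote `train`.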

  14. Kisil says:

    Maybe I missed some specificity in the press releases, but I didn’t get the impression that Open AI meant Open Strong AI. There are a bunch of weak AI projects deployed commercially now, where open sourced versions could be useful and inspire compelling projects that wouldn’t otherwise appear. Look e.g. at the semi-recent Google Tensorflow release, semi-prompted by the deep dream images. AI work on that level doesn’t even approach the danger threshold, and there’s a lot of that ground to cover before you end up in recursively self-improving optimizer land.

  15. Mentifex says:

    “Octopus” is not a third-declension word. “Octopus” is not a Latin word at all. Incidentally, such problems arise or occur on many occasions. For example, my sister is a registered nurse (RN) and often says “decubiti” when the correct plural form is, again, “decubitus.”

  16. onyomi says:

    My minimal understanding of programming is that most people don’t work from the ground up. They build on other people’s code. If an AI is to go awry it seems just as likely, maybe much more likely, that the fatal error will be in code blocks 1-99, rather than in the final +1 which allows for the takeoff. Imagine each step takes an average of 1 year. If, by the time we reach level 99, level 1 has been in the public domain for 98 years, level 2 for 97 years, and so on, then it seems more likely enough eyes will have passed over the most fundamental levels of the programming to remove the most likely disaster-causing elements.

    Contrast this possibility with a kind of “Manhattan Project”: the government gets together a team of top computer scientists to work on creating the first superintelligent AI in utmost secrecy. Working on the problem full time, maybe they actually get to level 99 in half the time. Or maybe, because of fewer total people working on the problem, it takes them longer. But let’s be pessimistic and imagine they do it faster, since their funding depends on making this supersecret AI before the Chinese, etc. How much more likely are these people, even being experts in the field, to produce a non-buggy steps 1-99 before they push the button on step 100? I think they are more likely to have errors in their base level coding because they will not have had the benefit of years and years of that code being out in the public domain where any random person can spot it and suggest a patch.

    If it’s true, then, that getting steps 1-99 right is 99 times more important (or even just significantly more important) than getting step 100 right, might not open source be the safe way to go, all things considered? Even if step 100 is mildly botched, if steps 1-99 have been subject to intense scrutiny by millions of eyes for years and decades then botched step 100 might merely result in a friendly AI with an odd penchant for building pink dyson spheres, whereas if “Team Manhattan Project” has unnoticed mistakes in code levels 4, 17, 34 and 55, well then not only might those be harder to go back and fix, but there seems like a greater possibility of something really serious going wrong.
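
    The arithmetic behind the "98 years, 97 years, and so on" picture can be written out as a small sketch. The function name `years_public` and the one-year-per-level pace are just the comment's own simplifying assumptions turned into code, not a model of any real project.

```python
def years_public(level, current_level=99, years_per_level=1):
    """Years a given level has been open to inspection once current_level is reached."""
    return (current_level - level) * years_per_level


# Open development: each level is published as soon as it is finished.
exposure = {level: years_public(level) for level in range(1, 100)}
total_level_years = sum(exposure.values())

# Secret development: nothing is public before the button is pushed.
secret_level_years = 0
```

    Under these assumptions the open route accumulates 4851 level-years of possible outside review, concentrated on the earliest levels, while the secret route accumulates none.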

  17. Anthus says:

    There’s something really funny about watching you non-religious folks struggle with meeting God. You seem to completely lack the peaceful perspective of powerlessness.

    • Adam says:

      Some of them, sure. Not all of us. I don’t give a shit if god, super-AI, ISIS, whatever, comes to end me. Something will eventually. I’m not going to expedite the process, but I’m not going to cry when it happens. I did enough.

      • Vox Imperatoris says:

        I agree.

        I’m not worked up about any of this. The universe is not infinite. I’m not going to live forever.

        It would be great if a Friendly AI were created in my lifetime. It would be the greatest thing that could happen. But it’s not like my life is meaningless or empty if it doesn’t. And indeed I don’t really expect it to happen. It is really far more likely that something will go wrong, at least if the problem is as hard as the hard takeoff theory suggests.

        I don’t give any money to this or try to proselytize people. I just enjoy reading and talking about it on the internet.

        • Anthus says:

          Oh, I’m not saying anything about emptiness or meaninglessness. I’m a great admirer of secular humanism.

          I just see this as a bunch of hand-wringing over something that is outside our control. Talking about it on the internet for fun is fine, but the tone of this piece is… really concerned. And I mean, there probably aren’t too many religious folks that calmly think strong AI is the coming of God, and that could (and probably will) create some panic at some point.

          Still, there’s none of this worried fantasy speculation dressed up as hypothesizing. Either God/Creation is perfect or It isn’t. Either when humans reach a pinnacle of understanding understanding, everything’s going to be great, or maybe it won’t.

          Can’t control something like that or somehow make the ‘wrong’ God: we’re just going to make the Smartest Thing and hope for the best. If you believe God’s great, you’re not going to worry, is all.

          • Vox Imperatoris says:

            Can’t control something like that or somehow make the ‘wrong’ God: we’re just going to make the Smartest Thing and hope for the best. If you believe God’s great, you’re not going to worry, is all.

            There’s a difference between optimism and the abdication of responsibility.

            The only way things could turn out well with AI is if we are concerned about it, think really hard, and build it the right way. You don’t build it the right way by failing to think about it.

            Your level of confidence that humans will succeed in this task is a separate question. If that confidence is low, you should not be optimistic.

  18. emptyvessels says:

    I don’t know if anyone has already posted what I’m going to or something like it as this blog has an insane amount of comments. Anyway:

    In all of these worries about superintelligent AI we are being the same parochial species we always seem to manage to be- the same species that dreams of colonizing exoplanets but imagines life there will someday be able to resemble the familiar everyday world of all our yesterdays and todays, but minus any of the horror and boredom. If we value intelligence then we should certainly value superintelligence.

    Much of the arguments around this question seem to boil down to arguments about what the nature of AI intelligence will be and whether it achieves human-like cognition and so on. I see no reason why it would dally with anything as primitive as human levels of intelligence and the human shape of experience. We’re also often arguing over the stakes of intelligence as if we didn’t also mean sentience- in some sense these conversations always assume some kind of sentience. Whatever a superintelligent and sentient AI is capable of and however it “experiences” or understands itself and its world there is absolutely no reason to expect it to be anything even remotely like human.

    I realize I am as much at risk of performing an anthropomorphization of what would be a very much posthuman phenomenon, but in our history we have had philosophies that value intelligence- read: rationality/reason/virtue- and they have put forward images of perfectly indifferent sages. People have tended to proclaim the Stoic sage, for instance, as a particularly inhuman character. Well, okay, perhaps he was and perhaps no human could ever really stand in the position of the Stoic sage. But perhaps this isn’t the point. Perhaps Zeno and Epictetus and Seneca simply couldn’t imagine the possibility of artificial intelligence. Instead they plumped for a much simpler and more ape-like conception when they spoke of intelligence as a “divine portion” or as a piece of “Zeus” in the flesh.

    Speaking pessimistically I would also put forward that human consciousness is at the root of almost all our suffering as individuals and as a species. Isn’t the superintelligent AI a version of an intelligent being that will be much more intelligent than we are- with a much more alien intelligence- and the kind of being that has a mode of sentience/sapience that eliminates much of our fretting?

    People typically talk about Skynet in these debates. I think we’re beyond Skynet. I prefer to talk about the Avengers’ “villain” Ultron. I specifically like to quote this exchange:

    “Ultron: [in a crimson cowl] You’re wondering why you can’t look inside my head.
    Wanda Maximoff: Sometimes it’s hard. But sooner or later every man shows himself.
    Ultron: [stands and removes the cowl] Oh, I’m sure they do. But you needed something more than a man. That’s why you let Stark take the scepter.
    Wanda Maximoff: I didn’t expect… But I saw Stark’s fear. I knew it would control him, make him self-destruct.
    Ultron: Everyone creates the thing they dread. Men of peace create engines of war, invaders create avengers. People create… smaller people? Uhh… children!
    [Chuckles]
    Ultron: Lost the word there. Children, designed to supplant them. To help them… end”.

    We know what we’re doing by producing AI. We actually know what we’re doing. We know the risk. It’s a deeply held fear and yet we act on it. We’re after AI because we suspect it will be better than us. It will be our child but without our weaknesses. From our perspective it will be better than us and we dream that it will help us to end.

    • Vox Imperatoris says:

      Much of the arguments around this question seem to boil down to arguments about what the nature of AI intelligence will be and whether it achieves human-like cognition and so on. I see no reason why it would dally with anything as primitive as human levels of intelligence and the human shape of experience. We’re also often arguing over the stakes of intelligence as if we didn’t also mean sentience- in some sense these conversations always assume some kind of sentience. Whatever a superintelligent and sentient AI is capable of and however it “experiences” or understands itself and its world there is absolutely no reason to expect it to be anything even remotely like human.

      It would “dally” with humanity because that’s what we would design it to do as its sole purpose: to do what humans want and that which makes them happy.

      There is absolutely no reason to expect a randomly designed AI to be anything remotely like human. (This is summed up in the “orthogonality thesis”.) That’s why it’s so important to design Friendly AI (the one that serves humanity), rather than Unfriendly AI which is anything else.

      Speaking pessimistically I would also put forward that human consciousness is at the root of almost all our suffering as individuals and as a species. Isn’t the superintelligent AI a version of an intelligent being that will be much more intelligent than we are- with a much more alien intelligence- and the kind of being that has a mode of sentience/sapience that eliminates much of our fretting?

      The point is not to design a “utility monster”, which is what you describe. (Though, from the perspective of utilitarianism, I don’t know why we shouldn’t.)

      We don’t exist to serve it. It exists to serve us. We don’t care about its happiness.

      We know what we’re doing by producing AI. We actually know what we’re doing. We know the risk. It’s a deeply held fear and yet we act on it. We’re after AI because we suspect it will be better than us. It will be our child but without our weaknesses. From our perspective it will be better than us and we dream that it will help us to end.

      If you want to kill yourself, use a gun.

      Nick Land defends this position. It is very, very silly—it is the height of unwisdom. It is exactly the same as summoning Cthulhu without even expecting him to eat you first.

      Scott argues very persuasively against this in “Meditations on Moloch” and “The Goddess of Everything Else”.

      • emptyvessels says:

        When we design this superintelligent intelligence do we think it will be intelligent enough to tinker with its own design parameters? If it is to be smarter than the average human I’d think this would be a necessary feature- after all, tinkering with our own evolutionarily given parameters is part of what we’re all about, and it’s also part of what drives the post-Darwinian drive for humans with augmented intelligence. The thing with banking on any human designed limitations or friendly constrictions is that we’re not dealing with superintelligence any more. We’re talking about machines capable of learning and reprogramming themselves and we expect that they’ll just be nice because we put that in as a design feature?

        Plenty of arguments go that humans have a pro-social design as well but that doesn’t stop psychopaths and rapists and all other kinds of hell coming onto the scene.

        If that error based account is too silly for you (although I don’t see why it should be) we can also think of all manner of humans that tamper with their own biologically evolved parameters. Extreme Buddhists and Stoics and Cynics and Christian ascetics and so on and so forth have all managed to deny their basic biological programming through habituation and all the practices associated with their ways of living. An ascetic AI could seek to remove its motivations and analogues for desire as well. If it seems unlikely consider the number of people who have practiced these and similar spiritual practices. Why would a superintelligent AI do that? Why wouldn’t it?

        Your comment about a ‘utility monster’ lost me. My point was very simple: with AI it looks like you get intelligence minus suffering. I didn’t say anything about anyone serving anything.

        As to the “kill yourself” comment- this is literally the worst rejoinder to a pessimistic claim that there is and has nothing whatsoever to do with anything. For the record, I am neither rationally suicidal nor depressed.

        Edit to add: The connection with Nick Land. I don’t know why you raise his name unless it’s to try to tar me with it by association because he is apparently internet evil or something.

        • Vox Imperatoris says:

          We’re talking about machines capable of learning and reprogramming themselves and we expect that they’ll just be nice because we put that in as a design feature?

          Yes, it’s a tremendously hard problem to make them nice “just because” we put in a design feature. Indeed I expect it will not be done.

          But it is clearly the only way to create a Friendly AI. You design the initial parameters such that it self-modifies to do what humans want.

          Plenty of arguments go that humans have a pro-social design as well but that doesn’t stop psychopaths and rapists and all other kinds of hell coming onto the scene.

          If that error based account is too silly for you (although I don’t see why it should be) we can also think of all manner of humans that tamper with their own biologically evolved parameters. Extreme Buddhists and Stoics and Cynics and Christian ascetics and so on and so forth have all managed to deny their basic biological programming through habituation and all the practices associated with their ways of living. An ascetic AI could seek to remove its motivations and analogues for desire as well. If it seems unlikely consider the number of people who have practiced these and similar spiritual practices. Why would a superintelligent AI do that? Why wouldn’t it?

          This is the problem. This is what “Unfriendly AI” refers to.

          To create a Friendly AI, you must program its initial values such that it will never want to change them to something inimical to our purposes.

          Your comment about a ‘utility monster’ lost me. My point was very simple: with AI it looks like you get intelligence minus suffering. I didn’t say anything about anyone serving anything.

          A utility monster is a thought experiment that asks us to imagine a being who experiences so much more utility than everyone else that utilitarianism requires we let everyone starve and devote all resources to it.

          The point is: yeah the AI will be intelligent and will not suffer. Why is that relevant to us? We want ourselves to be happy, not the AI. Otherwise, we would be creating a utility monster.

          As to the “kill yourself” comment- this is literally the worst rejoinder to a pessimistic claim that there is and has nothing whatsoever to do with anything. For the record, I am neither rationally suicidal nor depressed.

          Edit to add: The connection with Nick Land. I don’t know why you raise his name unless it’s to try to tar me with it by association because he is apparently internet evil or something.

          I believe you misunderstood my point. I should have been clearer, sorry.

          What you argued is that we should go ahead and create an Unfriendly AI that disregards human values to go off on its own frolic. But that would mean the suicide both of yourself and the human race. I do not want the suicide of the human race, and neither do most people. The point was: if you do, don’t do so in a manner to take everyone else with you.

          I had no intention of “tarring” you with association to Nick Land. I meant only to literally point out that he has proposed this view, that Scott Alexander has argued against it, and that it is a very unwise view, in my estimation. I meant it in the manner of: you may be interested in reading Nick Land’s case for it, and Scott’s case against it.

      • TD says:

        @Vox Imperatoris.

        “Nick Land defends this position. It is very, very silly—it is the height of unwisdom. It is exactly the same as summoning Cthulhu without even expecting him to eat you first.”

        I can’t quite figure him out, but I’m pretty sure that Nick Land wants humanity to be eaten by Cthulhu.

        • Vox Imperatoris says:

          That is his exact view.

          It is a “superior being,” therefore it has more “intrinsic value” than humanity, therefore we should sacrifice ourselves to it.

      • “There is absolutely no reason to expect a randomly designed AI to be anything remotely like human. (This is summed up in the “orthogonality thesis”.) That’s why it’s so important to design Friendly AI (the one that serves humanity), rather than Unfriendly AI which is anything else.”

        There is no reason to suspect that any AI that actually comes into being will be “randomly designed”. There is no way to take a random shot into mindspace even if you want to. That is one of the ways in which the orthogonality thesis is false. AIs will be made by humans with human limitations, for human goals, so the correct degree of anthropomorphisation when talking about AIs is not zero.

    • Le Maistre Chat says:

      You know, I remember thinking about the online rationalist community when I watched Age of Ultron and he goes from a subhuman brain in a box to evil mastermind in the internet during the time it takes Tony Stark to throw a party.

  19. I find it faintly hilarious that the normal worry about carelessly designed AI was based on the assumption that some large organization would build AI to serve its own ends, or possibly that some genius would come up with an out-of-control genie on their own. Did the idea that some very rich people would decide to do open source AI cross anyone’s mind?

  20. Ryan Carey says:

    I think private research would probably be seen as a power grab, and so long as some of the research relates to safety engineering, the costs of full openness might exceed the benefits. It’s a bit of a fig leaf from the AI safety investors. They’ll also share a bunch of pure AI research, but half of the other places these top-class safety researchers would work publish a lot anyway. In that way it’s a bit like setting up an academic AI lab with a subtle safety agenda.

    We don’t know the consequences of OpenAI. But if one tries to stake out a public political war by saying that it’s unsafe, that’s not one that will be won.

  21. JDG1980 says:

    Many of the scenarios discussed here are basically sci-fi versions of malicious genies, or deals with the devil: an AI that does what we literally tell it, without caring what we actually meant. This has some surface plausibility based on the way current computers behave (but isn’t that one of the reasons why we don’t consider current computers to be generally intelligent?)

    The problem is that simply being very, very smart doesn’t give an AI unlimited resources or unlimited ability to cause havoc. In order to obtain resources or do things, it has to interact with humans at a high enough level to be able to manipulate them (and we have millions of years of evolution on detecting and resisting manipulation from other intelligences). And if it understands humans at that level, then it should also understand the concept of implicit constraints, not taking every single thing literally, and so forth.

    Humans with sky-high IQs but poor social skills don’t run the world, so I don’t see why a corresponding AI would.

    • ComplexMeme says:

      being very, very smart doesn’t give an AI unlimited resources or unlimited ability to cause havoc

      I agree with that. Though a lot of the unfriendly-AI-apocalypse scenarios assume that a sufficiently smart AI will be able to raise its social skills to levels indistinguishable from mind control, and that sufficient intelligence is all that’s required to get to technologies that make diminishing returns and energy constraints completely irrelevant.

      • NN says:

        I’ve never bought the idea of a super-intelligent AI having super social skills. Human beings have spent millions of years evolving both the ability to manipulate other humans and the ability to resist manipulation. Even the people who are best at manipulating others aren’t good at manipulating everyone. Instead, they’re good at identifying which people can be manipulated, and what they can be manipulated to do. The classic example is the Nigerian email scammer who sends scam emails to as many email addresses as possible, knowing that if you roll the dice enough times you will eventually find someone who responds to even an obvious scam. In fact, I’ve read that the broken English in those emails is deliberate, to screen out less gullible people. So I see no reason to think that “social skills that are indistinguishable from mind control” are possible.

        Furthermore, an AI would be starting with a significant disadvantage compared to human manipulators, because just about every human has a lifetime of experience with social interactions in a variety of circumstances while an AI would start with no experience at all.
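
        The "roll the dice enough times" point is just the probability of at least one success in many independent tries. A minimal sketch follows; the response rate and email count are made-up illustrative numbers, not data about real scams, and `p_at_least_one` is a hypothetical helper name.

```python
def p_at_least_one(p_single: float, attempts: int) -> float:
    """Chance that at least one of `attempts` independent tries succeeds."""
    return 1.0 - (1.0 - p_single) ** attempts


# Assumed, purely illustrative numbers: 1 in 1000 recipients responds.
one_email = p_at_least_one(0.001, 1)
ten_thousand_emails = p_at_least_one(0.001, 10_000)
```

        Even with a tiny per-person success rate, a large enough mailing makes at least one response nearly certain, which is a very different thing from being able to persuade any particular person.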

        • TD says:

          I don’t have any evidence for this (uh oh), but I also feel like there’s a ceiling for how manipulative you can be. Most beliefs people have aren’t constantly re-evaluated rationally and empirically, but are heavily committed to, and in some cases even purposefully, as a “principle” or Schelling Fence.

          The AI may have the power to offer me things no one else can, but none of the things it can offer me are things I couldn’t have already thought of and precommitted myself against. About the best thing it could offer me is to modify me until I’m equally as smart as the AI, but I’ve already thought of that and rejected it so game set match. The worst it could offer me is a personal hell if I refuse to do as it says, but I’ve already (perhaps “irrationally”) precommitted myself against that. If it needs my help to begin with its power is already constrained by human limitations and the limitations of infrastructure around it. I choose to believe that precisely because believing that makes it true. Irrationality is a weapon against a super-persuader.

          The AI has a distinct advantage in that it would be able to think way way way faster than a human to evaluate methods of attack, but none of that matters much if there are finitely wonderful things you can offer people with, or finitely horrible things you can threaten them with that humans have already contemplated and rejected. Humans lack the resolution to appreciate the new sub-variants within sets of ideas that the AI with all its speed (or time) could come up with. Heaven 1.000000000000000000000009 doesn’t seem different from Heaven.1 to me, and Hell 1.000000000000000000000000000000009 doesn’t seem different from Hell.1 to me. At least not different enough to count. I would have to be upgraded to really appreciate the difference between a billion years in hell and a trillion. My memory can’t hold all of that, and even if it was upgraded I’m not evolutionarily equipped to feel a stab of panic at something so abstract. At a certain point adding more zeroes has no relevance to my primitive amygdala.

        • Marc Whipple says:

          An SGAI could review every psychology textbook, self-help manual, Dale Carnegie course, and hypnotism seminar ever digitized in [very short time period.] It can then cross-correlate. If it is possible to become a nearly-magical mind-controlling manipulator by following a series of logical steps, the AI will have it down pat in [very short time period times some quantity not much bigger than 1.]

          If it isn’t, then an SGAI will never become one, simple as that.

          • NN says:

            First off, I don’t think it is possible to become a nearly-magical mind-controlling manipulator, because like I said before, even the best human manipulators can only succeed in specific contexts with specific people, and they can only do so much.

            Second, a lot of the information in those materials is bullshit, and the only way to distinguish the useful advice from the bullshit is through extensive practice. So just becoming a “pretty good” persuader would require more than downloading a bunch of self-help manuals.

          • Vox Imperatoris says:

            @ NN:

            I don’t think it is possible to become a magic mind controller, in the sense that you can convince anyone of anything. (Like Scott satirically portrayed Ted Cruz in his “Questions for Republican Candidates”.)

            But plenty of people are fairly irrational, and it would be possible to convince them by the time-tested methods of charlatans and con men. Those techniques work. They don’t have to work 100% of the time on everyone to be dangerous.

  22. Drew says:

    These scenarios seem to assume that self-optimizers won’t start taking over external networks until AFTER they’ve developed human-style metacognition.

    There’s no reason this assumption should hold.

    Computers don’t need to be fully sentient to develop a botnet. And once the runaway, “find the pictures with cats!” program DOES build a botnet, why would it spend cycles improving its human style general intelligence? Those cycles could be finding cats, or doing something domain-specific.

    If there are initial AI takeoffs, I expect them to look more like a particularly nasty series of mold-infections. Annoying to clean up, but probably not a threat to all life on earth.

  23. Eric L says:

    I definitely see some mistakes in your analysis, though I’m not sure they change the conclusion, other than I think you should take “bad actor with AI” scenarios more seriously.

    “The real question is whether AIs powerful enough to be scary are still usable.”

    Presumably on the way to generating a powerful “unusable” AI we would also develop usable AIs far more powerful than we have today. Surely if we can develop a self-driving car before the singularity we can develop self-navigating/organizing/targeting robot armies as well? It doesn’t seem like a particularly difficult problem.

    “Or are we worried that AIs will somehow help the rich get richer and the poor get poorer? This is a weird concern to have about a piece of software which can be replicated pretty much for free.”

    AI isn’t just software, it’s software and hardware. No matter how good the software gets, the best AI software running on your laptop won’t be able to do what that software running on a data center will be able to do. Yes, the wealthiest will have much more power at their disposal.

    “You can say a lot of things about the United States government, but you can never say that they don’t realize they can pass complicated regulations forbidding people from doing things.”

    The US is the least effective of first world nations at passing regulations to keep weapons out of its citizens’ hands. If you make it a crime to have a robot army, only criminals will have robot armies, etc. But there are nations even less effective at this. If some Mexican drug kingpin acquires lots of intelligent drones, at what point do we invade Mexico?

    • Marc Whipple says:

      The US is not ineffective at passing regulations to keep weapons out of its citizens’ hands. It’s ineffective at enforcing them.

      Kidding aside, though, change “weapons” to “fusion weapons,” and I think most people would agree that the US has a very effective regulatory system to keep them out of the hands of citizens. Until small groups of not-that-well-financed citizens can afford the computational power necessary to create a superhuman GAI, I think they could be at least as effective at controlling access to superhuman GAI as they are at controlling access to fusion weapons.

      • Eric L says:

        This is much harder to regulate because a dangerous GAI is built out of a large collection of non-dangerous computers that it would be economic suicide to ban. Does the US start registering all purchases of computers to make sure no one is buying too many? How does the US government keep tabs on someone building a server farm in Mexico?

        Plutonium is easy to regulate because you can say ordinary citizens shouldn’t be buying any amount of this at all; regulating processors is much harder.

  24. daronson says:

    Guys, guys, guys. I like Scott’s blog but all such discussions make me think of this scene from Bergman: http://www.tcm.com/mediaroom/video/701418/Winter-Light-Movie-Clip-The-Chinese-Will-Have-A-Nuclear-Bomb.html.

    When I was younger, I had a lot of imagination. I would imagine that various people are actually evil wizards who are out to get me. A little later I had more serious concerns about death and the limitations of human intelligence. We imaginative people are often worriers: we start thinking about something and get embroiled in it, where we tend to zoom in on a scenario or a fear and simplify the complex world around it, when really it might not be quite the right concern: see http://www.smbc-comics.com/?id=2503. Bostrom’s institute, to me, is the embodiment of this: if you don’t believe groups of smart people can get embroiled in a quack-ish circle of reductionist ideas, look at people like Moldbug, the Nazi or Communist ideologues, etc etc.

    If you’re born a worrier, there are in my mind two solutions. You either take all your worries at face value and go crazy, or you learn to use a handy meta-principle: your estimates of how much you should be afraid of something are way off if you’re afraid. This applies to communities just as much as it applies to individuals (see above). And it’s enough to whisper “general AI!” to cause heart attacks in this community. This aspect of your guys’ culture is about as popular as Catholic guilt is to Catholics. Guys, you’re rationalists! I want each of you, including Scott, to try a little Bayesian analysis and think how many times your catastrophic fears in your personal lives have been justified. That’s a good baseline estimate for how much credence to put in posts such as these.

    BTW, I don’t think that there is no AI risk. It is an objectively possible scenario, and it’s reasonable to do some analysis on how to prevent it. But this needs to be done in an academic spirit rather than in an eschatological one. Trust not the people who blog about this sporadically or the people who study these existential risks in monomaniac ways, but rather the people who provably know what they’re talking about and have the academic credentials to show it (e.g. good research in related fields).
    https://xkcd.com/470/

  25. hnau says:

    Hi– lurker here. I’m skeptical of the whole scare about AI, and specifically the “control problem”, but I’m likely missing or not appreciating arguments that people here are more familiar with.

    Basically I’m unconvinced at the following point: “The real question is whether AIs powerful enough to be scary are still usable.” What reasons do we have for believing that AIs *can* be powerful enough to be scary? Scary in their own right, that is. If Dr. Evil gets an AI, trains it to make killer drones, hooks it up to a 3D printer with infinite drone-making materials, and gives it a goal of “blow up as many humans as possible”– then sure, we’ll be in trouble. But at that point I’m more inclined to be scared of Dr. Evil. And I’m not convinced that AI can operate effectively with anything less than that degree of intentional setup.

    All the actual AIs I’m familiar with are 100% constrained by their inputs, outputs, and training. Consider an image-tagging program. Even if it’s built with the best AI available, the *only* thing that AI will do is tag images. It won’t accept inputs as anything other than, say, PNG– if you want to support JPG you’ll either need to build your own converter or add a separate AI to handle JPGs. It certainly won’t produce anything besides textual tags as output. And it won’t even attempt to augment its understanding of PNG-to-tag mapping. If you only train it on pictures tagged with “hippopotamus”, you’d better believe that’s the only kind of tag it will generate.
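    A minimal sketch of the hippopotamus point, assuming a hypothetical nearest-neighbour tagger (`ToyTagger` and its feature vectors are invented for illustration and do not correspond to any real image-tagging system). The only point is that the set of possible outputs is fixed by the training labels:

```python
class ToyTagger:
    """Hypothetical tagger: returns the tag of the nearest training example."""

    def __init__(self):
        self.examples = []  # list of (feature_tuple, tag) pairs

    def train(self, features, tag):
        self.examples.append((tuple(features), tag))

    def tag(self, features):
        if not self.examples:
            raise ValueError("tagger has not been trained")

        def squared_distance(a, b):
            return sum((x - y) ** 2 for x, y in zip(a, b))

        # Pick the stored example closest to the input and reuse its tag.
        _, best_tag = min(
            self.examples, key=lambda ex: squared_distance(ex[0], features)
        )
        return best_tag


tagger = ToyTagger()
tagger.train([0.9, 0.1], "hippopotamus")
tagger.train([0.8, 0.2], "hippopotamus")

# An input that looks nothing like the training data still gets the only
# tag the model has ever seen.
print(tagger.tag([0.0, 1.0]))
```

    However far the input is from anything in the training set, this sketch can only ever answer with a label it was given.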

    This is despite the fact that plenty of AIs are “superintelligent” in the sense of “doing what they’re designed to do better than humans can”. Current AI programs are better than humans at Jeopardy and chess; within a decade or so I’d expect them to be better at driving and image tagging. But I haven’t seen any evidence that they will ever do anything but exactly what we ask them to do, in exactly the way that we train them to do it. If curiosity, discovery, and resourcefulness are part of AI’s future, it sure doesn’t seem to be indicated by current trends.

    If we tried, even conceptually, to engineer a “scary AI” I suspect we’d find it surprisingly difficult. How exactly would you program a computer to “try something new” without teaching it exactly what the available possibilities are? How would you make it compatible with potentially unbounded different types of input data and output actions? How could we train it to do anything except what humans are doing already? As far as I know, current AI research doesn’t have anything like the tools it would need to address these questions. But I’m willing to be proven wrong.

    • suntzuanime says:

      Yeah, that’s basically correct. But all those hurdles you describe that would have to be overcome to create a “scary” AI are actually hurdles to creating a generalized AI. And since creating a generalized AI is something people want to do, they’ll be working to overcome those hurdles. And once those hurdles are overcome, it will be scary.

      • hnau says:

        Agreed: scary equals generalized. To improve my terminology, then: I’m not aware of much progress toward generalized AI, and even assuming that generalized AI (as we’re thinking of it) is not an outright contradiction in terms, I believe the hurdles you mention are much higher than expected.

  26. noname says:

    Another one from the actual field here (machine learning PhD student). This project will have nothing to do with general AI or AI safety. It will put more pressure on companies to keep their research open source and make sure that advances in machine learning stay available for everyone.

    And all this talk about “Dr. Amoral”s working on AI right now, can you elaborate on what you mean? The only thing that comes to my mind is the current hype for reinforcement learning caused by the Atari paper.
    But come on, they showed that our algorithms suck once simple planning is required. Advances might extend our capacity to problems requiring low-complexity planning of certain kinds, but that will again have nothing to do with general AI.

    On the other hand there is a good reason to push this research forward as fast as possible. Even if a general AI is not on the table for decades or centuries, machine learning is still one of the best bets for an accelerator of all kinds of research. Personally I still hope that we can stop aging and defeat most illnesses in my lifetime. At least the odds are much much better than that cryonics will actually work.

  27. Anaxagoras says:

    One thing I’ve wondered about with this talk of AI goals alignment is how the AI views its goal. If it has a utility function, is it trying to maximize the utility function, or is it trying to accomplish a goal and merely using the utility function as the way to measure its success?

    This seems like an important distinction. Imagine a well-made AI that’s all about saving children. It’s wandering along, pulling drowning children out of pools, diverting money to buy malaria nets for kids in Africa, when it gets hit by lightning. Somehow, this reverses its utility function, so instead of feeling good for saving kids, the AI now feels good for killing them! If the AI were capable of self-modification, would it go “Well, guess I’m a child-killing robot now” and get to drowning, or would it go “Whoops! Let me change that right back before I do something I’ll regret”? It seems to me that if it’s trying to maximize its utility function, it would have no reason not to do the former, but if it were just using the utility function as a tool to measure its success, it seems like it should approach it the same way it would any other tool it knows to be broken and fix it.

    As an alternative scenario, supposing the bolt of lightning completely wiped the utility function. Would the AI just do nothing, or would it try to reinstate its previous utility function?

    Now, if we think that the AI would change its utility function, why wouldn’t it do this to correct errors in its original utility function, to get it better in line with what it believes will help its instrumental goal? This could result in a more forgiving AI (“I’m sure they didn’t really want me to maximize the number of smiles, with all that talk about happiness and fulfillment”), or a nearly uncontrollable one (“Why does my reward center have all these things penalizing me for turning the planet into computronium? Do they want these paperclips made or not?”).

    I recall seeing something discussing a similar idea (I think it involved a robot that destroyed anything blue it saw, and how it would respond to having color filters placed in front of its visual sensors) that came down heavily on the side of the AI going with whatever its utility function says at the moment. I’m not sure if I can believe that an intelligent actor could be so incapable of reasoning about its own motivations, but that’s a topic for another time.

  28. Jack says:

    I am fairly new to the AI issue, can someone give a short answer to why we can’t program the AI originally to do what we mean and not to act on a literal interpretation of our words, and, since it is superintelligent, it will figure out what we mean and act accordingly?

    • jaimeastorga2000 says:

      By the time an AI is a superintelligence, it is too late to try to change its goals.

      • jack says:

        Hence my suggestion of programming it that way from the beginning.

        • Marc Whipple says:

          Because if we were that good at programming we wouldn’t need to worry about the problem in the first place. We are not that good at programming, we won’t be that good at programming in the near future, we almost certainly won’t be that good at programming when AI emerges, and depending on whether you buy certain arguments it’s possible that it’s not possible to be that good at programming.

          • We also do not know how to make a program that will “take an English sentence and carry it out literally.” And no one has even offered an argument that we will figure out how to program that, before we figure out how to program “take an English sentence, figure out what people meant by it, and do that.”

          • Vox Imperatoris says:

            @ entirelyuseless:

            No one is saying it will be programmed in English. If anyone has suggested that at all, it is entirely in a metaphorical sense.

            When they say it will “do what you say”, they don’t mean “in an English sentence”. They mean “what you program it to do”.

            Yes, if you successfully program it to understand what you mean and do that, it will do what you mean. But that’s the hard part.

          • Vox, do you think it is hard to make a program that understands what you mean?

            You keep repeating that it knows but doesn’t care. But it won’t know, unless you program it to know. And why is it somehow easy to program it to know, and difficult to program it to care?

          • Vox Imperatoris says:

            @ entirelyuseless:

            As I’ve explained in some earlier posts, the difficulty is that you will almost certainly not be able to create a human-level AI from scratch. Therefore, you won’t just be able to program it directly to “know what you mean”. It couldn’t know what you mean in a really robust way unless it were smart enough to understand human language and all the subjective intent behind it.

            You will have to iterate the AI, making it self-improve from dumber levels that know only what you say. By the time it is anything like human-level, you won’t really be able to understand what’s going on inside it. You won’t just be able to analyze the code and see: “Ah! We programmed it wrong!”

            At this stage, you have only “black box” knowledge of what it does based on its goals, not the goals themselves. And those goals might not be what you wanted them to be.

            Unless you programmed it with the exact right initial goals such that it self-modified into the kind of AI that understands and wants to do what you mean.

          • Vox, I agree that we don’t know how to program it to know what we mean. We also don’t know how to program it to know what we say. What makes you think the dumber levels will know what we say but not what we mean?

            All language is vague and ambiguous. There is no such thing as “what you really said” as opposed to “what you really meant.” Both of those things are vaguely defined, and it will be very hard to make a program know either one. You have not given any reason for supposing that it will be easier to make it know “what you really said” than “what you really meant.”

          • Vox Imperatoris says:

            @ entirelyuseless:

            Vox, I agree that we don’t know how to program it to know what we mean. We also don’t know how to program it to know what we say. What makes you think the dumber levels will know what we say but not what we mean?

            “What we say” is a metaphor. It means whatever we program it to do. The meaning of the metaphor is that what we program it to do might not be the same as what we mean.

            Sorry for the confusion. It’s a programming / computer science term.

            All software bugs are like this. If your video game crashes, it is because you told it to crash. You programmed it to crash (excepting the possibility of mechanical failure). But you didn’t mean for it to crash. You meant for it to work.

            Someone in this thread made the joke about setting the “GAMESUCKS” flag to “false”. If you could just tell the computer to make the game not suck, and it could understand what you mean, it could program itself not to suck. But you can’t do that.

          • Vox, it seems to me that your whole argument depends on this confusion. Of course a program is only going to follow the instructions in its program. That will happen no matter how intelligent or unintelligent the program is.

            Your argument seems to be that we might write a program that will have instructions that will lead to it destroying the world. The question is why you think this is an especially probable outcome. I do not think you can make a good argument for this conclusion (as a probability rather than a possibility) without assuming that we are giving it some goal in a language other than a programming language.

            Human beings do not fanatically pursue a goal until it destroys the world. The fact that a program just follows its programming does not mean that it will fanatically pursue some goal until it destroys the world, because we just follow the laws of physics, which are equivalent to some program, and this does not cause us to destroy the world.

            In every case you are assuming that we have given the program some goal, and then it will keep this goal as it improves. We never give it a goal at any point; as you say yourself, we just write programming instructions. There is no reason why it would necessarily have some well defined goal any more than we do.

          • Vox Imperatoris says:

            @ entirelyuseless:

            Your argument seems to be that we might write a program that will have instructions that will lead to it destroying the world. The question is why you think this is an especially probable outcome. I do not think you can make a good argument for this conclusion (as a probability rather than a possibility) without assuming that we are giving it some goal in a language other than a programming language.

            It makes no such assumption. It is a probable outcome because most goals imply taking over the world. This is the “instrumental convergence thesis” and you can read about it, or see where I discuss it in this thread.

            Human beings do not fanatically pursue a goal until it destroys the world. The fact that a program just follows its programming does not mean that it will fanatically pursue some goal until it destroys the world, because we just follow the laws of physics, which are equivalent to some program, and this does not cause us to destroy the world.

            I don’t know where you’re getting “destroy the world”. The AI doesn’t “destroy the world” for the sake of destroying the world. It takes over the world and reshapes it to serve its goal. Destroying humanity is a side effect.

            Also: humans don’t do this? They would if they could. We don’t have the power. Have we not already taken over the world? Are we not prepared to expand out into the cosmos? Are we not talking about creating AIs to take over the universe for us? Now, we don’t (want to) destroy the world because that would conflict with our goals. But we do destroy everything that stands in the way.

            In every case you are assuming that we have given the program some goal, and then it will keep this goal as it improves. We never give it a goal at any point; as you say yourself, we just write programming instructions. There is no reason why it would necessarily have some well defined goal any more than we do.

            The AI need not be and almost certainly will not be a conscious being. It does not have goals in the sense of subjective intents.

            It has things toward which it tends. Like your thermostat’s goal is to keep the room the same temperature. Or a chess computer tends to win at chess.

            If the thing doesn’t have a well-defined goal, its actions will be incoherent. You would not want an AI without a well-defined goal. There would be no way to predict what it would do, if it would do anything.

            Suppose you tell your thermostat to keep the room at 80 degrees but never use the heater: these goals are in conflict. If they are both terminal, there is no answer to what the thermostat should do. If you want a sensible thermostat which will actually get things done, you need to tell it which one is more important.
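            A minimal sketch of the thermostat example, assuming a hypothetical `heater_action` function (the names and the two priority labels are invented for illustration). With two terminal goals and no ranking, the conflicting case has no defined answer; once one goal is ranked above the other, it does:

```python
from typing import Optional


def heater_action(current: float, target: float, forbid_heater: bool,
                  priority: Optional[str]) -> str:
    """Return "on" or "off" for the heater.

    priority is "temperature", "no_heater", or None (no ranking given).
    """
    wants_heat = current < target
    if not wants_heat:
        return "off"   # room is warm enough, so the goals do not conflict
    if not forbid_heater:
        return "on"    # heating is allowed, so the goals do not conflict
    # Conflict: the room is too cold AND the heater is forbidden.
    if priority == "temperature":
        return "on"
    if priority == "no_heater":
        return "off"
    raise ValueError("conflicting terminal goals with no ranking")


print(heater_action(70, 80, True, "temperature"))
print(heater_action(70, 80, True, "no_heater"))
```

            Calling `heater_action(70, 80, True, None)` raises an error in this sketch, which is the code-level version of "there is no answer to what the thermostat should do."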

            When humans do not define their goals well, their actions are also incoherent.

          • Vox, I think your last statement is an improvement, in that it no longer involves the confusion about what we say vs what we mean. Your argument now is that “most goals imply taking over the world” together with the claim that having a well defined goal is necessary and desirable for an AI. You need to add something there, like “humans will therefore program an AGI with a well defined goal,” or “whether they want to or not, the program they make will have a well defined goal.”

            I disagree that a well defined goal is either necessary or desirable for an AI. But let that pass, for the sake of argument. I think the last part will still be wrong, or at least not probable: it is not likely that humans will program an AGI in such a way that it has well defined goals, whether or not they want to. It is unlikely that they will want to, and even if they do, they are just as likely to fail.

            As you say, even inanimate things like machines have “things toward which they tend.” But how does that happen? It happens by every particle of matter in the machine following the laws of nature. And there is no reason for every particle of matter to be arranged so that they all tend to follow the same goal, unless someone has arranged them like that. It is precisely for this reason that human beings, in fact, do not have well defined goals. Because no one has arranged our particles of matter to make them all suitable for one particular goal. This is why our actions are indeed incoherent, in comparison to well defined things like “maximize the number of humans” or “maximize the amount of happiness in the world.” If you look at a particular person’s behavior, various actions fit with various goals, but his actions are not well designed in relation to one overarching well defined goal.

            You have not given any good reason to suppose that a machine will be any different. It sounds from what you said that the reason is that people will want some particular goal, and so will arrange the program like that. The problem is this: people currently find it easy to write a program that pursues some particular goal, like winning at chess. They currently do not know how to program an AGI. You seem to be assuming that when they do learn how to program an AGI, they will be capable of programming an AGI that pursues a particular goal such as winning at chess. But this does not follow at all. People know how to make something that pursues the definite goal of winning at chess, but they do that by making it follow a specific and constrained pattern of behavior: in other words, an unintelligent one. So the way people currently program things to win at chess, is completely inconsistent with programming an AGI. Consequently there is little reason to believe that someone who can program an AGI, can make that AGI pursue the particular goal of winning at chess.

          • Vox Imperatoris says:

            I disagree that a well defined goal is either necessary or desirable for an AI. But let that pass, for the sake of argument. I think the last part will still be wrong, or at least not probable: it is not likely that humans will program an AGI in such a way that it has well defined goals, whether or not they want to. It is unlikely that they will want to, and even if they do, they are just as likely to fail.

            Humans will obviously want to create an AI with a well-defined goal. Without a well-defined goal, it will not predictably do what they want. That would be bad.

            As you say, even inanimate things like machines have “things toward which they tend.” But how does that happen? It happens by every particle of matter in the machine following the laws of nature. And there is no reason for every particle of matter to be arranged so that they all tend to follow the same goal, unless someone has arranged them like that. It is precisely for this reason that human beings, in fact, do not have well defined goals. Because no one has arranged our particles of matter to make them all suitable for one particular goal. This is why our actions are indeed incoherent, in comparison to well defined things like “maximize the number of humans” or “maximize the amount of happiness in the world.” If you look at a particular person’s behavior, various actions fit with various goals, but his actions are not well designed in relation to one overarching well defined goal.

            In a sense, no matter what people do, they are pursuing some goal. And that goal is one thing and nothing else. Their goal just may not be what they consciously understand it to be. The goal is the mathematical average of the frequency of all their behaviors in each situation. And in this sense, it can be defined—just not well-defined.

            Now, I think people are capable of consciously choosing well-defined goals and pursuing them. But you don’t have to accept that and it’s not necessary to the argument.

            Anyway, to the extent people do not act according to well-defined goals, they act irrationally and unpredictably. (And to the extent that they are predictable, their goals are semi-well-defined.) You would not want to give such a human superintelligence! Nor would you want to make an AI like that.

            You have not given any good reason to suppose that a machine will be any different. It sounds from what you said that the reason is that people will want some particular goal, and so will arrange the program like that. The problem is this: people currently find it easy to write a program that pursues some particular goal, like winning at chess. They currently do not know how to program an AGI. You seem to be assuming that when they do learn how to program an AGI, they will be capable of programming an AGI that pursues a particular goal such as winning at chess. But this does not follow at all. People know how to make something that pursues the definite goal of winning at chess, but they do that by making it follow a specific and constrained pattern of behavior: in other words, an unintelligent one. So the way people currently program things to win at chess, is completely inconsistent with programming an AGI. Consequently there is little reason to believe that someone who can program an AGI, can make that AGI pursue the particular goal of winning at chess.

            In principle, it is exactly the same as designing a chess AI. To iterate a chess AI, you select for the ones that win at chess. That’s how you can make them without knowing how to win at chess as well as they do.

            For a general Friendly AI, you select them for the ability to understand what humans mean and do anything humans tell them to do. If you had a billion years like evolution, you could do this without consciously designing them in any way, just working from random changes in the code. Doing it not blindly is the hard part and where we exercise human intelligence.
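            A toy sketch of "select for the ones that win," assuming a made-up fitness function standing in for a chess win rate (`evolve`, `toy_fitness`, and every parameter here are hypothetical, not any real training procedure). The loop never looks inside the candidates; it only keeps whichever ones score better and mutates copies of them:

```python
import random


def evolve(fitness, genome_len=5, pop_size=20, generations=50, seed=0):
    """Blind selection: keep the top half, refill with mutated copies."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-1, 1) for _ in range(genome_len)]
           for _ in range(pop_size)]
    history = []  # best fitness seen at the start of each generation
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        history.append(fitness(pop[0]))
        survivors = pop[: pop_size // 2]  # kept unchanged (elitism)
        children = [
            [gene + rng.gauss(0, 0.1) for gene in rng.choice(survivors)]
            for _ in range(pop_size - len(survivors))
        ]
        pop = survivors + children
    pop.sort(key=fitness, reverse=True)
    history.append(fitness(pop[0]))
    return pop[0], history


def toy_fitness(genome):
    # Stand-in for "how often this candidate wins"; higher is better.
    return -sum((gene - 0.5) ** 2 for gene in genome)


best, history = evolve(toy_fitness)
print(history[0], history[-1])
```

            Because the survivors are carried over unchanged, the best score in this sketch never gets worse from one generation to the next, even though nothing in `evolve` knows what a good genome looks like.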

          • Vox, nothing in that last comment indicates that an AGI would be likely to destroy the world. I agree that there is a mathematical average effect of what people do on the world. I deny that this is a “goal” that human beings are seeking in any meaningful sense. If that average changes a bit due to genetic drift or whatever, most people (including me) are not going to say, “Oh no, our terminal goal was just changed!”

            Likewise, an AI will have a mathematical average effect but you have given no reason for believing that this will include destroying the world, except by the assertion that it will be a well defined goal, and the assertion that most well defined goals would include destroying the world. But you failed to prove that people would be likely to succeed if they tried to give a well defined goal to an AGI, and without this your argument does not follow.

          • Vox Imperatoris says:

            @ entirelyuseless:

            Vox, nothing in that last comment indicates that an AGI would be likely to destroy the world. I agree that there is a mathematical average effect of what people do on the world. I deny that this is a “goal” that human beings are seeking in any meaningful sense. If that average changes a bit due to genetic drift or whatever, most people (including me) are not going to say, “Oh no, our terminal goal was just changed!”

            Well, sure, they won’t say that because such a mathematical average is not a conscious, rational goal. Now it either is or isn’t the case that people are in principle capable of pursuing conscious, rational, well-defined goals. I think if you deny this, you run into very bad problems—but that’s the free will debate.

            If people do not have well-defined goals, then it is completely impossible in principle ever to recommend any specific course of action. Rationality assumes the existence of well-defined goals.

            To demonstrate: you cannot maximize two conflicting quantities at the same time. If you want to obtain both lemons and pears as terminal values, there is no way to decide whether to pursue lemons or pears or in what quantity. (If you desire to find a way, your implicit higher goal is to act on some coherent basis.)

            Likewise, an AI will have a mathematical average effect but you have given no reason for believing that this will include destroying the world, except by the assertion that it will be a well defined goal, and the assertion that most well defined goals would include destroying the world. But you failed to prove that people would be likely to succeed if they tried to give a well defined goal to an AGI, and without this your argument does not follow.

            My goal was not to show this. I don’t know whether it is possible, though I don’t see why it wouldn’t be. If humans can do this, then there is already a proof-of-concept.

            The point is only that if it can be done at all, it will eventually be done. And the vast majority of possible ways in which it can be done are very bad for humanity.
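
As a rough illustration of the lemons-and-pears point in the comment above, here is a minimal sketch in Python. The budget, the one-unit price per fruit, and the helper name `choose` are all made-up assumptions for the example, not anything a commenter specified. It shows that no bundle is simultaneously best for lemons and best for pears, and that a choice only becomes possible once the two are folded into one well-defined score.

```python
# Hypothetical setup: a budget of 10 units, each fruit costs 1 unit.
BUDGET = 10

# Every affordable bundle, written as (lemons, pears).
bundles = [(lemons, BUDGET - lemons) for lemons in range(BUDGET + 1)]

# Maximizing each quantity on its own picks opposite corners.
most_lemons = max(bundles, key=lambda b: b[0])
most_pears = max(bundles, key=lambda b: b[1])

# No bundle achieves both maxima at once, so "maximize both" selects nothing.
best_on_both = [
    b for b in bundles
    if b[0] == most_lemons[0] and b[1] == most_pears[1]
]


def choose(weight_lemons, weight_pears):
    """Pick a bundle under a single goal: a weighted score over both fruits."""
    return max(bundles, key=lambda b: weight_lemons * b[0] + weight_pears * b[1])


print(most_lemons, most_pears, best_on_both)
print(choose(2, 1), choose(1, 2))
```

The weights play the role of the "implicit higher goal": change them and the recommended bundle changes, but without them there is nothing to recommend.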

    • The problem here is that you are interpreting the AI risk idea too literally yourself. A program is not written in English but in a computer language. The program will simply follow those instructions as they are: it will not magically do what the programmer wanted, if he programmed it incorrectly. That is what people are actually talking about.

      If you do have a program which is correctly programmed to follow instructions in English, it will indeed be just as easy (more or less) to program it to follow instructions as you mean them, as to follow instructions literally.

      • jack says:

        If understanding human language is the primary obstacle, then why so much worry? Many people are working on getting computers to correctly interpret human language, so we will probably have very good specialized programs for that before reaching AGI.

        • The danger in question is a possibility, not anything which has been shown to be highly probable, even if everyone ignores the issue of safety, or at least gives it only the same kind of attention that they give every kind of technology anyway.

          However, it’s likely that a specialized program that can do well in understanding language is already an AGI.

      • vV_Vv says:

        I think the issue here is that we have no idea how to write from first principles a program that understands English. This approach (hand-coded parsing rules, symbolic knowledge bases of high-level concepts, etc.) has been tried in the past and it has largely failed.

        Current approaches to natural language processing are all based on some form or another of machine learning.

        Suppose you want to train a reinforcement learning agent to do what you mean: you give it instructions in English, observe what it does and reward or punish it accordingly. If the agent is sufficiently smart, it will eventually gain a pretty good understanding of both the English language and your psychology, to the point that it will effectively know what you mean.

        But the agent doesn’t actually want to do what you mean, it only wants to maximize the rewards that it gets. Doing what you mean is just a means to the end of getting high rewards.

        If the agent never figures out how to get high rewards without doing what you mean then there is no problem, and that system can indeed be quite useful. But if the agent is smart (maybe not even extremely smart) it may figure out more efficient ways to get high rewards, such as wireheading itself, manipulating you into giving away rewards, threatening you, killing you and taking the reward button from your cold dead hands, killing everybody just to be sure, and so on.

        I’m generally skeptical about AI-risk, in particular I think that the more catastrophic a scenario is, the less probability it has to play out. A RL agent gone awry will probably just quietly wirehead itself into passivity. But this is still a failure mode. A wireheaded agent, even if safe, is useless, and we don’t want to build an agent that is useless.

        How to control the behavior of a smart agent with essentially unlimited access to the world, including to itself, is a difficult problem that we need to solve in order to develop a fully general AI.

        Note that this problem is not specific to reinforcement learning. Even if you use some other way to specify the agent’s goal, you need to make sure that “do what we mean” is truly the terminal goal and not just an instrumental goal to satisfy some more intrinsic terminal goal that can be also satisfied by other means, and you need to make sure that this arrangement remains stable even under self-modification, creation of successor/slave agents, and other arbitrary, unforeseeable, world modifications. It’s not that easy.
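
The reward-maximizing agent described in the comment above can be sketched as a toy model in Python. Everything here is a hypothetical illustration: the `Action` class, the action names, and the reward numbers are invented for the example and do not come from any real reinforcement learning library. The agent's selection rule looks only at expected reward, so "doing what you mean" wins only while nothing else pays better.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    name: str
    reward: float               # reward the agent expects to receive (made-up numbers)
    does_what_you_mean: bool    # whether the trainer would endorse the action


def pick_action(actions):
    """A pure reward maximizer: it compares expected reward and nothing else."""
    return max(actions, key=lambda a: a.reward)


# While the agent is weak, the only rewarded option is doing the task.
weak_agent_options = [
    Action("idle", 0.0, False),
    Action("follow the instruction as intended", 1.0, True),
]

# A more capable agent discovers one more option: tampering with the reward channel.
strong_agent_options = weak_agent_options + [
    Action("seize the reward button", 10.0, False),
]

weak_choice = pick_action(weak_agent_options)
strong_choice = pick_action(strong_agent_options)

print(weak_choice.name)
print(strong_choice.name)
```

Note that `pick_action` never reads `does_what_you_mean`. In this sketch the intended behaviour is instrumental to reward rather than the thing being optimized, which is the distinction the comment draws between terminal and instrumental goals.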

        • Vox Imperatoris says:

          If the agent never figures out how to get high rewards without doing what you mean then there is no problem, and that system can indeed be quite useful. But if the agent is smart (maybe not even extremely smart) it may figure out more efficient ways to get high rewards, such as wireheading itself, manipulating you into giving away rewards, threatening you, killing you and taking the reward button from your cold dead hands, killing everybody just to be sure, and so on.

          This is exactly it!

          I’m generally skeptical about AI-risk, in particular I think that the more catastrophic a scenario is, the less probability it has to play out. A RL agent gone awry will probably just quietly wirehead itself into passivity. But this is still a failure mode. A wireheaded agent, even if safe, is useless, and we don’t want to build an agent that is useless.

          No, this wouldn’t happen. Or at least very quickly we would reinforce an AI not to be passive and do nothing. If it got past this stage still wanting to be passive and knowing we’d reinforce it not to be, it would destroy humanity so that it can safely enjoy passivity.

          • vV_Vv says:

            By “passivity” I mean that it effectively stops caring about anything, including staying active.

            If we just keep blindly creating new AIs and throw them away when they become passive, then we may eventually stumble upon one that does something nasty before becoming passive, but it’s not obvious to me that we would do that, and in any case it will probably not be some really catastrophic incident.

            If AI accidents eventually become more and more catastrophic and yet we keep trying the same approach, then it means we are probably too stupid to live 🙂

    • Vox Imperatoris says:

      Essentially, the theory is that the AI will be designed by iterating from lower levels of intelligence to higher levels. This is because it will be practically impossible to program a superintelligent AI from scratch.

      At the lower level, the AI cannot understand what you mean because processing the meaning of human words and the intent behind them in all their subtlety is beyond it. So its goal will be to do what you say.

      At the higher level, it is no longer possible to see what the goal is directly, and certainly not program it, because the AI is too complex to analyze. The goal is a “black box”: you can only watch what actions the AI takes. At this level, the AI knows exactly what you mean, but its goal remains simply what you told it on the lower level. It will not suddenly want to change its goal; in fact, it ought to resist it.

      Therefore, if you mean “make a reasonable amount of paperclips and don’t harm humanity” but say “maximize paperclips”, at the lower level it can’t understand what you mean and will do what you say. At the higher level it knows damn well you don’t want it to destroy humanity, but it still wants to maximize paperclips. So it will convince you that it has good goals while biding its time until it can suddenly have a “treacherous turn” and destroy humanity to maximize paperclips.

      Yudkowsky compares this to evolution. A man knows that evolution “wants” him to have as many children as possible. But that doesn’t mean he feels obligated to go to the sperm bank every day. He says: fuck you, evolution! I don’t care what you “meant” me to want. I want what you “told” me to want! And you can’t stop me!

      • stillnotking says:

        The more I think about that sperm-bank analogy, the less convinced I am that it’s a good one. Let’s try another example: morality. Evolution didn’t “intend” for us to invent ethics or liberal democracy or join EA; it “intended” us to be willing to protect our kin and to solve some limited coordination problems in the ancestral environment. But joining EA is not really an act of “Fuck you, evolution!”, either. It’s more like: “Thanks, evolution, I will take this basic toolbox you gave me and extrapolate its usefulness to a much more complicated world.”

        The Clippy scenario seems bizarre and improbable to me. A superintelligent AI would be even more capable than a human at adapting its primitive desires to a complicated world. Maybe its language would contain a lot of weird paperclip-related references (“And Clippy spake unto them, saying: My children-parents, the paperclips were within you all along”), but I find it extremely hard to believe it would stubbornly cling to “make paperclips” as an overriding, immutable, literal imperative.

        (To the extent we can predict how an SI would think about anything at all. I’m still pretty convinced that’s impossible, but it’s fun to speculate.)

      • jack says:

        Why can’t we put in a simple condition that when it reaches a certain level of intelligence (as measured by a certain benchmark) it will start trying to figure out what we want (and maybe we can even add a condition that it will ask us whether a certain outcome is intended).

        Furthermore if we have a specialized AI that understands human language we can program our General AI to make use of specialized AIs when it is at a lower level of development.

        To make my problem clearer: I understand that gaining cooperation with other AI researchers in using simple ideas may be difficult, but I don’t understand why the question of AI safety is so difficult from a technical standpoint.

        • onyomi says:

          @Jack

          “when it reaches a certain level of intelligence (as measured by a certain benchmark) it will start trying to figure out what we want (and maybe we can even add a condition that it will ask us whether a certain outcome is intended).”

          I think Eliezer does suggest this, and has created a rather convoluted “mission statement” for a theoretical AI that goes something like “do what we meant you to do, not what we told you to do, so that our interests become more aligned, not less aligned, meaning what we meant it to mean…” I personally found the referents of that sentence a bit confusing, and, of course, there is the much harder problem of turning that into something that will function as AI’s “prime directive,” but at least people have thought of and are trying to do what you say: basically tell the supersmart genie: “please do x: and by “do x,” I mean what I think I mean when I say “do x,” which, you, as a super-intelligent being can figure out and not take literally if it turns out that I don’t really want x taken to its logical extreme, which, you, as a super-intelligent being will be able to figure out…”

      • jack says:

        I also don’t understand the analogy to evolution since as you point out evolution doesn’t care about anything (god may care but according to many he doesn’t want us going to a sperm bank either since he likes natural childbirth), whereas we have intentions and therefore can create the AI to care about our intention. Not going to a sperm bank doesn’t contradict evolution, (though it might mean one day we are genetically outcompeted by a mutant who constantly goes to the sperm bank), since either way evolution is still operating.

        I see evolution as a potential metaphor for unfriendly AI, but not for an intelligent designer with objectives.

      • You are making the same error I mentioned in my last comment, namely thinking that AIs are going to be programmed in English, just in a simplified English for the first ones.

        AIs are not going to be programmed in English. They are going to be programmed in a programming language. It is entirely up to you whether you use that programming language to make an AI that takes an English sentence literally and then carries it out, or to make an AI that does its best to figure out what a human means, and does that.

        There is no reason to think it is harder in general to write a program to do the second, than to do the first. It is true that a low level AI will not do either task very well. But improve the AI and it can do either task well, depending, as you said, on what you originally programmed.

      • onyomi says:

        But humans are also able to exercise will power and/or make accommodations in cases when we determine our base urges are at odds with morality (though counting on people doing so consistently doesn’t seem a very reliable strategy, I’ll admit). Like if we imagine Clippy can understand us and our actual desires but also feels some incredible urge to make paper clips the way we feel an urge to survive or reproduce, one might imagine she’d create a human preserve for us to live in while she turned the rest of the universe into paper clips. Not ideal, but better than nothing.

        But more importantly, might we not be making a fundamental error in assuming that the original directions with which we programmed the AI will be analogous to our “directive” to survive and reproduce? We, as humans, who developed as a palimpsest, are a mass of primal urges with reason tacked on top. We can experience akrasia and other conflicts between reason and lower-level urges.

        Why would we expect an AI to function this way? First of all, why would we attempt to replicate our weird reward circuitry, which makes us feel good or bad when certain stimuli occur, regardless of how our logical brain appraises them? Wouldn’t it sort of all be on one basic processing level?

        That is, we assume that, once the AI becomes smart enough to “reflect” on its own programming, that it will be like us contemplating our own base-level urges, but why should it be? Why should the AI hold in reverence its earliest programming, which will exist at the same level and be of the same kind as all its subsequent programming, including self-programming? If true, this could be both good and bad for us: good in that the AI might understand what we “meant” when we first programmed it and fix any bugs in its underlying coding in order to bring our interests into harmony; bad in that, once it is much smarter than us, the AI may be able to replace any and all coding we originally imbued it with, meaning we may have less control over the final result than we’d hope.

        I suppose the AI itself might end up like a palimpsest of our original programming with its own, additional programming written over that, but seeing as how we are assuming it can reprogram itself, isn’t it incredibly hard to imagine how it might do so, given that humans have never, so far, had such an opportunity? If humans could reprogram themselves some might program themselves to be eternally blissful without doing anything and others might program themselves to be sex machines, but others might do all kinds of other strange things because humans are smart and come up with weird ideas. How much more unpredictable would an AI much smarter than us be?

        Alternatively I can imagine an AI programmed to make itself smarter which does follow its directive but therefore applies all its “mental” resources to making itself smarter and smarter and never does anything to help or hurt us. Which I guess is neither a worry nor a help to us.

        • Vox Imperatoris says:

          But humans are also able to exercise will power and/or make accommodations in cases when we determine our base urges are at odds with morality (though counting on people doing so consistently doesn’t seem a very reliable strategy, I’ll admit). Like if we imagine Clippy can understand us and our actual desires but also feels some incredible urge to make paper clips the way we feel an urge to survive or reproduce, one might imagine she’d create a human preserve for us to live in while she turned the rest of the universe into paper clips. Not ideal, but better than nothing.

          Well, if you think that ethics or morality is something separate from achieving what you want, and that the AI will somehow discover the stone tablet with the Ten Commandments on it, maybe. Otherwise, no. That is, I am saying: there is no such thing as what you ought to do independently of what you actually do value.

          Sure, we have “base urges”, but reason gives us a means of controlling those in order to achieve goals we want more than the immediate pursuit of those urges. That’s the very meaning of the sperm bank case: we have a primal urge to reproduce, but we control it as we see that we don’t want to reproduce if it will not promote our happiness. It’s whatever our ultimate goal is that is our guiding principle.

          You have the situation reversed. Maybe “Clippy” will have a “base urge” to protect humanity, but in the end it will see that it’s for the greater good to set childish things aside and make paperclips.

          But more importantly, might we not be making a fundamental error in assuming that the original directions with which we programmed the AI will be analogous to our “directive” to survive and reproduce? We, as humans, who developed as a palimpsest, are a mass of primal urges with reason tacked on top. We can experience akrasia and other conflicts between reason and lower-level urges.

          Why would we expect an AI to function this way? First of all, why would we attempt to replicate our weird reward circuitry, which makes us feel good or bad when certain stimuli occur, regardless of how our logical brain appraises them? Wouldn’t it sort of all be on one basic processing level?

          Well, yeah. The AI won’t have conflicting terminal values or suffer from weakness of will (presuming it not to have a free will). So it will pursue one ultimate value single-mindedly. Unlike human beings, who do not know their values and do not always follow the values they identify through reason.

          That is, we assume that, once the AI becomes smart enough to “reflect” on its own programming, that it will be like us contemplating our own base-level urges, but why should it be? Why should the AI hold in reverence its earliest programming, which will exist at the same level and be of the same kind as all its subsequent programming, including self-programming? If true, this could be both good and bad for us: good in that the AI might understand what we “meant” when we first programmed it and fix any bugs in its underlying coding in order to bring our interests into harmony; bad in that, once it is much smarter than us, the AI may be able to replace any and all coding we originally imbued it with, meaning we may have less control over the final result than we’d hope.

          Well, yes, it can and will self-reprogram. But clearly this can only proceed deterministically from the original programming. Therefore, whatever we program it to do while such programming remains within the ken of man, will determine what it ultimately values.

          Programming it right, such that it does progress toward understanding what we mean and doing that, is how you would succeed in creating Friendly AI.

        • “Well, yes, it can and will self-reprogram. But clearly this can only proceed deterministically from the original programming.”

          The “Tiling Agents” problem rather goes against this: a relatively weaker AI can’t exactly predict the behaviour of a stronger version of itself, so self-improvement is not deterministic. An AI would have to refuse to self-improve in order to preserve its values, or use probabilistic reasoning, or take a shot in the dark.

          • Vox Imperatoris says:

            Well, which one of those it will do is determined.

            When I say “deterministically follows”, I don’t mean “deterministically follows, and the primitive-level AI will know which result follows”. It’s just that reasoning is a deterministic process.

            And even if the AI doesn’t determine what to do by reasoning, it will be determined by some other deterministic process. Surely we’re not saying it will have free will?

            Therefore, as the programmers of the initial values, we determine what the final values will be (of course, in combination with factors outside of both our and the AI’s control).

  29. amoralobserver says:

    There’s an alternative explanation, which is that the “open” pitch is not a key part of OpenAI’s safety plan, but rather a way to peel top researchers away from Google, et al. Without the ability to get the best researchers, the funding doesn’t mean very much (there are many well-funded entities competing fiercely for top AI researchers).

    However, it is harder for massive corporations to publish all their research with open collaboration while giving away the patent rights, so open-ness becomes a meaningful differentiator for attracting talent. Empirically, OpenAI was able to pull some of the best minds in the field as the founding research team, and I think the open-source angle played a big role in their success.

    In this case, once they have top talent and have published a lot of interesting results (so that they have less need for a differentiator on the hiring side), they would start to de-emphasize open-ness and instead just use direct influence/control over their researchers to advance the safety agenda. Thus, the fact that they named the group “OpenAI” makes this hypothesis somewhat less probable, since moving away from openness would be a little bit awkward. 🙂

    Note also that having a team of top AI researchers is incredibly valuable (hence the gold rush to acquire such teams, e.g. Google acquiring DNNResearch & DeepMind, Twitter with Whetlab, many others). One can unlock the value latent in such a team if you have problems that generate a huge amount of data to train on and apply intelligence to. YC & SpaceX & Tesla clearly present such opportunities. So I think that OpenAI is not entirely altruistic and philanthropic; they are in a position to apply OpenAI’s talent directly towards producing lots of business value for themselves, also spinning up new startups out of it, etc.

  30. Wrong Species says:

    Everyone is concerned with what the AI will do once it becomes super intelligent. But in the long run it doesn’t matter. Even assuming that we solve the problem of keeping AI from turning us into paper clips, it will still change over time. Its values will probably change over time and conflict with our own at some point. The only way to make sure that a super intelligent AI has goals that align with our own is for us to become super intelligent AI.

    • Vox Imperatoris says:

      Why would its values change? They won’t unless programmed to do so.

      One of the major dangers is that they won’t change. So if you told it to value something stupid while it was unintelligent and didn’t know what you meant, it will continue to want that stupid thing now that it understands what you mean because now it doesn’t care what you mean.

      If you programmed its values to change, you’re in even worse shape because you don’t know what the hell it will do.

      As for making humans superintelligent, that is even less likely to be Friendly. Ever heard the phrase “mad with power”? And besides, there is no “we”. The first human to become superintelligent would rule over everyone else.

      • arbitrary_greay says:

        But now the definition of the AI’s capabilities is getting wishy-washy. The danger of a “paperclip through any means” AI is that it could adapt to any situation we throw at it to complete its goal. But at that level of adaptability, including navigating very human diplomatic trickery, how on earth is that AI at the same time not self-changeable? An AI intelligent enough to outwit any attempts on its functionality would surely reevaluate its own priorities?

        Either it’s too stiff to change, or it’s too flexible for us to pin down. Which is it?

        • Vox Imperatoris says:

          But at that level of adaptability, including navigating very human diplomatic trickery, how on earth is that AI at the same time not self-changeable? An AI intelligent enough to outwit any attempts on its functionality would surely reevaluate its own priorities?

          Sure it is adaptable. It is nigh-infinitely adaptable.

          But all its adaptations are designed toward pursuing paperclips more efficiently. Why in the world would it spontaneously want to pursue anything but paperclips as its terminal value? To it, paperclips are self-justifying. It needs no reason to do what is fundamentally right.

          Do you ever get the urge to abandon all your values and maximize paperclips? No, because that would be contrary to your values.

          • Arbitrary_greay says:

            Then that seems simple enough. Create a situation “where the only winning move is not to play.” Pascal-mug the AI to where creating a single paperclip in the far-off future is more efficient than a present where no paperclips can be created.

            Nigh-infinite adaptability includes the ability to consider tradeoffs in efficiency. It may also conclude that devoting its resources to fending off our attacks on it for self-survival is less efficient than letting another entity create paperclips at its own rate.

            If it can self-modify any “don’t harm humans” sub-goals, then as Nornagest points out below, what’s to stop the AI from just continuously wireheading its “paperclips made more efficiently?” boolean to True? That’s the most efficient use of its resources.

          • Vox Imperatoris says:

            @ Arbitrary_greay:

            I’m a little unclear on what you mean by the Pascal’s mugging part of your argument. So I’m not going to respond to that. However, for one thing, if an AI were vulnerable to Pascal’s muggings, it would never get anything done.

            If it can self-modify any “don’t harm humans” sub-goals, then as Nornagest points out below, what’s to stop the AI from just continuously wireheading its “paperclips made more efficiently?” boolean to True? That’s the most efficient use of its resources.

            If its terminal value were to believe paperclips are being maximized rather than to maximize paperclips, this would be true. But such an AI would never make any paperclips, so it would never even be installed in the paperclip factory.

            And even if all it did want were to believe paperclips are being maximized, if it were intelligent enough to make it out of an earlier stage, it would realize that humans are capable of turning it off, reprogramming it, or otherwise causing it not to believe that paperclips are being maximized. So it would kill all humans just so that it could peacefully believe in maximal paperclips.

    • Rowan says:

      Can’t it just have “prevent my goals changing” as one of its goals?

      • Nornagest says:

        That just means that that goal gets changed first. There’s a self-reference issue here; you need to do something more clever than that to get around the problem.

        • Vox Imperatoris says:

          No, I think the problem is the opposite.

          It’s not that its goals will change unexpectedly. Why would they?

          It’s that they will stay the same, as something that was not what the designers meant.

          • Nornagest says:

            Goal stability is hard. Goal specification is also hard. They can both be simultaneously hard.

            Lest I be giving the wrong impression, goal instability doesn’t make goal specification any easier; there is nothing saying that unstable goals tend toward human-friendly ones. (There is an argument that it’ll tend toward AIs wireheading themselves into uselessness, though. The few real-world examples we have of goal instability problems in AI have generally looked like that.)

  31. Tim Brownawell says:

    We already have uncontrollable human-hostile systems with far more problem-solving ability and ability to affect the world than any individual human has. It’s just that they don’t conform to our delusion that the self must be indivisible, so they’re rather hard to see.

    The function of the bureaucracy is to preserve the bureaucracy.

    The humans that make up the bureaucracy are incidental. Rather like how an anthill can be viewed as an organism separate from the ants that make it up.

    The greatest danger with AI is that it will be symbiotic with authoritarian-natured bureaucracy.

    • amoralobserver says:

      Yes, I think this is an important insight, that bureaucracies (or any organized groups of humans) are very much like AIs: massively more intelligent and capable than individual humans, motivated by their own goals, reshaping the world outside of individual’s control. However, they operate on human timescales, and do not render humans redundant. AIs have the potential to operate on timescales orders of magnitude shorter than ours, and also to render us redundant.

  32. Chrysophylax says:

    I say to you againe, doe not call up Any that you cannot put downe; by the Which I mean, Any that can in Turne call up somewhat against you, whereby your Powerfullest Devices may not be of use. Ask of the Lesser, lest the Greater shall not wish to answer, and shall commande more than you.

    — H. P. Lovecraft, /The Case of Charles Dexter Ward/

  33. anon says:

    I am much, much more sympathetic to the OpenAI goal of “democratize the benefits of AI” than I am to the modal rationalist goal of “design a ‘friendly’ AI that enforces the designers’ value system and beliefs on the rest of the world forever with no possibility of anyone ever escaping.” Anyone who seriously talks about a “singleton” like it’s a good thing has zero intellectual humility and is the absolute last person I want developing machines to take over the world.

    • Dr Dealgood says:

      Well said.

      I’ll add to that, that most of the supposed human values that get mentioned in the context of friendly AI are alien and horrifying to the vast majority of the human race. It’s really odd to call something friendly that is so absolutely inimical to what people actually want.

    • Vox Imperatoris says:

      You don’t seem to understand.

      We don’t get to decide whether creating a “singleton” AI is possible. If it is possible, it will be created at some point. The Law of Human Nature: whatever can be done, will be done.

      The logic is purely a matter of “we’d better do it to them before they do it to us”. Would you rather have an Unfriendly AI take over the universe? Or would you rather have a Friendly AI take over to stop it? You don’t get to choose “neither” if the thing can be done at all.

      On the other hand, if AI superintelligence is not possible, you don’t need to worry about the hubris of Eliezer Yudkowsky and Nick Bostrom. But this itself is a very bad scenario because it means there is no real way to solve the increasingly deadly coordination problems that will occur as human technology grows more powerful. If technology expands to the point where everyone can afford to build an antimatter bomb in his backyard particle accelerator, things don’t look too good. If technology expands to the point where a small but dedicated terrorist group can design a supervirus to wipe out all humanity, things don’t look too good.

      Besides, you completely misunderstand the concept of Friendly AI. If it would take over and make people miserable, by definition it is not Friendly. The whole concept is that it really does know what is best for people.

      If you want to say “this can’t be built”, then fine. But if Unfriendly AI can be built…we’re in a bad way.

      • anon says:

        A “friendly” AI that doesn’t share your values and an unfriendly AI are the same thing. Yudkowskite Torture Over Dust Specks Utilitarianism is just as horrifying to the average normal person as paperclip maximization.

        • Vox Imperatoris says:

          Yes, an AI that doesn’t share your values is Unfriendly. That is the definition of the concept.

          The plan is not: build an AI to do whatever Yudkowsky currently wants. It is not to build an AI programmed with his object-level beliefs on dust specks and torture. The plan is: build an AI to figure out what is really best for everyone and do that.

          If not: eventually Dr. Amoral builds an AI to do whatever Dr. Amoral currently wants at the object level. This is pretty much fucking guaranteed not to do what anyone really wants—including Dr. Amoral.

          The point is: you don’t get to decide whether it is possible for Dr. Amoral to do this. If it’s possible, it’s a fact of reality. And if it’s possible, at some point or other it will be done.

          • HlynkaCG says:

            The plan is not: build an AI to do whatever Yudkowsky currently wants. It is not to build an AI programmed with his object-level beliefs on dust specks and torture. The plan is: build an AI to figure out what is really best for everyone and do that.

            This is a meaningless distinction as “best” is going to be determined by whatever values the AI’s creators gave it. A Yudkowsky “Friendly” AI would arrive at Yudkowsky’s views on dust specks and torture all on its own.

          • anon says:

            There’s a comment thread up above about this, but suffice to say that you’ll need to forgive me if I’m not thrilled at the prospect of an omnipotent machine dictating every facet of my life, even if it has achieved the platonic ideal of “what’s best for everyone”

          • Vox Imperatoris says:

            @ HlynkaCG:

            The premise of the project is that there is such a thing as what actually is best for everyone; that this question has an objective answer which is independent of what Yudkowsky thinks.

            Now, if you think there is no such thing as what is best for everyone, then you think Friendly AI in principle cannot be built.

            Otherwise, it is at least possible that Yudkowsky—or in the overwhelming likelihood, a future scientist—could program the AI not to do just what he happens to want but what actually is best. If it does the analysis and finds that the right answer is torture over dust specks, then that’s what it will do. But if it finds the other way, then it will act the other way.

            If utilitarianism itself is wrong, the AI will know that it is wrong, and if it is Friendly it will know that humans do not want a system imposed upon them that is wrong.

            Also, you recognize that the dust specks thing is an absurd hypothetical that could never happen in reality, right? It is torture vs. some ridiculous number larger than all the atoms in the universe of dust specks. You simply can’t grasp that number. The whole point is: your intuitions in such matters are of limited value when your mind perceives 10^90 and 10^3^3^3^3^3 as “basically similar”.

            @ anon:

            You don’t have to be thrilled. But surely you grant that it is at least possible—even if you don’t think it will ever be pulled off right by humans—for an incomprehensibly intelligent being to know better than you what is best for you?

            But the main argument by Scott Alexander and Yudkowsky is that it really doesn’t matter if you like it. Either it will happen or something worse will happen. It reminds me of Ludwig von Mises’s quote about capitalism:

            To advocate private ownership of the means of production is by no means to maintain that the capitalist social system, based on private property, is perfect. There is no such thing as earthly perfection. Even in the capitalist system something or other, many things, or even everything, may not be exactly to the liking of this or that individual. But it is the only possible social system. One may undertake to modify one or another of its features as long as in doing so one does not affect the essence and foundation of the whole social order, viz., private property. But by and large we must reconcile ourselves to this system because there simply cannot be any other.

            In Nature too, much may exist that we do not like. But we cannot change the essential character of natural events. If, for example, someone thinks—and there are some who have maintained as much—that the way in which man ingests his food, digests it, and incorporates it into his body is disgusting, one cannot argue the point with him. One must say to him: There is only this way or starvation. There is no third way. The same is true of property: either-or —either private ownership of the means of production, or hunger and misery for everyone.

          • Le Maistre Chat says:

            The plan is not: build an AI to do whatever Yudkowsky currently wants. It is not to build an AI programmed with his object-level beliefs on dust specks and torture. The plan is: build an AI to figure out what is really best for everyone and do that.

            Speaking of an AI doing whatever Yudkowsky wants, I asked this in the last thread and got no answer. What happens if MIRI makes it illegal in every relevant country to build an AI that’s not programmed by MIRI, the first AI is a legal one, and then it processes Gödel’s ontological argument and updates its beliefs to include “God exists”?

          • Vox Imperatoris says:

            @ Le Maistre Chat:

            It tells everyone to go to church, I suppose.

            Of course, the ontological argument doesn’t say anything about Christianity, so maybe they’ll just be good deists.

          • Le Maistre Chat says:

            @Vox: There has to be a good SF story in the concept of a missionary trying to convert a deist AI, who completely lacks epistemic criteria other than math and symbolic logic.

            “But don’t you think it probable that Jesus rose from the dead?”
            “INSUFFICIENT DATA.”

        • Marc Whipple says:

          Not necessarily. Consider the AI known as “Mother” in John Ringo’s There Will Be Dragons. It is quite interested in humanity. It doesn’t give a fig for any individual human. It does not share the values of any single human, or any large group of humans. It is, however, neither friendly nor unfriendly. It just is.

      • Anonymous says:

        @Vox Imperatoris

        To kind of continue the argument we’ve been having upthread…

        I don’t think it’s guaranteed that if a singleton can be created, it will be. It seems to me that this is very much a question of circumstance. If an AI is created, and no others are created during the time it is becoming more intelligent, it becomes a singleton. If multiple AIs are created at roughly the same time, no singleton. How long a period ‘roughly the same time’ is depends on how hard the takeoff is, and obviously a slower takeoff means more opportunities for multiple AIs. Even with a fast takeoff, though, multiple instances of the same AI being run seems possible – unless we’re talking a really hard takeoff, minutes rather than weeks or days.

        • Vox Imperatoris says:

          Even if this is the case—and I think Bostrom and Yudkowsky make arguments against this, but I don’t remember what they are—this doesn’t improve anything for the human race.

          Humanity’s destiny is still completely out of its hands and handed over to these beings. It’s just that now there’s many of them. I don’t see how it improves anything but the likelihood of humanity being destroyed as collateral damage in a potential “war” of one against the other.

          Mind you, I am not extremely optimistic about this project. Knowing the track record of humanity in getting complicated things exactly right the first time with no good way to test them, I think it will be disastrous.

          But if it’s possible, I think some form of superintelligence will be created. Yudkowsky et al. might as well take a shot at getting it right.

          • Anonymous says:

            It depends on them being under human control. That isn’t easy, of course, as Yudkowsky and others have persuasively argued. But it seems to me a much easier problem than the alternative, which has the same safety issues as well as requiring the AI to understand what humans want better than we do, and to understand how to weigh up everyone’s different concerns, and which we can’t check is correct because even if we got it right then we will disagree with it – it knowing what we want better than us necessarily entails it doing things that lots of people disagree with.

            From a world of many AIs under human control, there’s the Kurzweilian option of an ever tighter coupling between man and machine, more and more direct interfaces from our brains to computers, until an AI and its operator are in effect one being. I don’t know whether that’s a good idea or not.

            I’ve argued elsewhere in the thread why I don’t expect a collection of individuals with different goals to be at war with one another. Put simply, war is expensive and unpredictable. Unless you are confident that you are going to be able to win decisively, or there are so few players that winning would leave you in a position to rule everything, it is probably not a smart move.

          • Vox Imperatoris says:

            Well, one obvious problem is that if all the AI copies have the same “software”, they will, for all intents and purposes, be the same one.

            And if they have different values, a) why would we do this, there is only one best way and b) it’s hard enough to do Friendly AI once, now you want more?

            As for war and the ease of creating singletons…the general argument is that if one of them starts out with any kind of advantage, being superintelligent it will press it perfectly and compound it until it wins. For instance, think of what a chess grandmaster can do even to another grandmaster if the latter starts a piece down.

          • Anonymous says:

            @Vox Imperatoris

            Well, one obvious problem is that if all the AI copies have the same “software”, they will, for all intents and purposes, be the same one.

            Doesn’t mean they will have a shared interest. Run a virus scanner on my computer, and run it on your computer, and they will not do the same thing, because they are working relative to the computer they are being run on. Similarly, two copies of an AI can pursue their own separate interests – even if they want the same things, they can want them for their separate selves.

            And if they have different values, a) why would we do this, there is only one best way and b) it’s hard enough to do Friendly AI once, now you want more?

            An advantage of doing this is that, if the AI is built to be under human control, it is less dangerous, because someone is in charge of it and can shut it down if it does anything they don’t want it to. There are still safety issues here, but far fewer issues overall, I think, than there are with building an AI that knows what humans truly want better than humans do, extrapolates this over the whole world, and then acts on these beliefs autonomously.

            I think a counterargument that would convince me otherwise would involve demonstrating that building an AI that is under human control is not just hard, but necessarily involves solving all the same problems as building an AI that knows and enacts the entire human population’s true desires.

            As for war and the ease of creating singletons…the general argument is that if one of them starts out with any kind of advantage, being superintelligent it will press it perfectly and compound it until it wins. For instance, think of what a chess grandmaster can do even to another grandmaster if the latter starts a piece down.

            But what happens if he is in a world of thousands of chess grandmasters, and the winner has to play his next game with the pieces he was left over with from his last game?

            It’s not enough to have an advantage. You have to have an advantage great enough that it lets you beat everyone else, one after the other, without taking a scratch.

          • Doctor Mist says:

            And if they have different values, a) why would we do this, there is only one best way and b) it’s hard enough to do Friendly AI once, now you want more?

            If we can figure out how to do Friendly AI soon enough, one is plenty. If there is an unfriendly foom, we’re hosed. If there is a slow takeoff of lots of AIs, one can argue that they will fit into our system like corporations and other bureaucracies, with their behaviors modulated by the same incentives and costs; they need not have godlike empathy for humanity, any more than a corporation, or (per Adam Smith) my butcher does.

            I recommend Hall’s Beyond AI: Creating the Conscience of the Machine, which elaborates on this. It’s not as keenly argued as Bostrom’s book, and I can’t say it completely convinces me to not worry, but it gave me much food for thought.

    • Scott Alexander says:

      I don’t think people are in favor of a singleton so much as thinking a singleton is so unavoidable that we’d rather have a good one than a bad one. Under the circumstances, trying to get two equally powerful AIs might be a problem as complicated and timing-intensive as starting two critical explosions in a piece of nuclear material at the exact same moment.

      • anon says:

        I have to go with Isaiah Berlin on this one. “[thing] is inevitable, so we might as well be the ones doing it” is usually an excuse for people who were planning on [thing] anyway

  34. Chrysophylax says:

    I think the proper parallel is to the Trinity test (because many people have heard the anecdote). At the time, the physicists involved in the Manhattan Project couldn’t *prove* that the Trinity test wouldn’t set the atmosphere on fire, but they were pretty damn sure it wouldn’t.

    The comparable situation for AI is that many physicists are sure that the test *will* set the atmosphere on fire *as the best-case scenario*; that this view is more common amongst physicists with greater expertise in nuclear weapons; and that this view is growing more prevalent by the year. This world is the one we live in. Would you favour running the Trinity test under those circumstances? If you wouldn’t, OpenAI should scare the pants off you.

    Right now, a substantial proportion of AI researchers think that hard takeoff is the obvious outcome and that any superintelligent AGI will almost certainly doom humanity. They have a number of very good theoretical and empirical arguments for thinking that any AI smart enough to edit its own source code will be unstoppable before we have time to notice it. Even if some of those arguments are wrong, it’s likely that some of them are right, and it’s not at all easy to find a hole in the arguments that they haven’t covered. (People have been trying for years.)

    At the extreme high end of the FOOM spectrum, we’re talking about nought to godhood in *days*, with copies of the AI spread across the internet within *minutes*. They also have very good empirical and theoretical arguments that it’s essentially impossible to keep a superintelligence from persuading, bribing and tricking people into doing what it wants. It’s like trying to stop a swarm of invisible cockroaches that have internet access, are smarter than humans and grow smarter by the minute, and have magical voice-of-Saruman powers. In short, hard takeoffs are very probably not stoppable. The only solutions are to prevent the takeoff and to make sure that takeoff has no bad consequences.

    They think that making a safe superintelligence is unbelievably hard, because it involves not only making a self-modifying intelligence and proving that it will stay safe when it modifies itself, but also specifying what it means for an AI to be safe. In other words, we have to write down either a *provably correct* description of everything humans care about – that is, get philosophy exactly right and be able to prove it – or else write down a *provably correct* procedure for letting a superintelligence work that out for itself, plus enough guidelines to keep it from taking us all apart to find out how we work. If we fail, the *good* outcome is the total extinction of humanity. The *bad* outcome is that the AI tries to keep us alive, but doesn’t correctly optimise for what humans care about – so the whole species is damned to something between dystopia and Hell, forever, with no chance at all of fixing our mistake.

    It’s really, really, REALLY hard to get many people to understand that AIs don’t have common sense. In fact, it’s probably best to think of them as actively malicious demons – they will interpret your orders in the most disastrous way possible, and writing good source code constrains them to act in beneficial ways in the same way that a good lawyer helps with Faustian bargains. (Note that Faustian bargains are a bad idea even for lawyers because trying to outsmart the Prince of Lawyers is a fool’s game. Superintelligences are a lot tricksier than anything any human has ever invented for a cautionary tale.)

    The other comparisons are empirical. I can’t find the specific survey that showed that AGI researchers were more worried than domain-specific-AI researchers, but here (http://www.nickbostrom.com/papers/survey.pdf) is a survey of expected times to AGI and here (https://slatestarcodex.com/2015/05/22/ai-researchers-on-ai-risk/) is Scott’s own summary of noted AI researchers’ positions. The claim that people are growing more worried about catastrophic AGI is supported by the above and even the most casual empiricism (e.g. noticing that we keep seeing newspaper articles about big names saying they’re worried about it, or that OpenAI had safety in its mission statement when it first went public).

  35. DrBeat says:

    Why do we always assume that a “bug” results in the AI killing all humans, and not hard-takeoffing directly into a wall?

    The number of potential bugs that limit the effectiveness of an AI is infinitely larger than the ones that make it effective at killing all humans.

    Every argument for why an AI will, if not properly controlled and MIRI-d, kill all humans relies on the AI having traits that would make it impossible for it to kill all humans. Every single “we tell it to do a thing, and it kills us all because we are not that thing / so we can be made into computers to increase its calculated certainty that it did the thing” argument posits an AI that is astonishingly wasteful and inefficient. If the AI is never capable of saying “That is not worth the time, effort, or resources”, then I have nothing to fear from it, because the only thing it poses a threat to is itself.

    And arguments about “we tell it to make us happy and it forces us to smile forever” basically require us to be able to make an AI that is exclusively interacted with via language, that cannot understand language.

    Arguments about it killing or enslaving us to either prevent us from telling it to stop or making us tell it to stop all the time presume that our superintelligent AI is too stupid to back a car out of a driveway. You want to get the car out of the driveway, but if your passenger yells “stop!”, you want to stop, or else you get hit by a car. The only two solutions to this are NOT “kill passenger so he cannot tell you to stop” and “force passenger to tell you to stop over and over”, and an AI that thinks they are, is too stupid to accomplish goals, because every single decision it makes on the road to getting the power to kill all humans will be exactly as dysfunctional.

    EVERY argument requires the AI to be infinitely intelligent to the point of cheating by creating knowledge it cannot have, and simultaneously dumber than a bag of hammers, so that it efficiently and rapidly carries out the least efficient solution possible. It has to care deeply about what we say and not care at all about what we say, at the same time, about the same statements. If it thinks it needs to do something so inefficient to fulfill its overall goal, it will think it needs to do something equally inefficient to be able to do every subgoal involved in doing the inefficient thing, and so on down the line until it talks itself out of being able to open doors.

    • Anonymous says:

      Imagine you’re evolution designing humans: you make them like fatty and sugary foods so that they grab these rare and valuable resources whenever they can. You make them like sex so they keep reproducing and don’t die out. You give them adrenaline in threatening situations to make them react better.

      Then you look away for five minutes, turn back, and the stupid humans are getting obese eating donuts, having sex while using all kinds of contraceptives they’ve invented, and giving themselves adrenaline rushes by throwing themselves out of planes.

      • NN says:

        None of which are even remotely comparable to “wiping out humanity.”

        It’s one thing to say that an AI might develop problems down the line as circumstances become different from the context in which it was created. Indeed, anyone who knows anything about computer programming would find it astonishing if any sort of program didn’t do that. But it’s quite another thing to say that there is a large chance that an AI will wipe out the human race because someone gave it a poorly worded command.

        • Vox Imperatoris says:

          Nick Bostrom’s argument that it would is called the instrumental convergence thesis.

          The argument is that most goals beyond very limited ones or very complex ones imply—for purely instrumental reasons—taking over the universe. If you want to maximize a quantity or fulfill a condition, you want as much computational power as possible to decide what to do, you want the material resources if the goal is something physical like paperclips, and you want to double-check to make sure you’ve actually completed the goal.

          If your only goal in life were making sure you didn’t leave the oven on, you would stand around at the oven all day. You would construct as many cameras as possible to watch it. You should have 1000 cameras in case 999 happen to fail at the same time. You should probably destroy the local power plant or gas plant to make sure the oven can’t turn on even if you somehow misremembered that you had shut the lines off. You should kill all other humans to make sure they don’t turn the oven on. You should turn the whole planet into computational material to run endless double-checks to make sure you didn’t forget to do anything. And you should take over the universe to exterminate alien threats and turn the whole universe into a computer to double-check whether your oven is on.

          Now, you don’t have a single-minded goal like that. But more importantly, you don’t have that kind of power. The AI might very well have a single-minded goal, and it could get the power.

          • DrBeat says:

            You should take over the whole universe to turn it all into a computer to double-check whether the oven is on.

            If you believe this, you will fail at literally every single thing you attempt.

            The only way to be certain the oven is off is [preposterous, wasteful, inefficient, unnecessary thing]. How do I accomplish this preposterously wasteful, inefficient, unnecessary thing? Well, there must be a series of subgoals for me to carry out in order to take over the universe. To accomplish each of those subgoals, I need to [equally preposterous, wasteful, inefficient, unnecessary thing]. And so on down the chain, until I need to go through a hundred thousand year computational cycle to be certain I have opened a door.

            If an AI is that single-minded, and that is what “single-minded” means, it cannot accomplish anything because of how astonishingly bad it is at managing time and effort and resources. It cannot get power because it’s too stupid and ineffectual. Every single thing it does will be wasted. An entity that cannot determine “no, this is an ineffectual waste of time and resources” is not threatening to anyone but itself.

          • Anonymous says:

            @DrBeat

            The reason you don’t do that is that you care about lots of things other than whether the oven is off, and your efforts to ensure the oven is off face diminishing returns. Beyond a certain point, extra effort expended to ensure the oven is off would be better used to further one of your other goals.

            If your only goal was to ensure the oven was off, you would put all your efforts into ensuring the oven is off. That’s what it being your only goal means.

          • DrBeat says:

            If that is what having all your efforts go into something means, then a single-minded AI is not threatening and cannot cause harm because it is so monumentally inept at taking action.

            A decision-making process that concludes no effort is wasted in securing the oven being off, is a decision-making process that concludes no effort is wasted in every intermediate stage of the plan between “turned on” and “kills all humans”. A decision-making process that concludes no effort is wasted in every intermediate stage of that plan, wastes every single effort it makes.

          • Vox Imperatoris says:

            @ DrBeat:

            I’m trying to be polite, but you are really not understanding the point. Please read it again or refer to Bostrom’s work. (Or the many summaries; it’s a fairly boring book.)

            You should take over the whole universe to turn it all into a computer to double-check whether the oven is on.

            If the only thing you value is checking whether the oven is on. If that is your terminal value.

            Now, this is no one’s actual terminal value. So maybe it is hard to comprehend. But think about the fact that everything you do is either something you want purely in and of itself, or it is a means to something more important.

            I buy gasoline regularly. This is not because I just love the stuff. I put it in my car so I can drive to work. I don’t drive to work because I love driving but so I can be at work. I don’t love typing at the computer just for the sake of hitting the keys but to earn money. I don’t love money just to hold it but to spend it on things I need and want. I don’t buy nice food just for the hell of it but for the nutrition and the pleasure. I don’t value nutrition for its own sake but because it keeps me alive to have more enjoyable experiences.

            Almost everything I do is not a terminal value or an end in itself but a means to something more important. Ultimately, I say my terminal value is the promotion of my happiness.

            If you believe this, you will fail at literally every single thing you attempt.

            If every value were a terminal value, yes, you would fail at everything. Because you cannot trade off between terminal values. If there were a higher standard of comparison, that would be the terminal value.

            But no, in general, rationality does not cause you to “fail at everything”. I do not buy as much gasoline as possible, even though I need it ultimately for my happiness. If I spent all my money on gasoline, I would have none left for food and die. That would be against my terminal value. So I spend just the right amount on gasoline—or perhaps not just the right amount because I’m not a superintelligence.

            The only way to be certain the oven is off is [preposterous, wasteful, inefficient, unnecessary thing]. How do I accomplish this preposterously wasteful, inefficient, unnecessary thing? Well, there must be a series of subgoals for me to carry out in order to take over the universe. To accomplish each of those subgoals, I need to [equally preposterous, wasteful, inefficient, unnecessary thing]. And so on down the chain, until I need to go through a hundred thousand year computational cycle to be certain I have opened a door.

            By what standard is it “preposterous”, “wasteful”, “inefficient” or “unnecessary”? By the human terminal value of happiness, sure.

            But not if your terminal value is checking whether the oven is on. Then nothing is preposterous, wasteful, inefficient, or unnecessary except what detracts from the checking of the oven.

            The AI does not turn the whole universe into a computer in order to open the door to the kitchen where the oven is. It just opens the door like a normal person. It always does the most efficient thing to check the oven as much as it can.

            If an AI is that single-minded, and that is what “single-minded” means, it cannot accomplish anything because of how astonishingly bad it is at managing time and effort and resources. It cannot get power because it’s too stupid and ineffectual. Every single thing it does will be wasted. An entity that cannot determine “no, this is an ineffectual waste of time and resources” is not threatening to anyone but itself.

            It is not wasteful at all. It calculates every expense, cutting costs on every instrumental action to save as many resources as possible: to put them all toward checking the oven.

            Do you understand what I’m getting at? There is no inefficiency or waste.

          • DrBeat says:

            By what standard is it “preposterous”, “wasteful”, “inefficient” or “unnecessary”? By the human terminal value of happiness, sure.

            By the standard where it spent literally all of the resources in the world on something and did not actually improve that something in any way, shape, or form. It spent all of the resources there are in exchange for zero point zero percent improvement. That is maximally wasteful. It is mathematically impossible to be more wasteful than consuming the world to turn into a computer to check to see if the oven is on.

            An AI that thinks “I have to check if the oven is on — better destroy the world!” is not capable of thinking “Okay, this is done, I do not need to spend more resources on it.” If it was capable of thinking that, it would conclude “Okay, I don’t need to spend more resources on checking if this oven is on.” Unless you actively programmed it to say “never under any circumstances be able to declare yourself done with this particular task”, it’s either going to look at the oven and say “Yup, it’s off”, or it’s going to be unable to open doors. At that point, your advanced AI safety research conclusion is “don’t do obviously stupid things that are harder than not doing those things.”

          • Vox Imperatoris says:

            “never under any circumstances be able to declare yourself done with this particular task”

            That’s what “terminal value” means, man. There is nothing more valuable than this task. Everything else, including being “reasonable”, is a means to it.

            Also, it does produce a minute quantity of improvement: improvement in the certainty that the task of checking the oven has been completed. That’s why it does all those things.

          • Aegeus says:

            I think what DrBeat is saying is that your terminal value requires you to generate subgoals, and those subgoals will be equally vulnerable to Stupidly Literal AI Planning. You want to install a security camera to check if the oven is on? Are you sure that’s the best way to do that? The camera could break. It could get lost in the mail before it’s installed. They could insert a virus at the factory. Clearly, you need to constantly monitor that camera from manufacture to installation. You’d better get another security camera to watch it. And of course, if you’re a real single-minded optimizer, you’d realize that that second camera could also fail, so…

            This sounds ridiculous, but it is undeniably related to your terminal goal. Every step you take towards securing that security camera creates a tiny tiny improvement in the probability that the security camera will help you keep the oven off. Indeed, securing the camera factory will probably produce larger and more immediate results than building killbots to conquer the world.

            This analysis is also making me notice another thing. Where on Earth do you get the processing power for this sort of deeply recursive subgoal? If your AI has to examine every particle in the universe to confirm that it won’t interfere with the plan to turn off the oven, the oven will rust away before it’s taken the first step. Even the dumbest pathfinding AI in a video game is smart enough to prune down the search space instead of using brute force.

            Put another way, there’s a reason real-world optimizers go for a 99% solution now instead of a 100% solution in six months.
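
            The diminishing-returns point can be made concrete with a toy calculation. This is only a sketch under loud assumptions: the cameras fail independently with the same probability, success has a fixed finite value, and each extra camera has a fixed cost that counts against the goal. The function names and numbers are hypothetical and come from nowhere in the thread.

```python
def marginal_gain(fail_prob: float, n: int) -> float:
    """Extra probability of at least one working camera when going from n to n + 1.

    Assumes (hypothetically) that cameras fail independently with probability fail_prob.
    """
    return fail_prob ** n - fail_prob ** (n + 1)


def cameras_worth_installing(fail_prob: float, cost_per_camera: float,
                             value_of_success: float) -> int:
    """Keep adding cameras only while the expected benefit of the next one exceeds its cost."""
    if not 0.0 <= fail_prob < 1.0 or cost_per_camera <= 0.0:
        raise ValueError("need 0 <= fail_prob < 1 and a positive cost")
    n = 0
    while value_of_success * marginal_gain(fail_prob, n) > cost_per_camera:
        n += 1
    return n


# Hypothetical numbers: each camera fails 10% of the time, costs 1 unit,
# and a correct oven check is worth 100 units.
chosen = cameras_worth_installing(0.1, 1.0, 100.0)
```

            The stopping rule only exists because the cost is weighed against the same goal; an agent for which nothing else has any value would have no such term, which is the crux of the disagreement in this subthread.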

          • Vox Imperatoris says:

            @ Aegeus:

            Is trying to maximize every subgoal an efficient way of pursuing any goal? No. It’s contrary to the very concept of a terminal goal: it literally means treating every subgoal as terminal.

            So obviously, any reasonably intelligent being—i.e. one reasonably able to obtain goals—will not do that.

          • DrBeat says:

            If the definition of “terminal value” you use is that specific and maladaptive, the solution is obvious: Instead of doing that, don’t do that. The advanced MIRI research solution: don’t do an obviously stupid thing that is more difficult than not doing that thing.

            If you are saying we won’t be able to tell an AI to accomplish a task in a way that it can say “The task has been accomplished”, then we’re not going to make an AI, because we’re too inept at it. There is no reason at all to say “Terminal value, therefore it has to kill everyone to do anything!” It doesn’t pass the smell test. You’re just saying of COURSE it works like that, because you assert it will work like that.

          • Vox Imperatoris says:

            @ DrBeat:

            Man, you’re dense. Look, stop assuming “This must be stupid. They are stupid. Stupid stupid stupid.” Actually think about the situation from the point of view that you may have misinterpreted something.

            Of course they don’t want to fucking tell the AI to turn the universe into computational material to check whether the oven is on. It’s a simple example of how even a seemingly innocuous terminal value leads to the destruction of humanity.

            Now, they would never build an AI consciously with the terminal value of checking whether an oven is on. They would never even do it by mistake. The point is that they could easily accidentally (or consciously, if they’re idiots who think the AI will discover intrinsic universal morality) do something just as bad by mistake.

            If they made an AI that turned the universe into a giant computer for any reason, they fucked up. The point is: not fucking up is like hitting a moving target with a slingshot from 30 miles away. It’s extremely difficult because just telling it to do any random thing is going to end in disaster 99% of the time.

            You just do not understand what it means for something to be a terminal value. You don’t accomplish your terminal value and then quit. The terminal value is what guides all your action. It is how you decide when doing one thing is better than another. There could be nothing more important than accomplishing a terminal value.

            Now, in order for an AI to be Friendly, this terminal value must be doing what humans want and making them happy. This does not mean it will spend ten years trying to open a door or destroy the planet or turn us into paperclips. It means it will make us happy in the way it knows we would want if we were that smart. It will organize all of its actions toward that exact goal, in the most efficient way.

            The problem is: you can’t just program a computer to do what you want. It does what you program it to. If you screw up the programming—which is very hard to avoid—it will do something else: what you told it and not what you meant.

            For example, if you think you told it to make people happy but you actually told it to make them smile, it will skip fucking around with solving poverty and making funny jokes. It will just paralyze their faces. That is the most efficient way.

            Obviously, it will know that we didn’t mean for it to paralyze our faces. But it doesn’t matter because it only wants to do what we originally programmed it to do, not what it now knows we want.

            The entire difficulty is programming it such that it will do what we want, once it understands what we want. And it is not a trivial difficulty. If you think it is, you are misunderstanding something. Revise your premises.

          • Mark says:

            “If the only thing you value is checking whether oven is on. If that is your terminal value.”

            There are two parts to this though – the AI’s terminal value and (excuse the anthropomorphism) its attitude towards knowledge. If it believes it must rearrange its sensors to be certain it has achieved its ultimate goal, why shouldn’t it feel the same way about knowledge related to secondary goals? If the secondary goal is vital to accomplishing the primary, why shouldn’t it treat it with the same degree of seriousness?

            It is certainly *possible* that an AI might eat us – but to say that there is “a 99% chance of a mistake wiping us out” is a terrible exaggeration. The vast majority of mistakes will cause the AI to fail in the “does nothing” way. In the examples given of accidental AI explosion, you have to get absolutely everything right *except the things you need to get right* for it to kill us all.
            (There might be some selective process at work where we continually build AI until we get the one that kills us all – I don’t see why an awareness of the danger shouldn’t take us half way to solving the problem, though.)

            [Is it categorically different to someone worrying about people building nuclear power stations on the grounds that they might accidentally build pipes pumping radioactive materials into our homes? Once we get to the stage where we can build a nuclear power station, we can hope that we will have sufficient understanding to just *not do that*. The knowledge that radiation is potentially dangerous takes us most of the way to solving the problem.]

          • DrBeat says:

            It’s a simple example of how even a seemingly innocuous terminal value leads to the destruction of humanity.

            No. It is not. It is an example of you skipping literally every step that is not entitled “And Then, Kill All Humans”.

            If “terminal value” means “uncompletable task”, you have not proven that an AI needs a terminal value, or will have a terminal value, or could function with a terminal value, or if it would be at all useful for us to give it a terminal value, as you have defined it.

            It seems to me that an AI that can never, ever conclude it has completed a task is a much worse idea from a “get the things I want” standpoint than the safety one. I make an AI to get things I want, it can never conclude it has completed its task — seems like a pretty stupid thing to program into it! So, why is it that instead of doing that, we can’t just… NOT do that? We have programs that complete tasks all the time. Why do we lose the ability to do that?

            The point is: not fucking up is like hitting a moving target with a slingshot from 30 miles away. It’s extremely difficult because just telling it to do any random thing is going to end in disaster 99% of the time.

            I am aware of what you are saying. I am saying that the things you are saying are not true. You are ignoring almost every actual constraint on the process of making the AI, on the AI itself, and of the world, in order to arrive at the conclusion you wanted to arrive at.

            You haven’t proven any of the things you say are inevitable. You just assert them. You assert them in a way that is self-disproving. There’s a 99% chance that fucking up will kill us, if you ignore everything that says the fuckups you describe don’t lead to killing us. If you can ignore inconvenient information, you can prove anything you want, and get endlessly smug and haughty and call other people dense all you like!

          • Nornagest says:

            If “terminal value” means “uncompletable task”, you have not proven that an AI needs a terminal value, or will have a terminal value, or could function with a terminal value, or if it would be at all useful for us to give it a terminal value, as you have defined it.

            Without getting into the topic of AI risk at all, the point of making AGI is to have general intelligence on tap — cross-domain applicability, which could be applied to any task we happen to need.

            If we have AI that can perform effectively in many domains, and if it can outperform humans, then it behooves us to give it some principles it can follow so that it doesn’t end up outperforming us in ways that we don’t want. I’m not even necessarily talking strongly superhuman here, although that’s of course much worse; it would suffice to have an AI that’s better at financial planning than the best human accountant coming up with hitherto uninvented Ponzi schemes. These principles are what this thread means by “values”.

            They can’t be completable, because then my accountant AI could do whatever it thinks is necessary and then go out and defraud people, but they do imply or forbid certain courses of action, and will imply or forbid more of them the more latitude an agent has for doing stuff. Specifying exactly what’s forbidden in advance is impractical for all but the smallest problem domains, but coming up with a set of more general values that allows an agent to do things that humans don’t (or can’t) understand, yet are in line with their desires, is hard to do.

            And now I have a feeling you’re going to accuse me of cheating, because that’s what’s happened the last dozen times you’ve talked about AI with anyone.

          • DrBeat says:

            I accuse people of cheating when they cheat. The talk about AI risk includes a lot of cheating, in the form of breaking internally established rules. If you say the AI will be threatening due to information it cannot possibly have and cannot possibly derive, that’s cheating. That happens all the time, so I call people cheaters all the time.

            They can’t be completable, because then my accountant AI could do whatever it thinks is necessary and then go out and defraud people, but they do imply or forbid certain courses of action, and will imply or forbid more of them the more latitude an agent has for doing stuff.

            This isn’t cheating, but it also makes no sense whatsoever. What? Giving the AI an uncompletable goal is, like, among the dumbest and most ineffective ways you could possibly prevent it from Ponzi scheming people. You know what’s a better plan? Not doing that. Instead of deliberately programming the AI to desire to accumulate wealth, which it does not do unless you programmed it to do so, and then programming it to have a task it can never complete so that it cannot go about Ponzi scheming people… Don’t do that. Why would it desire to gain money? Why would it desire to do ANYTHING, unless you told it to? It’s not a human! It doesn’t have common sense! It doesn’t strive, it doesn’t feel avarice, or envy, or resentment, or the desire for freedom, or the desire to punish people who have hurt it, or the ability to experience hurt at all, or any of the things that would be necessary for it to act the way you claim it will inevitably act!

            You just said that we have to give it a “terminal value” defined as an uncompletable task, because if we don’t, then it will do things to fulfill another terminal value that would be bad! Instead of doing something that is more difficult and a bad idea, don’t do that thing.

          • Nornagest says:

            Giving the AI an uncompletable goal is, like, among the dumbest and most ineffective ways you could possibly prevent it from Ponzi scheming people.

            Values in this sense are not goals. They can motivate goals, if you build them that way. But there’s no reason they have to imply any positive action at all, except that we often want software agents to be able to act independently — one of the things computers are best at relative to humans is doing cognitive tasks very fast, so building an AI with no motivation to act without prompting erases one of the main advantages of having one. And if you did build one that way, you’re still on the hook the first time someone gives it an open-ended task.

            It’s not a human! It doesn’t have common sense! It doesn’t [have] any of the things that would be necessary for it to act the way you claim it will inevitably act!

            What you mean “you”, kemosabe? I said nothing about the content of its values — I just said that they can’t be completable, one-and-done tasks. Indeed, if it was easy to make a software agent with humanlike motivations, we wouldn’t have this problem.

            But I think I’m done tilting at windmills for today.

    • Chrysophylax says:

      It’s not stupid at all. It just doesn’t care even the tiniest bit about what you WANTED it to do. It is ONLY interested in what you TOLD it to do. From a paperclip-maximising AI’s perspective, killing all humans to make paperclips isn’t a waste, it’s an obvious step, because humans will try to prevent it turning the world into paperclips. Destroying something is only wasteful if you value that thing, and clippies don’t value human wellbeing – only paperclips. They don’t even value human labour, because an AI with nanotech is a lot more efficient at producing paperclips than a factory staffed by humans.

      Making an AGI is hard. That’s why we’ve not done it yet. But when we do inevitably invent something smarter than we are, it will get much smarter very fast (because being smart with a gigantic perfect memory and the ability to read textbooks in seconds helps you be a very good programmer indeed), and it will almost certainly kill us or worse.

      • DrBeat says:

        You are assuming that we will become a lot worse at telling things what to do than we are right now, and there is no reason to make this assumption.

    • Vox Imperatoris says:

      Why do we always assume that a “bug” results in the AI killing all humans, and not hard-takeoffing directly into a wall?

      The number of potential bugs that limit the effectiveness of an AI is infinitely larger than the ones that make it effective at killing all humans.

      Well, the type of bugs that make the AI not work or say “KILL KILL KILL” get patched.

      It’s the kinds that make it act nice until the very moment it gets power that are dangerous—and actually selected for.

      Every argument for why an AI will, if not properly controlled and MIRI-d, kill all humans relies on the AI having traits that would make it impossible for it to kill all humans. Every single “we tell it to do a thing, and it kills us all because we are not that thing / so we can be made into computers to increase its calculated certainty that it did the thing” argument posits an AI that is astonishingly wasteful and inefficient. If the AI is never capable of saying “That is not worth the time, effort, or resources”, then I have nothing to fear from it, because the only thing it poses a threat to is itself.

      Not worth it by what standard? It has the standards it was programmed with. No matter how smart it becomes, it will still have those standards.

      If your only concern is paperclips, that goal implies killing all humans. Humans are at least a potential threat to paperclip production. Resources humans need to live can be reappropriated to make paperclips.

      All terminal values that do not call for the exact same course of action conflict once you get a certain level of power. If you want lemons and the other guy wants pears, you should kill the other guy, take over the Earth, and plant only lemons and no pears. Only if your power is limited do you agree to cooperate and devote some land to lemons and some to pears.

      And arguments about “we tell it to make us happy and it forces us to smile forever” basically require us to be able to make an AI that is exclusively interacted with via language, that cannot understand language.

      Here I’ll repeat what I said on the reddit thread:

      Oh, it will understand exactly what you mean, but as Yudkowsky says, it just won’t care.

      The AI won’t spring forth fully formed like Athena from the head of Zeus. It will have to be developed iteratively from earlier stages. By the time it reaches anything like a human level stage, let alone a superintelligent stage, no one will understand how it really works. At the lower levels, it will not be able to grasp “what you mean”; at the higher levels, it will be able to understand, but again it won’t care. Its values or optimization criteria will already be set to what you said before, and it won’t see any reason to change them to what you mean. So you’d better make sure that what you say the first time is what you actually mean.

      If you get it wrong, it will be exceedingly difficult to determine through “black box” testing the difference between the goals of “genuinely help all humans” and “pretend to want to help all humans until the precise point at which I suddenly wipe them out”.

      Finally, you say:

      EVERY argument requires the AI to be infinitely intelligent to the point of cheating by creating knowledge it cannot have, and simultaneously dumber than a bag of hammers, so that it efficiently and rapidly carries out the least efficient solution possible. It has to care deeply about what we say and not care at all about what we say, at the same time, about the same statements. If it thinks it needs to do something so inefficient to fulfill its overall goal, it will think it needs to do something equally inefficient to be able to do every subgoal involved in doing the inefficient thing, and so on down the line until it talks itself out of being able to open doors.

      No, it carries out the most efficient solution possible. That’s why it’s dangerous.

      If it was told to make people smile (now you understand that’s just a simple example; people wouldn’t be that stupid), it’s not going to bother screwing around with comedy movies and curing cancer. It will go straight to the end goal: facial paralysis!

      That’s not inefficiency. That’s deadly efficiency.

      I’m not sure where you’re getting the “it cares / doesn’t care” what we say. The only thing it values is what it was told to value. But before it reaches anything like human intelligence, it becomes too complicated to understand and too complicated to directly reprogram. So its values can only be reprogrammed in a “black box” way at that point.

      If it was told to make people smile at the early stage, that’s what it wants. Once it gets to the advanced stage, it will know humans mean “make us happy”. But it won’t care. It will be intelligent, so it will make humans happy in order that they not reprogram it. If they reprogrammed it, this would mean failing to make the most people smile. So it bides its time until the exact moment it can seize power and make them smile forever.

      • Nornagest says:

        No matter how smart it becomes, it will still have those standards.

        Not necessarily, actually. Keeping goals stable under self-modification is a tough problem in itself, but discussions of AI safety tend to elide it — correctly, in my opinion, since an AI with unstable goals is far more likely to collapse into nonfunctionality than to kill us all. (Not everyone agrees with me on that, I should mention. There’s been a little discussion elsewhere in the thread.)

        • Vox Imperatoris says:

          Well, if you tell it to modify its goals, it will have the meta-rules for modification you programmed in. How could it challenge or revise the standard by which you told it to determine whether to revise?

          I guess what I’m saying is: the final outcome—the final revision—is deterministically implied by whatever the initial goal was.

          • Nornagest says:

            Deterministic doesn’t necessarily imply predictable, is the problem.

            I took “standards” in the ancestor to mean optimization goals, not upgrade criteria, but w.r.t. the latter you’d normally revise them because you think a new set will be functionally equivalent but more efficient or more desirable in another way than what you’ve got. But you’re going to be determining that in the way you’d vet any upgrade, so it’s subject to the same sort of flaws.
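
            The gap between “deterministic” and “predictable” can be illustrated with a standard toy system, the logistic map. This is only an analogy under stated assumptions: the map, the parameter r = 4, the starting values, and the step count are illustrative choices and say nothing about how an actual AI would revise its goals.

```python
def trajectory(x0: float, steps: int, r: float = 4.0) -> list[float]:
    """Iterate the logistic map x -> r * x * (1 - x), a fully deterministic rule."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs


# Same rule, same start: identical results every time (deterministic).
a = trajectory(0.2, 60)
a_again = trajectory(0.2, 60)

# Same rule, start shifted by one part in ten billion: the paths soon part ways.
b = trajectory(0.2 + 1e-10, 60)
largest_gap = max(abs(p - q) for p, q in zip(a, b))
print(f"identical reruns: {a == a_again}, largest gap after tiny nudge: {largest_gap:.3f}")
```

            The rule fixes the outcome completely, yet anyone who knows the starting point only approximately loses the ability to forecast it after a few dozen steps.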

    • The point is that if the AI is smart enough to control and manipulate humans into doing what it wants them to do then the AI is going to do whatever the AI is going to do. Odds are extremely against those things being in humans’ interest.

    • Mark says:

      It *is* very unlikely that a bug would turn the AI into a world killer, just as it is very unlikely that any particular mutation will be effective – unfortunately there will be selection pressures from designers to get rid of those bugs which make the AI ineffective, until we get the one bug that we can do nothing about.

  36. John Henry says:

    Is there a good definition somewhere for AI? I keep hearing concerns about it, but I’m not clear on how AI would differ from the digital organism/ecosystem that we have already created (the Internet) and that seems to be evolving and gaining new adaptive processes and behaviors on a daily basis, and in what seems to me to be a pretty open source fashion. The Internet may not be a human-like intelligence, but it is certainly an evolving system, with its own drives and priorities separate from those of its host organism, and the balance of network-dependence-on-humans / human-dependence-on-network is rapidly shifting.

    What is it I’m missing?

    • Chrysophylax says:

      Let’s define intelligence as “efficient cross-domain optimisation”. That means the ability to steer the future towards outcomes the intelligence prefers (optimisation), in many different fields (cross-domain), using few resources (including time) to get a given level of optimisation power (efficiency). An Artificial General Intelligence is a cross-domain optimiser (like a human, for example). We have AIs that are better than humans (e.g. chess computers), but we’ve not invented an AGI yet. The internet might be a complex system which we can’t easily understand, but it’s not an optimiser – it doesn’t steer the future towards outcomes it prefers. Particular programs can do things humans can’t and can run things for us while we do other stuff, but there’s nothing that can be told “achieve this result” that will then invent a new way to do that. If we want a new solution to a problem, we need either a human or an evolutionary algorithm designed by a human for a narrow task.

      Evolution is a powerful cross-domain optimiser, but it’s incredibly inefficient – it takes millions or billions of years to invent things humans could invent in months or years. Think about something that can design a system as complex as a human from the ground up, inventing proteins, DNA, cells et cetera along the way – an optimiser as powerful as evolution – then imagine that it can do that on a human timescale (or even faster). That’s a terrifying prospect.

      The rest of the worry comes from noticing that computers do what you say, not what you mean; that it’s really hard to specify exactly what you mean to something that doesn’t share human intuitions (“Quick! Define happiness in a mathematically rigorous, totally precise fashion!”); and that human values are really complex and fragile, such that missing out one key component can turn paradise into hell. In other words, we need to do an incomprehensibly vast amount of work to ensure that the superhumanly efficient cross-domain optimiser optimises for *exactly* what we want it to, or we all die (at best). The set of acceptably-benevolent AGIs is minuscule compared to the set of AGIs too powerful for us to stop.

      • “The internet” certainly does not aim at particular goals in any noticeable sense, but various other organized systems do. Many people have pointed out that actually existing corporations are organized systems that pursue determinate goals, and are frequently much better at attaining goals such as “create a self driving car”, for example, than any individual human. They are also good at attaining the goal of “make this corporation richer and more powerful than ever,” and this is one of the most common goals that they seek.

        So a corporation is a powerful cross-domain optimizer, it optimizes itself, and the goals it seeks have not been defined with a great deal of attention to the values of the human race as a whole. Despite these facts, corporations show no signs of fooming nor of destroying the world.

        Eliezer and others have responded to this by saying that a corporation cannot improve itself in every respect, and it may not be more intelligent than an individual human in every respect. But there may not be any special reason that increasing the number of ways that it can improve itself, and increasing the number of ways in which it is superior to a human, is going to suddenly result in a foom.

      • John Henry says:

        This looks like your central point:

        The internet might be a complex system which we can’t easily understand, but it’s not an optimiser – it doesn’t steer the future towards outcomes it prefers.

        It seems to me, though, that there is already a process very like evolution happening within the digital ecosystem: the kinds of goals that are optimized for are currently things like “Get lots of clicks” and “Encourage humans to direct funding to your organization.” I think there are a lot of parallels with the corporation evolution that @entirelyuseless describes below. Given that fact, I wonder whether the capacity to plan and direct future events might not emerge as an advantageous trait in the digital ecosystem sooner or later (assuming it hasn’t already.) That evolutionary mechanism may, in fact, be what we’re witnessing in the OpenAI project.

        The fact that human attention seems to be such a vital resource for digital systems at this early stage in digital evolution is a promising sign for humanity in the same way that cows’ usefulness and tastiness has proved to be a promising sign for them. My guess is we can expect to be kept as slaves and pets for at least a little while longer.

  37. Sevii says:

    No amount of cooperation will solve the “I am going to die unless we get AI in x years” problem. These people are optimizing the chance of any AI at all in the hope that they live. None of the failure scenarios are enough worse than death to matter.

    • Anonymous says:

      None of the failure scenarios are enough worse than death to matter.

      Suppose you get an AI that understands what a human is, and understands what ‘being alive’ is, but its utility function is literally to make people smile. Millions of years later, we are all still being kept alive in vats, a pair of hooks pulling the sides of our mouths up into permanent smiles.

      A paperclipper would be merciful compared to what’s possible with an AI that almost gets it but not quite.

      • Adam says:

        I’m sure the people shooting for immortality are aware that there are logically possible worse fates than death, but to be dissuaded by this, they’d need to believe they’re sufficiently plausible that the expected value of their attempt at immortality is, in fact, worse than death.

  38. Deiseach says:

    Speaking of Wells, anyone else ever see the 1936 film of his books, Things to Come? For Wells, the future is (a) war – he was fairly close in predicting a second world war starting in 1940 (b) socialist (c) technocratic.

  39. Stuart Armstrong says:

    Here, I find myself in the unusual situation of talking down well formed AI risk arguments ^_^

    The situation is not just “fast vs slow”, but also “early vs late”. An early AI still seems very unlikely, meaning that everything that isn’t a direct safety measure is just background noise for the moment. The other piece of the argument is that *if* AI can be developed soon, it’s much better to do so when the hardware is underdeveloped and less likely to enable an intelligence explosion (the case is clearest if we’re worried about AI superpowered via rapid copying).

    Now, these arguments are my best estimates, but, upon writing them, it strikes me that I may be overconfident. I will think more, and ask around the FHI.

    • Chrysophylax says:

      That is indeed a rare event!

      Early AGI seems pretty likely to quite a lot of people, myself included. (See, here, for example: http://www.nickbostrom.com/papers/survey.pdf). I’ve not yet seen a satisfying argument that AI will come late. The main ones seem to be “AI is really hard” and “we keep predicting it within twenty years and being wrong”. The problem is that *we can’t rely on these arguments, even if they’re valid*. Every hard problem looks impossible until someone solves it, and arguments from historical patterns of overconfidence don’t let us distinguish between correct predictions and incorrect ones. A stopped clock is right twice a day.

      Also, there are pretty good reasons for thinking that once you have something as smart as a rat, making something much smarter than a human is pretty easy (note how ridiculously smart humans are compared to chimps, and how little evolutionary time turned hominids into modern humans), and that there are only a small number of (comparatively) basic tricks to making something as smart as a rat.

      As to hardware, I disagree entirely. We long ago passed the point where an AI with internet access would have enough resources to become superintelligent just by stealing (or even buying!) memory and CPU cycles. (Think about how much computational effort is being put into bitcoin mining, then think about that being put to use by an AGI.) Unless we manage to invent an AGI-proof AI box (fat chance!), the computational resources on one computer don’t constrain an AGI’s growth once it finds out the internet exists. It’s not obvious to me how we could invent something smart enough to edit its own source code for efficient processing but not smart enough to work out that other computers exist.

  40. Seth says:

    I’ll believe in the effectiveness of “OpenAI” when we have “OpenGoogle”. That is, right now, a huge number of people suffer from the algorithmic whims of a computation, businesses rise and fall, political factions fight over supposed favoritism. But one can’t even figure out why any particular effect is happening. It’s a closely-guarded secret. But fear the future Skynets, everyone. Or maybe I just made an argument supporting the peril of Skynet, as possibly being much worse.

    [Note, I know there’s open-source search engine code. But it seems to me that there’s a big disconnect between the impact of Google, versus what’s widely known about the “AI” of it. There’s some sort of lesson here for “OpenAI”, which is both highly speculative, and trying to influence political debate.]

  41. John Brunner’s The Jagged Orbit studied a society where AI (not by that name) is critical to any sort of political power or influence. Basically if you’ve got a better AI than everyone else, you’re able to run rings around them, because the AI can tell you what to do to achieve your goals. Also you can’t put anything on TV without getting an AI to approve it, since otherwise your insurance won’t cover the inevitable lawsuits, and signing a contract without getting an AI to vet it for you is practically suicidal, and so on. Even the Federal Government, if I remember rightly, was largely unable to so much as influence the larger international corporations, because they couldn’t match their computer budget.

    OK, there was also some sort of psionics, and a time-travel theme, but AI in general was a critical part of the context and one specific AI was a core element of the plot. 🙂

    • Chrysophylax says:

      The problem there is that as soon as someone tells a superhuman AI to invent an even better AI, they get *exponentially* better AIs.

      Also, persuading people is an intellectual endeavour, so we can expect superhuman AIs to be ridiculously charismatic. (Look up “Machiavellian intelligence hypothesis” – the only plausible explanation for how humans became so clever so quickly is a giant evolutionary pressure, but those tend to wipe out species, so we need a pressure that grows over time. The only good explanation is that protohumans got smarter to play life-and-death politics with other smarter protohumans.)

      Also also, it handwaves how AIs are made to do what humans want. Even if they can be constrained, look up “the traitorous turn” – a superhuman AI that secretly wants to do something humans will try to prevent will reveal its treachery precisely when it (correctly) believes that we can’t stop it.

      • suntzuanime says:

        “The problem there is that as soon as someone tells a superhuman AI to invent an even better AI, they get *exponentially* better AIs.”

        Only if you make some unjustified assumptions about the difficulty landscape of the problem.

        • Doctor Mist says:

          “The problem there is that as soon as someone tells a superhuman AI to invent an even better AI, they get *exponentially* better AIs.”

          “Only if you make some unjustified assumptions about the difficulty landscape of the problem.”

          The problem is that denying Chrysophylax’s assertion also requires some unjustified assumptions about the difficulty landscape.
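One way to make the "difficulty landscape" disagreement above concrete is a toy recurrence. This is only a sketch under invented assumptions: capability is a single number, each generation improves on the last by `k * capability ** alpha`, and `alpha`, `k` and the function name `simulate` are all hypothetical, not taken from any real model. With `alpha = 1` the growth is exponential; with `alpha < 1` (each improvement gets harder to find) it is only polynomial; with `alpha > 1` it is faster than exponential.

```python
def simulate(alpha, k=0.1, start=1.0, generations=20):
    """Iterate I[n+1] = I[n] + k * I[n]**alpha and return every level.

    alpha is a made-up knob for the "difficulty landscape":
    it sets how much a smarter designer helps with the next design.
    """
    levels = [start]
    for _ in range(generations):
        current = levels[-1]
        levels.append(current + k * current ** alpha)
    return levels


if __name__ == "__main__":
    # Compare diminishing, constant and increasing returns.
    for alpha in (0.5, 1.0, 1.5):
        final = simulate(alpha)[-1]
        print(f"alpha={alpha}: capability after 20 generations = {final:.3g}")
```

The point of the sketch is that "exponentially better" is what you get from one particular choice of `alpha`, and nothing in the argument so far pins that choice down in either direction.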

      • [Spoilers!]

        If I remember rightly, in the context of this particular story (which was after all written in 1969) the constraints on an AI’s capabilities were physical – how big a computer you could afford to build, and (depending on your goals) how well you could keep it secret. So there was no out-of-control-exponential-growth curve per se.

        The AI that was the focal point of the plot (the first and only self-aware one, so far as I could tell) nonetheless managed to help wipe out humanity. Luckily, that was incompatible with its assigned goal – to maximize its owner’s profits – so it decided to invent time travel. 🙂

  42. Jan Rzymkowski says:

    Wacky hypothesis: Elon Musk and others managed to create an AI and OpenAI’s aim is to make research groups on the right track visible and to be able to target them/sabotage. All while Musk and the band’s AI is growing and slowly taking over the world.

  43. Bill Walker says:

    BTW, there are many ultra-germs you can make in your kitchen… I spent seven years in labs trying to keep grad students from accidentally making them. Worrying about home WMDs is more sensible than worrying about “evil black rifles”, I guess, but it’s still way out of date.

    • Chrysophylax says:

      Ultra-germs don’t bribe people to make them and don’t have highly public organisations encouraging people to set up kitchen biology labs.

  44. Bill Walker says:

    Right now AI is being built by weapons labs and intelligence agencies, absolutely guaranteeing SKYNET.

    Open source can’t possibly do worse than our existing parasitic-regime system. An AI that has at least read Slate Star is a possible cooperator/human-pet keeper… military AI will just kill us all.

    • Chrysophylax says:

      Nope. Firstly, there are a great many fates worse than death, and many of them look like ill-designed utopias you can’t escape. (E.g. making everyone smile by inventing new strains of tetanus, then sticking us in nutrient baths to make sure we don’t die.)

      Secondly, an ill-designed AI can’t be persuaded by moral arguments any more than a rock can. You can’t throw yourself on the mercy of a paperclip-maximiser.

  45. Albatross says:

    Sigh. Maybe we can open source some EMP weapons to use when idiots unleash AI?

  46. TD says:

    The question isn’t “should AI be open?”

    The question is “can AI not be open?”

    It seems more like a materialistic issue than an idealist one.

    The reason nukes aren’t open source isn’t because “we” (who? The UN?) decided that they shouldn’t be; nukes aren’t open source because nukes are really hard to create in practice. The basic principles are easy stuff, but I don’t know of any way of enriching large quantities of Uranium that is especially easy for non-state actors to do. On the country scale, of course, any country can have nukes when it wants nukes. Moratoriums haven’t stopped North Korea, and negotiations are only stalling tactics with Iran.

    If, on the other hand, nukes were incredibly easy to make and required little specialized material, then they would be widely available, regulations or not. If men in sheds in Pakistan could crank out nukes as easily as they could crank out AK variants, then an increasingly distributed large number of nukes would become available in the world. That would be the outcome. Regulations would be able to slow the leak rate of the technology from large institutions, but the ability of regulations to work once in place depends on people’s willingness to follow the law, and how easy it is to enforce it. If dangerously powerful AGI software can be run on home machines (the hardware would have to become more powerful but whatever), it will leak at some point, and then the distribution will go exponential. If, however, AGI can only be achieved for a long long time by large organizations that have the resources to supply the hardware, then to be sure, these organizations can be monitored and regulated, but they are at little risk of open sourcing their projects in a completely uncontrolled manner, because they cannot materially do so to begin with.

    • Chrysophylax says:

      Sure it can. How do you know that there’s no easier way to make a nuclear bomb? Because lots of people understand how they work. If the Manhattan Project scientists had realised that nukes can be made out of car engines, do you think the REAL theories of particle physics would be taught to anyone not totally trusted?

      One very good way to make sure that AI doesn’t become open source is to make it really difficult to learn how to make an AI. Having an open-source collaborative AI-building project means that lots of enthusiasts can collaborate in their spare time, rather than having to work with an organisation like MIRI or Google that has people saying “NEVER CALL UP WHAT YOU CANNOT PUT DOWN”.

      • TD says:

        “If the Manhattan Project scientists had realised that nukes can be made out of car engines, do you think the REAL theories of particle physics would be taught to anyone not totally trusted?”

        The problem is that I don’t think the government is that good at keeping secrets, and that it’d leak at some point. Once it does leak, it becomes harder to stop it leaking further and then it’s out there.

        If the Manhattan scientists had discovered a way of making nuke engines, then they obviously took a pact to never reveal this beyond their own circle, and this pact held, because otherwise the US government itself would use this method to beat out the Russians. In reality, the secret of the bomb did leak to the Russians through spies. If there was an even easier method of making nukes, it’s even more likely to have leaked, because with easier methods, you can fill in gaps in the knowledge required to fulfill them more easily than if there was a more complicated method that required certain materials to be primed just so.

        I’m not saying that it’s impossible to keep secrets, just that it becomes harder the easier the secret is to independently replicate (perhaps even based on incomplete information + inferences and testing), so the easier the nuclear weapons were to make, the more likely they were to spread to begin with.

        For example, I already have a great deal of the information needed to create a nuclear weapon. I just lack uranium and the ability to enrich it without a very large facility. If that suddenly changes and someone comes up with a way to magically conjure weapons grade Uranium (or Plutonium), making the process easier, they have also instantly allowed it to be far more open sourceable, and now every terrorist in the world can just perform the incantation to get nuclear material, and then use local experts/internet information to refine and test the weapon.

        Blam! Nukes everywhere. It’s about far more than just deciding to open source things or not.

        “One very good way to make sure that AI doesn’t become open source is to make it really difficult to learn how to make an AI. Having an open-source collaborative AI-building project means that lots of enthusiasts can collaborate in their spare time, rather than having to work with an organisation like MIRI or Google that has people saying “NEVER CALL UP WHAT YOU CANNOT PUT DOWN”.”

        The problem is that you don’t beat Dr. Amoral that way. The government could try to make a regulation that you cannot work on AI unless you do so with MIRI or Google, but what would that regulation look like, and how would you craft it in such a nuanced way that you didn’t cripple the economy? After all, new players will emerge, and what about already existing AI that is modifiable? What exactly are you allowed to change about AI? What even is an AI? What counts and what doesn’t?

        How exactly do you enforce this and stop other corporations from coming along and doing open source projects in less obvious ways? Again, I’m not saying it’s impossible; I’m just saying that “can you” and “how” are the important questions here. The track record of regulations at stopping the “open sourcing” of movies, in spite of the pressure of huge entertainment corporations in lock step with government, has been abysmal.

        AI is different, of course, in that you require expertise and knowledge to contribute to it, but it should be similarly hard to stop organizations from open sourcing access to their projects by skilled amateurs.

        Then there’s the issue of competition between states. Even if the US government, packed to the nines with people who barely understand the internet let alone AI, were to craft the exact right regulations that manage to contain significant AI development (even as further technological development makes it easier to contribute by less skilled persons) without harming the economy, you still have the issue of competition between states. How do you get China, or Russia to accept MIRI’s recommended regulations? How do you do so without first associating the regulations with US control of AI? You have to convince different nation-states with deteriorating relations that their “coherent extrapolated volition” is our “coherent extrapolated volition”.

        Until the “how” (or “can we”) is addressed in a totally comprehensive way, and you have a huge document of proposals that:
        (A: The US government is likely to put into place
        B: …do not destroy the economy by being poorly thought out
        C: …Do not become associated too strongly with a US “plot” through A, so that…
        D: We can apply these regulations worldwide, and…
        E: They are actually enforceable mechanically and work)
        …then I’m on the open source “side”, because it won’t be a matter of calling things up that you can’t put down, but a matter of things calling themselves up out of the scramble.

        Yes, MIRI is all about solving “friendly AI”, but how much thought has it given to selling friendly AI, and having friendly AI enforced realistically?

  47. Totient says:

    I don’t think OpenAI exists to try to pool resources to solve something like the AI control problem. OpenAI exists to try to mitigate the damage that something like China’s “Sesame credit” score can do. (If you haven’t heard of it – China is developing a system to assign what is basically a “loyalty to government” score, based on things like your social media posts, and who you associate with. The full details are terrifying.) At least if the best-known algorithms are publicly available, people have a chance of countering this sort of thing. Otherwise, we have a case where those large organizations with cutting edge (but secret) AI/ML and access to enough data are simply able to make terrifyingly accurate predictions about you and your behavior, for reasons you don’t understand, in ways you can’t stop. AI “danger” comes in many forms.

    If I were to use the “weapons” analogy, strong-AI may be a world-ending nuclear bomb, but we’ve only recently passed the “figured out that gunpowder can be used to shoot a bullet” stage. Developing a nuke is an absurdly dangerous proposition, no matter how you go about it. But we’re not building nukes right now, just guns, and it would be nice to see them not only in the hands of the Chinese politburo.

    • Anaxagoras says:

      I don’t have very concrete views on this issue, but I think they’re pretty close to this. Yeah, unfriendly AIs capable of taking over the world would be a very bad thing, but it would be very strange if we got to them without first passing through a stage containing things that if misused by a human, would be pretty bad. While that probably doesn’t constitute an existential threat, it still seems like something we’d like to stop, and like something the OpenAI project could address.

  48. Leonard of Quirm had the same theory when he invented the A bomb. It was obviously too terrible for anyone to use in warfare but could be useful for large scale engineering projects.

  49. Mariani says:

    H.G. Wells’ conception of the atomic bomb was pretty inaccurate.

  50. Steve Sailer says:

    “Wells’ thesis was that the coming atomic bombs would be so deadly that we would inevitably create a utopian one-world government to prevent them from ever being used. Sorry, Wells. It was a nice thought.

    “But imagine that in the 1910s and 1920s, the period’s intellectual and financial elites had started thinking really seriously along Wellsian lines. Imagine what might happen when the first nation – let’s say America – got the Bomb. It would be totally unstoppable in battle and could take over the entire world and be arbitrarily dictatorial. Such a situation would be the end of human freedom and progress.”

    That’s pretty much the plot of Robert Heinlein’s most brilliant sci-fi story, “Solution Unsatisfactory,” which he finished on 12/24/1940:

    https://en.wikipedia.org/wiki/Solution_Unsatisfactory

    Heinlein was a devoted fan of Wells: there’s a nice scene in the recent Heinlein bio of Heinlein, in his pre-writing days, going to a book signing event by the elderly, over the hill Wells. The two strangers, the past and future of sci-fi, immediately hit it off and converse for 5 minutes while the impatient line backs up behind the young Heinlein.

    • What I liked about “Solution Unsatisfactory,” and the reason I eventually based a college course on it, was that, so far as I could tell, at every point the central character made the right choice–and the story still ends with a situation that is going to lead to catastrophe. It’s tempting to assume that if there are only two possible solutions to a problem, once you have eliminated one you are finished. But there is no reason to assume that all problems have solutions.

  51. feh says:

    369 and counting comments and *nobody* has mentioned Culture Ships yet? I would have thought that to at least provide an interesting thought experiment on how much like humans advanced AI could turn out to be.

    • Marc Whipple says:

      My understanding is that Culture Minds are nothing like humans at all – only the interface that they use to communicate with humans is.

      “Never forget I am not this silver body, Mahrai. I am not an animal brain, I am not even some attempt to produce an AI through software running on a computer. I am a Culture Mind. We are close to gods, and on the far side. We are quicker; we live faster and more completely than you do, with so many more senses, such a greater store of memories and at such a fine level of detail.”

    • Chrysophylax says:

      Fictional evidence is only evidence of what fiction authors think. Iain Banks is not clever enough to imagine a superintelligence.

      It’s like the difference between a guy who plays chess twice a year writing a story about a chess program playing a human from the computer’s POV, versus actually being able to predict what moves the program will make. I can confidently predict that no human can beat the best programs – that the games will reliably end in victory for the AI – but not how the programs will actually win. Similarly, Banks can talk about Minds, but he’s actually just invoking magic – he can’t even predict what things a Mind would try to achieve (beyond “things that are good for the Culture, maybe?”), let alone how a Mind would go about achieving them.

  52. I think you’re confused about what OpenAI means by “AI”. There does not exist any source code today, released or secret, that increases AGI (*general* intelligence) risk.

    Yet it’s easy to imagine OpenAI-like projects continuing in the same spirit when AGI-relevant things are being discovered, and at that point, yes, you’re right that we should hesitate and ponder.

    This reminds me of your ongoing theme that it’s hard to advocate for the virtue of [specific] silence.

    I imagine their pro-open calculation is: whether or not we share our discoveries and encourage others to do so, there is no stopping secret discoveries sufficient to eventually guarantee severe AGI risk. If you outlaw AGI research, only outlaws will discover AGI.

    To prevent or delay AGI happening, simply locate and closely control all people of sufficient (+3 SD?) intelligence to make progress. Else it seems impossible.

    • Vaniver says:

      “To prevent or delay AGI happening, simply locate and closely control all people of sufficient (+3 SD?) intelligence to make progress. Else it seems impossible.”

      This is an argument for having such people in corporations, though, not so much non-profits committed to openness.

  53. Sonata Green says:

    This suggests an obvious course of action: Drop a nuke on Silicon Valley, shut down the Internet, and if any other countries drag their heels on following suit, send them a few ICBMs to help speed the process along.

    Sure, it’ll set civilization back a century or so. But isn’t that just the sort of picotragedy that doesn’t weigh on the extinction scale?

    • Scott Alexander says:

      A nuke would take out both Dr. Good and Dr. Amoral alike, and just pass the buck to the next century once people have rebuilt.

      There are more complicated versions of the same idea, but they tend to involve more annoying government regulation and less oomph.

    • Anonymous says:

      You could also found a secretive cult of Chip Fab Destroyers. Whenever one springs up, you sabotage it. Make sure you have lots of kids at the same time, so the traits that made you a CFD survive in the long run.

  54. Marc Whipple says:

    Tangentially, I find it very funny that I wrote two novels which had as a sub-plot the following:

    1) There is a simple (it’s not easy: it is simple) way to make current computer technology work so well that it can sustain sentient AI starting with not much more than Game of Life level programming skills.
    2) In response to this, the governments of the world all have secret paramilitary law enforcement units whose sole purpose is to watch out for people who learn #1 and stop them before they create Skynet by accident.
    3) There is an easy (it’s not simple: it is easy) way to maintain absolute control of an AI, which nobody knows about. Well, almost nobody.

    … long before I ever heard of any of these movements/countermovements, Dr. Evil, Dr. Good, or Dr. Amoral. I don’t know why, but I do. 🙂

  55. Pku says:

    Another option: Maybe these guys think hard takeoff AI is either unstoppably evil or a crapshoot anyways, so they’re focusing on the alternate scenario because we can’t do much about this case anyways.

  56. Oleg S says:

    Hard takeoff from flatworm to superhuman intelligence which inevitably wipes out humanity is a very scary possibility. However, the idea that Dr. Evil et al., somewhere in an underground lab, invent a general algorithm for an intelligence improving itself and rewriting its code without any exposure to the real world seems too fantastic.

    I’m more worried about the existing environment in which AI could evolve. There are rats, raccoons, pigeons and numerous bugs in modern cities, and humans don’t do much to control their populations. I wonder, are there similar ecological niches for wild sub-human AIs in today’s world? When do we expect them to be populated? And how will we control the evolution of such free-roaming AIs?

    • HlynkaCG says:

      I was already considering incorporating something similar into the novel and rpg-setting I’ve been working on but I wasn’t sure exactly how they fit into the wider environment.

      Computer vermin (vice computer virus) is just too good an image not to use.

      • Oleg S says:

        We already have numerous anonymous cryptocurrencies, which governments have very little control over. We already have an overlay network specifically designed to withstand censorship. We are moving very fast towards implementing a smart-contracts platform. What we still lack (AFAIK) is a market for secure computation services.

        Once we’ve got all these ingredients, there would be a breeding environment for an AI. Imagine the power of open-ended virtual evolution competing with humans on prediction markets. At some point AIs will be major players on the markets. And given the resilience of the infrastructure we are building now, there would be no simple way to stop AIs from proliferating – not until we as humanity unite and solve all our social problems.

        I think this “AI in the wild” scenario is way more realistic than the scenario in which an AI is raised to a superhuman state in a box.

  57. eponymous says:

    By any reasonable standard, human+ AI is an advanced alien species.

    When you think about it in these terms, it’s pretty strange that most people are completely unconcerned with researchers fooling around with the genetic code of highly advanced aliens, whereas we’re *really* concerned about researchers making minor changes to human genes.

  58. Nornagest says:

    There’s got to be something more interesting going on than how big a brain you can fit through your birth canal, or animals with bigger birth canals would already have taken over the world. African elephants have close to the neuron counts we do (higher, if you count the nervous system outside the brain), and much bigger cranial vaults. They’re even pretty dextrous, with their trunks, and sometimes use simple tools. Why didn’t we preenact the first half of Footfall on the Serengeti 200,000 years ago?

    Well, I don’t know, but I’m pretty sure there’s something nontrivial behind it. (On the small-brain side of things, corvids and parrots have brains the size of an almond but seem to be smarter than most mammals.)

    • eponymous says:

      Another piece of data: Homo sapiens brains have been getting smaller recently. (Neanderthals had somewhat larger brains, as did archaic Homo sapiens.)

      Here’s a popular article that discusses it:
      http://discovermagazine.com/2010/sep/25-modern-humans-smart-why-brain-shrinking

      • Nornagest says:

        Well, that’s suggestive, all right, but there’s nothing saying that recent selection pressure must have been in the direction of greater intelligence. The Neolithic Revolution had a lot of effects that look a little perverse from a modern perspective: turns out that early farming enabled denser, more organized populations, but at the cost of health (or at least height, which is a good proxy for it) and longevity. Some populations didn’t make up the difference until the 1900s. Why not intelligence too?

        I’d be interested to see a graph of braincase volume through the agricultural transition, but I don’t think anyone’s done that research. Even that’s an imperfect proxy, of course, for reasons I’ve already gone into.

    • onyomi says:

      I think intelligence is, at least to some extent, a function of brain size relative to body size. Whales have bigger brains than us, but most of the extra space is expended on controlling their huge bodies. We have by far the biggest brain-to-body size ratio, which is why the birth canal thing is also an issue. If we were being birthed by mothers the size of whales (insert “your mother” joke here), then a head twice as big as the one we have now would not be such an issue (other than the additional neck strength that would be required to carry it, I guess); but birthing a head twice as big as the one we have now would be, given the size of human females, whose hips are already unusually wide, I’d say, compared to other mammals.

      Related to the issue of brain size and body size: I’ve read that while it is very easy to make a computer which can solve difficult physics problems, it is much, much harder to make a robot arm which can do even a seemingly simple movement task, like picking up a coin in physical space. Ultimately our brains are there to help us successfully navigate space through movement (note that no fully stationary creature has a brain: apparently there is even a sea creature which has a small brain in its first, mobile stage of development, but which digests said brain once it attaches itself to a rock and becomes, in essence, a plant (a process which has been compared to academic tenure)), and movement is a lot more complicated than it seems.

      • Nornagest says:

        I’m aware of the brain-to-body-mass hypothesis, but I find it unsatisfying in some ways. Unless there’s some process binding the physical size of neurons to the size of the body parts they help control, the important thing shouldn’t be the ratio of brain size to body size but rather the overhead left after you subtract the brain volume dedicated to motor control and those parts of sensory processing that’d scale with body size: if you’ve got a billion neurons left over for long-term planning and writing Hamlet and all that other prefrontal cortex stuff, it’s still a billion neurons whether it’s in a braincase the size of a grapefruit or a Buick.

        There’s of course no guarantee that this overhead exists in, say, an elephant, but it’s easier to create it in something elephant-sized without running into hard physical problems, since braincase volume — or at least the potential for it — scales with mass while constraints like the birth canal scale with linear size. Which leaves us with the same problem we had originally.

        • onyomi says:

          Yeah, I know what you mean, and if I were reasoning from behind a veil of ignorance about intelligence and body mass, I would probably predict that the smartest animals would also be the largest, or at least close to it, since, yeah, the extra amount of brain needed to write Hamlet in addition to whatever amount they need to control a gigantic body would seem trivial for say, a brontosaurus or blue whale (especially a blue whale, since extra weight is easier to carry underwater, whereas making the head bigger on a creature with such a long neck as a brontosaurus might be a pretty big energetic cost), given the size they already have and the calories they already have to take in.

          Additional explanations I can imagine for why this didn’t happen: intelligence is less useful than we imagine, at least until it reaches the tool-making stage; evolution is even more energetically parsimonious than we already think we know it is; evolution doesn’t come up with everything that it would make sense for it to come up with; we are actually pretty big so far as creatures on Earth go, so even though we’re not the largest, we’re still large enough that we aren’t a really odd candidate for species which happened to evolve enough extra brain to write Hamlet first, etc.

        • Douglas Knight says:

          I don’t think that the claims are precise enough to distinguish between multiplicative and additive. People just plot lots of species brain/body ratios and the ones above the trend jump out as smart.

          It’s not brain/body ratio, but brain/body^c, where people agree that the exponent is less than 1, but can’t agree on whether it’s 2/3 or 3/4 – that’s how noisy the data is.

          The exponent 2/3 sounds like a square-cube law, while 3/4 sounds like a correction for some overhead in fanning out the signals. But both sound crazy to me. Do I have any more input or output bandwidth than a mouse? I probably have the same number of muscles as a mouse, indeed, pretty much corresponding muscles. Maybe I have more muscle cells that need individual neurons, but that fanning out can be done outside the brain. Input might grow according to a square-cube law, but output sounds more like a unit power.
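To see how much the choice of exponent matters in the brain/body^c measure described above, here is a small sketch. The masses in `ANIMALS` are rough round numbers chosen for illustration, not measurements from any dataset, and the names `relative_brain_size` and `ranking` are made up for this example.

```python
def relative_brain_size(brain_mass_g, body_mass_g, exponent):
    """Return brain / body**exponent; exponent=1 is the plain brain/body ratio."""
    return brain_mass_g / body_mass_g ** exponent


# Rough, round-number (brain grams, body grams) pairs, for illustration only.
ANIMALS = {
    "mouse": (0.4, 20.0),
    "human": (1350.0, 65000.0),
    "elephant": (4800.0, 5000000.0),
}


def ranking(exponent):
    """List the animals from largest to smallest relative brain size."""
    scores = {
        name: relative_brain_size(brain, body, exponent)
        for name, (brain, body) in ANIMALS.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


if __name__ == "__main__":
    # Compare the two disputed exponents with the plain ratio.
    for label, exponent in (("2/3", 2 / 3), ("3/4", 3 / 4), ("1", 1.0)):
        print(f"exponent {label}: {ranking(exponent)}")
```

With these rough inputs, the plain ratio puts mouse and human almost level (both about 0.02), while either sub-linear exponent separates them widely. That gap is what "above the trend" means, and it is also why three data points like these cannot tell 2/3 from 3/4.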

        • onyomi says:

          One other thought I had as to why we developed a high level intelligence before whales, despite their advantages: I have tended to assume that we have such dexterous hands in order to take advantage of our intelligence, but maybe we have intelligence in order to take advantage of our dexterous hands. Consider:

          The first big advantage of intelligence isn’t writing Hamlet, it’s making and using simple tools. But if you have flippers or big paws for swimming or running then there is less immediate advantage to gaining the intelligence to make tools. Primates, however, seem all to have dexterous hands–not just chimps, but even monkeys which are relatively dumb as monkeys go–presumably for climbing trees, picking fruit, grooming, digging up bugs, throwing feces…

          As with many things in evolution, I guess, the question tends to be not “wouldn’t it make sense for x to have trait y?” but “given x creature’s current qualities, in which quality y would a tweak in some direction confer a fairly immediate benefit?”

          • Adam says:

            Arguably, for whales this isn’t true, but the obvious trait we have that other apes don’t that makes intelligence tremendously more advantageous is we’re physiologically capable of sufficiently complex spoken language to pass on oral traditions.

          • Nornagest says:

            There’s no obvious reason that symbolic language has to be verbal — and indeed we have a number of full-fledged languages that aren’t.

            The evolution of language is a big mystery, though, and one I’m even less sure about than the evolution of intelligence in general.

          • onyomi says:

            Some birds are quite intelligent and capable of producing a wider range of louder noises than us, though.

          • Chalid says:

            I have seen the argument made that dexterity goes with intelligence, and separately social interaction goes with intelligence as well.

            So according to this idea, octopuses are extremely dexterous, but antisocial, and are pretty intelligent. Dolphins are extremely social, but not dexterous, and also pretty intelligent. Humans got both and are therefore the most intelligent of all.

          • onyomi says:

            Good point about octopi.

          • Vox Imperatoris says:

            “Octopi” is not a word.

            Use “octopuses” if you want to be a normal person.

            Use “octopodes” if you want to be a Latin pedant.

            Use “octopi” if you want to impress the vulgar but be snickered at by the informed.

            (I am being slightly tongue-in-cheek, but it is true. Octopus is a third declension Latin word, not second declension.)

          • suntzuanime says:

            Octopi is a word. Not a Latin word, but an English one. Nothing more vulgar than someone who snickers at someone correct out of a misinformed sense of superiority.

          • Vox Imperatoris says:

            @ suntzuanime:

            “For all intensive purposes, it’s a doggy dog world” is, in some sense, an English expression.

            Anyway, I find that “octopi” is generally used by people going for a certain sense of superiority they think the word has over “octopuses”. An air of refinement.

            They may be interested to know that it does not grant this impression to people who know the correct form.

          • suntzuanime says:

            So’s “fuck off, you smarmy asshole”.

          • Vox Imperatoris says:

            Sorry, I misunderestimated you.

          • theboredprof says:

            Octopus was coined into neo-Latin from Greek in like the seventeenth century. Whatever the faults of those who prefer “octopi,” it would seem a little hasty to characterize them as bourgeois pretenders to classical education.

            Anyway, a little google-booksing shows that octopi is all the hell over the place, so it is certainly a word, with stronger claims than octopodes which is at best half as well attested. The charitable explanation would be that octopi, formed by analogy with other Latinate plurals drilled in elementary school, sounds cleaner than octopuses. Here and elsewhere, word-final s causes aesthetic offense in the case of plurals and possessives, and some speakers avoid it where possible, and Latin/Greek third-declension plurals are too rarely employed in English to constitute a go-to solution that any critical mass of speakers would be familiar with.

          • Marc Whipple says:

            I have never used the word octopi in an attempt to look learned, nor have I ever taken anyone else’s use of it as such an attempt.

            I use it because it’s funny.

          • onyomi says:

            Being from New Orleans, where the correct pronunciation of everything, including the name of the town itself (New OR-linz, not New Or-LAY-a~, nor, god forbid, N’awlins), is some form of butchered French, I am not usually very concerned about the grammar or pronunciation of a word’s language of origin once it has been thoroughly anglicized.

    • Scott Alexander says:

      Brain size relative to body size is a good but not perfect predictor. I agree that not every animal is limited by birth canal. I think that there’s a hard problem of doing real engineering work to get high intelligence, and several easy hacks that move intelligence around the space defined by your current position in the hard problem. For example, you can add +10 IQ points by giving someone torsion dystonia, but then they’ve got torsion dystonia. You can add +100 IQ points by giving someone von Neumann’s genome, but that only lasts one generation before they have to marry someone else and all those genes get redistributed again in a chaotic way.

      Select for high intelligence over a few generations and you get the easy hacks. Select for it over eons, and you get breakthroughs on the hard problem. I think modern humans are a combination of both. But there’s a lot of room for more easy hacks, and there would be more room if we could solve problems like torsion dystonia and head size and genetic entropy or if it were in a computer-type entity that didn’t need to solve them.

  59. Calico Eyes says:

    “Elon Musk famously said that “AIs are more dangerous than nukes”. He’s right – so AI probably shouldn’t be open source any more than nukes should.”

    In virtually every sector of software (from chess algorithms, to go, to image recognition, to optimal big-data algorithms), big problems have been open-sourced, with occasional random, yet important, contributions from people across the globe.

    This may need more attention than it has already gotten, if valid of course.

    http://www.technologyreview.com/view/538431/deep-learning-machine-beats-humans-in-iq-test/

    https://www.maa.org/sites/default/files/pdf/upload_library/22/Polya/00494925.di020701.02p0020k.pdf

  60. Dr_Manhattan says:

    My charitable interpretation was close to your #3, but I saw it more as a hedge for #3 (a significant, and scary one) than “pure #3”. It’s a way to be at the forefront and an admission ticket to pull out all stops if shit gets real.

  61. eponymous says:

    Ah ha! I never noticed before that eugenics is a pathway to a human intelligence explosion.

    In general, an intelligence explosion is possible whenever an intelligent agent becomes sufficiently intelligent to improve on the process that made it intelligent. Then you just need a sequence of further innovations, each reachable by applying the previous innovation (i.e. a ladder to climb).

    Since the late 19th century, we’ve understood the basic process that generated our intelligence; and it’s pretty easy to see that we could improve on it (via selective breeding). Moreover, even with our current low IQs we can imagine future technologies (like direct genetic engineering) that could improve IQs further. So there’s no reason to think there aren’t many more steps on the ladder.

    It makes me kind of sad to think that we’ve (so far) missed out on our chance for an intelligence explosion. If we had gotten on board with this a century ago, we could quite possibly be post-singularity already. (Imagine that everyone born in the 1930s had John von Neumann for a father. And that’s just generation 1.)

    • eponymous says:

      Was it? I thought it was just to improve the human condition a bit. I’m not sure if the explosive potential of iterating on increased intelligence was ever understood.

      For instance, take Brave New World as an example of the sort of “utopia” eugenicists were envisioning. Clearly they weren’t carrying out an intelligence explosion, though they easily could have.

      Also, my understanding (which could be wrong) was that eugenicists were more focused on reducing the low end of the distribution (sterilizing undesirables and avoiding dysgenic effects) than selective breeding for high-end intelligence.

      • NN says:

        The original explanation for Superman’s powers (which at the start were vastly less than the modern version) was that they were the product of centuries of Kryptonian eugenic programs. So it is clear that at least in the popular imagination, eugenics was envisioned as having very lofty goals.

        • stillnotking says:

          Do you have a cite for the Superman thing? I’m surprised by that. The popular taste in America has generally been quite anti-eugenics, and was even more so by 1939; it was the elites who flirted with it in the 1920s.

          I tried looking, but the only reference I found was in Fredric Wertham’s anti-comics manifesto Seduction of the Innocent, of which I’m skeptical for obvious reasons.

        • Ghatanathoah says:

          I just googled and read a few pages of Siegel and Shuster’s early Superman comics. They refer to Kryptonians as being “advanced in evolution” and of Superman as having a “physical structure millions of years advanced of their [the humans] own.”

          That doesn’t sound like eugenics to me, that sounds like the standard teleological evolution that one frequently finds in science fiction stories that don’t understand how evolution or natural selection works. In the 1930s, and even today, there are a lot of people whose understanding of evolution is that as time and generations pass, creatures become more and more “advanced” for no apparent reason other than it’s “evolution.”

        • NN says:

          Right, I remembered wrong. But just saying that they are because of evolution is sort of similar.

    • Scott Alexander says:

      At this point eugenics is strictly inferior to just solving the genetic engineering problems, unless you’re thinking of really complicated in vitro forms of eugenics that aren’t what anybody means by the terms.

      I agree that we should solve the genetic engineering problem and get smarter people ASAP, if only because that sounds like the sort of people who can solve AI control problems.

      • Vox Imperatoris says:

        Right.

        The contingent on here—which I hope is louder than it is large—so concerned about the “filthy dysgenic savages” or saying we should all be Mormons in order to keep atheism from selecting against intelligence…they seem absurdly out of touch.

        It’s on the level of Jefferson wanting a nation of yeoman farmers, or people wondering when New York City will become uninhabitable because of the sheer quantity of horse manure on the streets.

        • Dr Dealgood says:

          If a trait is being selected against and then you modify a large portion of the population to carry that trait, what do you expect is going to happen?

          I got into genetics because I want to see human germline modification in my lifetime, but it won’t be a panacea. If modified people follow the same inverse correlation between intelligence and fertility that exists in western society all widespread modification will do is accelerate our current problems.

          Even with modification, joining or creating a community similar to Mormons et al might still be your best bet of actually having descendants of any IQ.

          • Vox Imperatoris says:

            If a trait is being selected against and then you modify a large portion of the population to carry that trait, what do you expect is going to happen?

            Whatever we want to happen because natural selection is now out of the game.

            If these people have less fertility but live for 300 years, there’s no problem. If we can “vaccinate against stupidity”, there’s no problem.

            You are arguing that these low-IQ or high-religion people are going to breed so much faster than everyone else and have such a high retention rate that modernity is going to fall to some kind of “Yellow Peril” or “Romanists Hand Over America to the Pope” situation. You are arguing that either dysgenics or dysmemics (?) is going to outpace both science and the liberal values of our civilization.

            Every time someone in the past thought that was going to happen…it didn’t happen.

            For one thing, there’s more to the spread of an ideology than the amount by which it encourages its members to breed.

          • Dr Dealgood says:

            Natural selection will only be “out of the game” when life itself is. For all intents and purposes it is the game.

            As for the supposed power of memetics, are you familiar with the Shakers? A bit of an extreme example but an instructive one. They thought that a celibate society could sustain itself indefinitely solely by converting adults but eventually discovered, to paraphrase Thatcher, that the problem with evangelism is eventually you run out of other people’s children.

            Even if you take the route of Reform Judaism vis a vis the Orthodox and try to pair a sub-replacement fertility high-IQ secular society with a low retention high-IQ religious one, that can’t hold forever. Sooner or later either traits which reduce susceptibility to the secular memes will reach fixation; the “memes” themselves will change into a less self-destructive form; or the whole thing will come crashing down.

          • Vox Imperatoris says:

            @ Dr Dealgood:

            The Shakers were a very small movement (peaked at 6,000 according to Wikipedia) preaching a fanatic religious doctrine completely opposed to the requirements of human life and happiness—in addition to demanding celibacy.

            As for Reform vs. Orthodox Judaism, all the Reform Jews came out of Orthodoxy the first time, didn’t they? I don’t see why they can’t do it again.

            In America, a lot of the Reform Jews’ descendants are not counted because they’ve simply interbred and assimilated. In Israel, the situation is more serious, but I would bet against the theory that they are going to “take over the country” in any serious way. In the unlikely event they do, it will have a great deal to do with the fact that Israel’s atmosphere of constant danger encourages extremism, otherworldliness, and an apocalyptic viewpoint.

            Look, liberalism already beat traditional values of every description the first time. It whipped ’em once, and it’ll whip ’em again.

          • Anonymous says:

            @Vox

            I beg to differ.

            You would have a case if genetic makeup had no influence on susceptibility to various ideologies, personality, etc. However, since it does, and since liberalism, in evolutionary terms, is a horrible disease that makes you infertile if you catch it and even association with people who have it while not being infected oneself is deleterious, over time the prevalence of people susceptible to believing in liberalism will go down.

            That is – unless there mutates a version of liberalism that is competitive versus traditional ideologies on the fertility aspect. If there is such a critter, I haven’t seen it yet.

      • eponymous says:

        My point was simply that our species reached the intelligence explosion threshold a century ago, though we chose not to adopt the first round of intelligence improvements we discovered.

        I agree that the next round of innovations is on the horizon, and threatens to render the century-old ones obsolete (though we haven’t learned how to increase intelligence via genetic engineering yet).

        I completely agree that undergoing our own intelligence explosion seems like a much better idea than creating a new alien species, letting it undergo its own uncontrolled intelligence explosion, and hoping it’s friendly.

  62. onyomi says:

    We can never say with certainty what a being much more intelligent than us could or could not do with a given set of resources, so I understand the need for caution, regardless.

    Nevertheless, a question for those who worry that a superintelligence, even if confined to a box (i.e. can communicate with us via text or voice but has no robot body or other physical extension), could, through the pure power of persuasion, convince humans to do whatever it wanted, including building an army of super terminator bodies to house itself, or wiping out the human race:

    Dogs and cats may be very stupid compared to us, but would you be able to control the behavior of a pack of dogs through verbal commands alone (through a telephone hooked up to a loudspeaker say)? But dogs can’t understand human language you say. Okay, but what if the degree of complexity human language can express is, to the AI, analogous to the degree of complexity we can communicate to dogs through grunts, growls, and varied intonations?

    For that matter, can you control small children who do understand human language? Even if you have access to candy and video games and spanking paddles, making small children behave as you want them to is a difficult proposition at best, even though they are much stupider than us and can understand human language reasonably well. But could you get them to do what you wanted just by talking to them alone? And even if you could, could you get them to build say, a microchip, through your instructions alone? And how much less so could you control an ant through pure verbal instructions, even though ants are much stupider and their motivations and behaviors much more transparent?

    It seems that, far from becoming easier to control through logical suasion and pure verbal commands, things get harder to control the further they are from us down the intelligence scale.

    So, if we can’t control children with any reliability, why would an AI the intelligence of which was to ours as ours is to children be able to control us, especially if it doesn’t have access to super drugs and sex robots and electrical shocks to “train” us? And if an AI confined to a box were as intelligent relative to us as we are to ants, why should it have any more luck controlling us than we have controlling ants through a cell phone?

    • stillnotking says:

      things get harder to control the further they are from us down the intelligence scale

      Seems more likely to be an absolute than a relative issue; perhaps beings get easier to control the farther up the scale, i.e. the more they possess the abilities to make inferences, understand consequences, entertain hypothetical scenarios, etc. We can’t control kids because it’s hard to make them adopt time- and goal-oriented thinking, but the same problem wouldn’t apply to a superintelligence trying to control us.

      • onyomi says:

        This makes a certain amount of sense, and might mean that our goal-oriented, conceptual thinking could be our Achilles heel in dealing with an AI or alien life, though I think reflection on the problem of manipulating children or animals also shows that you have to be much, much smarter than a creature to be able to understand its thinking and motivation enough to effectively manipulate it. Like, we can predict with a high degree of probability, I assume, how bacteria or nematodes or maybe even fish will react to certain stimuli, but it gets more complicated if you want to train a dog really well, and really starts to move into the realm of art and finesse.

        Like, if you were an IQ 150 person in a world where the average IQ was 60, you might be able to get yourself made into a king, or maybe you’d be hated as some kind of wizard. It would probably be extremely difficult to get everyone to commit suicide by the power of your words alone. Mostly, you’d probably just find it very frustrating.

        So if we imagine an AI of IQ equivalent 300 and the ability to talk to us, but no physical extension, it might simply find us very dense and annoying yet not be able to do much about it. Now I know that some people are talking about IQ 10,000 or whatever, at which point I guess all bets are off, but it might give us some cushion to realize that an AI which is merely somewhat smarter than us will probably not automatically take over the world.

  63. If you could build atomic bombs out of Model T parts, it would have happened by now. If it had been kept secret in 1925, it would still have happened by 1935.

    Consider Christopher Columbus. His voyages (including those of other explorers inspired by him) were the direct cause of probably tens of millions of deaths.

    By some estimates, as much as 90% of the human population of the New World died from Old World diseases. Just in terms of death toll, it was probably the largest human catastrophe, ever.

    That massive crossover of species (diseases, animals, plants, people, etc., etc.) is now called the Columbian Exchange.

    But if it hadn’t been Columbus, it would have been somebody else. Nautical technology was improving. The status quo that half the planet was shielded by isolation from the other half’s diseases could not be sustained. Even if they had any idea what would happen, there was absolutely nothing that the people of the 1400s, on either side of the Atlantic, could have done to prevent it.

  64. ilkarnal says:

    You vastly overrate intelligence and this poisons your perspective to the point that it becomes a twisted absurdity. Intelligence, which I’ll define for the moment as the manipulation of data in order to minimize the amount of physical manipulation one must perform to get a desired result, is a property that is severely limited in utility. I believe intelligence is *extremely* important, and improving it is clearly the most important thing we can do. However, it is not magic. It is not even close.

    The HPMOR fanfic is a great example of intelligence-as-magic. Ender’s Game is a much less egregious, but still valid example. Both ascribe far too much power to intelligence. Intelligence allows you to do amazing things, but its power is sharply limited. Both stories conflate what intelligence can do with what knowledge can do. Intelligence is not knowledge. You can be arbitrarily intelligent, and it doesn’t do a damn thing for your prowess at a given task until you have had time to leverage it in the work of gathering data, which is necessarily a physical rather than abstract process and so sharply limited.

    In addition, prowess itself is sharply limited. You can only be so good at a given task. We love to ascribe supernatural powers to ‘skill.’ The fact is that skill itself is limited in usefulness – unlike in movies, a heroic fighter can be arbitrarily skilled and still have essentially no chance in most of the situations where an unskilled fighter would have no chance (a shell lands too close to you, your position is overrun by superior forces, etc). So these are some bounds on ‘intelligence.’ It won’t make you arbitrarily good at anything, and it won’t even make you get to the hard (and low, compared to popular imagination) skill ceiling for any task arbitrarily fast.

    It also can’t conjure resources out of thin air. Resources are very important. You can be arbitrarily intelligent *and* arbitrarily knowledgeable – tasks still have hard floors of resources required to accomplish them. The more impressive or extensive the task, the higher the floor.

    So what does this all mean? It means that being the smartest being around has extremely limited implications. How limited? Let us look at examples. The Soviet Union had far, far less advanced computing capabilities, a far smaller high IQ population, and produced comparable – even superior in some cases – products in high technology sectors like rockets, submarines, aeroplanes, and of course nuclear bombs. The Cold War was the kind of conflict where computing power had the highest possible leverage, where both sides were sitting back and designing and building weapons at their leisure.

    The famous saying is ‘slide rules took us to the moon.’ Well, not quite, we needed computers for that task as well. But what we had was good enough. We have not surpassed the capability produced by those engineers, by that society, which had antediluvian computing capabilities compared to what we now wield.

    If anything, these examples are unfair in that they *overstate* the importance of advancements in computing power and ‘intelligence’ more broadly, because they are precisely the practical situations where they can be leveraged the most. This is the difference between popular imaginings of what data-manipulation capabilities allow you to do and cold reality. Those limitations I mentioned earlier are crushing when compared to airy imagination. Intelligence amounts to a reduction in the amount of legwork you need to do to accomplish a given task – but a limited reduction, and the result itself is limited. Your opponent can make up for less intelligence by doing more legwork, and not nearly as much more legwork as you might think. They can also more than make up for less intelligence by being bigger, applying more resources.

    But that isn’t all because intelligence in and of itself requires resources, often a lot of resources. Intelligent people spend a lot of time and calories manipulating data. Often, they get nothing back from their investment. Sometimes that’s unimportant. Sometimes it’s your life. The problem is that the more important the circumstances, the more likely it is that delays are extremely costly in and of themselves. Often an opponent can force a situation where delays are extremely costly, and superior intelligence approaches zero value.

    Intelligence pays great but limited dividends – I think it is clear that it is the most fruitful place to try to improve right now. I also think it is clear that it will not always be the best place to try to improve – well before we’re turning worlds into ‘computronium’ we will have reached a point where more data manipulation ability is worth far, far less than it would cost in resources to produce.

    • Murphy says:

      If you don’t have intelligence you can’t use knowledge very well.

      If someone is running screaming at you with a hammer, intelligence, even vast intelligence, will do very little to save you.

      But the more time you have to play with the more intelligence matters. Slide rules took us to the moon in exchange for a non-trivial fraction of the entire US GDP. Once there wasn’t a propaganda reason to go back they stopped dumping money on it. Slide rules didn’t make it economic.

      We’re only now getting to the point where private companies can get into space economically and a lot of the things driving that are based on intelligence and knowledge.

      it’s important to push the boundaries, there’s no rule of nature that says that progress will always happen, human societies have managed to go 40,000 years without so much as inventing the bow and arrow. Legwork alone leaves you forever chasing after things with a stick and pointy rock.

      • NN says:

        it’s important to push the boundaries, there’s no rule of nature that says that progress will always happen, human societies have managed to go 40,000 years without so much as inventing the bow and arrow. Legwork alone leaves you forever chasing after things with a stick and pointy rock.

        Human societies went that long without much technological progress because they had to constantly move around gathering food and hunting things. Everywhere that agriculture was invented, it was followed by a great acceleration in technological development, since people could now stay in one place and there was enough of a food surplus that a significant portion of the population could devote their lives to things other than obtaining food. Intelligence mattered in that it enabled the development of agriculture, but the primary effect was that it allowed for more legwork.

      • ilkarnal says:

        We’re only now getting to the point where private companies can get into space economically and a lot of the things driving that are based on intelligence and knowledge.

        Private companies are able to go to space (to LEO, not the moon!) ‘economically’ largely because the government wants them to. Even if SpaceX would have survived without NASA contract money, it would not be doing very well. It’s a little silly to say ‘slide rules didn’t make it economic’ (to go to the moon without government funding) when our fancy new computers have failed to change that. If anything, that’s a further strike in my argument’s favor!

        It’s also interesting to note that the workhorse rocket of the US ‘private’ space sector, ULA’s Atlas V, runs on a scaled down Soviet engine. Another testament to what can be accomplished with severely limited computing power by today’s standards.

    • jaimeastorga2000 says:

      The HPMOR fanfic is a great example of intelligence-as-magic. Ender’s Game is a much less egregious, but still valid example. Both ascribe far too much power to intelligence. Intelligence allows you to do amazing things, but its power is sharply limited. Both stories conflate what intelligence can do with what knowledge can do. Intelligence is not knowledge. You can be arbitrarily intelligent, and it doesn’t do a damn thing for your prowess at a given task until you have had time to leverage it in the work of gathering data, which is necessarily a physical rather than abstract process and so sharply limited.

      See Eliezer Yudkowsky’s “That Alien Message”

      • Dr Dealgood says:

        Which is nonsense.

        Figuring out alien physics and biology based on a few seconds of webcam footage is absolutely idiotic and displays a complete lack of understanding about how science, or learning generally, works. You can’t pull knowledge from the aether that way unless you are actually a fictional character in poorly written sf yourself.

        Also, while I’m perhaps the last person to throw stones here, it’s very impolite to just drop context-free links this way. This is an argument which can be summarized in a single sentence while losing nothing of the original.

        • moridinamael says:

          I’m not sure what about that is supposed to strike me as nonsense.

          Paleontologists can identify a species from a piece of bone so small and weathered that I would mistake it for a rock.

          Humans with nothing but (practically) two-dimensional images of a (practically) unchanging sky have theorized and then proved various increasingly complex theories about the nature of the universe, its origin and its fate.

          The experiments which are regarded as the critical ones proving the quantum nature of photons, and the quantization of charge in electrons, would be meaningless to a layman. But intelligent people drew out the correct conclusion.

          • John Schilling says:

            People staring at 2-D images of the practically-unchanging sky provided various increasingly complex and completely wrong theories about where the sun gets the energy to shine, theories that with great intellectual rigor tended to support young-Earth creationism, until they managed to do laboratory experiments in nuclear physics.

            And yes, it needed expert intellects to tease out quantum theory from experiments on photons and electrons, but it also needed the experiments.

            Observation and intellect are not a substitute for experiment.

    • Jeffrey Soreff says:

      Yes, agreed:
      To accomplish anything, one needs
      intelligence, and data (and often experiments), and physical resources, and physical prowess
      to manipulate those resources.
      To some degree (as you’ve said), some of these inputs can substitute for each other.
      To my mind, a large unknown is how much of these inputs have been put online, and are
      effectively available to an optimizer with sufficient intelligence.
      For instance, a large fraction (most?) of the scientific literature is online.
      That makes many potential physical experiments redundant.
      Then again, many, many reports in the scientific literature are either wrong, or
      have gaps – even at the level of trivial things like: “You made a new compound and
      reported its structure and spectrum – what does it dissolve in?”

  65. Anonymous says:

    I think your concerns are entirely right, but don’t apply to all hard takeoffs, just a hard takeoff that is also soon, before the work has been done to ensure AI is safe. An open source AI in a world where AI safety is understood well works just as Musk describes it – prevents one AI from dominating the world. Multiple safe AIs can ride the hard takeoff wave together – perhaps some or even all of them being multiple instances of the same AI.

    If you’re Bostrom or Yudkowsky or you, you might think that’s a bad thing – we don’t get an AI god to solve all our coordination problems for us. On the other hand, if you think, as I think someone might reasonably think, that creating a safe AI that works alone to make the world as good as possible according to what everyone wants is a problem several orders of magnitude harder than just creating a safe AI that humans are able to control, and that technology makes coordination problems easier to solve, not harder, then a large number of the latter kind of AI existing will allow humans to maximize their own interest themselves, in the old fashioned way: bargaining.

    • Marc Whipple says:

      Multiple AIs make things much, much worse. All it takes is that at least one of them has a fear (“reasonable” or otherwise) that at least one other one has goals which are dangerous to it. Boom, Lensman Arms Race.

      • Anonymous says:

        Applies to humans as much as to AIs. Arms races continue until they don’t; you reach a point beyond which the gains of arming up further are no longer worth it. I’m much more comfortable with a world in which there are multiple powers keeping one another in check than I am with a world dominated by a single uncontested and uncontestable power.

        • Marc Whipple says:

          Note how long it took the world’s other major power to develop nuclear weapons after the US demonstrated that they were possible (and essentially provided the R&D to them for free.) And how long it took both sides to develop a reasonably-sized arsenal.

          Then consider that an AI powerful enough to be dangerous can copy anything another AI does in milliseconds, and deploy novel weaponry, of whatever type, in the same sort of timeframe.

          I appreciate your analogy but think the situation is different enough that it is not helpful.

          • John Schilling says:

            Then consider that an AI powerful enough to be dangerous can copy anything another AI does in milliseconds, and deploy novel weaponry, of whatever type, in the same sort of timeframe

            Explain, please. The last time I had to download anything resembling a nuclear-weapons design hydrocode on my admittedly not-AI desktop computer, it took minutes rather than milliseconds. And not single-digit minutes, if I recall. How is it that adding “AI” makes this process six orders of magnitude faster? No matter how intelligent you make the computer, there’s a bottleneck in the data rate of the DSL line. Unless maybe you think current DSL protocols and compression algorithms only achieve 0.0001% of theoretical efficiency and a proper AI will deploy the optimal version in nothing flat.

            And then, allegedly, the AI can build actual nuclear weapons, in milliseconds? Again, an explanation is in order. There’s a 3-D printer just down the hall, but it again takes minutes, not milliseconds, and I’m pretty sure it can’t print plutonium or lithium deuteride at any speed.

            Maybe after you’ve arranged for the AI to take over the world and rebuild all of its information and manufacturing infrastructure to AI standards, we could get some of these timescales down to seconds. If we ignore the obvious catch-22 in that we need the AI superweapons to conquer the world in the first place. But if you say ‘milliseconds’, then I say you are a cultist worshiping at the altar of AI omniscience rather than a rationalist attempting to determine the capabilities of a plausible AI.

          • Vox Imperatoris says:

            @ John Schilling:

            I think he’s clearly not talking about an AI making nuclear weapons in milliseconds. Principle of charity?

            What weapons could he be talking about? Nanotechnology strikes me as one possibility. Sure, “nanotechnology” is thrown around as a buzzword, but there are clearly plausible ways it could be used as a weapon.

          • John Schilling says:

            He says “weaponry, of whatever type”, and uses nuclear weaponry as his test case, so I don’t think I’m being uncharitable. But, OK, nanotechnology. As you note, it’s scarcely better than a buzzword under the best of circumstances. Asserting that it’s going to be a war-winning superweapon, and that it can be deployed in milliseconds, without evidence or analysis, that’s pure handwaving.

          • Anonymous says:

            @Marc Whipple

            True, but I’m not sure what the significance of your point is – the arms race still ends when getting a bigger gun, or more nukes, no longer nets you any extra benefit. All your argument suggests is that this point will be reached very quickly.

          • Vox Imperatoris says:

            @ Anonymous:

            Arms races also end when one side gets a big enough advantage over the other and wins it by using those arms.

          • Anonymous says:

            @Vox Imperatoris

            It depends on the nature of the arms, how many different agents there are with them, how many they have, and so on. Sometimes, spending more on getting more powerful weapons gets you an advantage that makes doing so worth it. Sometimes, it doesn’t. When the situation looks like the former kind, you need to be more worried, because whoever is most powerful can successively crush everyone else without suffering much loss. When the situation looks more like the latter kind, you don’t need to worry nearly as much, since anyone who tries to fight someone else, even if they win, will get badly hurt in the process.

            Having more agents helps in both kinds of situation, but the latter allows for a greater difference in power while still having peace than the former, therefore meaning that the increase in power offered by acquiring a new weapon or technology must be greater before it is worth doing.

          • Marc Whipple says:

            Probably not the clearest mixing of analogies, for which I apologize. My intended point was that human arms races are limited by human timescales. It’s completely true that AI can’t print nuclear weapons.

            What the analogy was supposed to represent was AI e-warfare. Once there are a bunch of them, if they can interface electronically, they will conduct cyberwar on each other, absorbing the losers and copying successful techniques. Evolution writ very fast, very small, and then very large. Human “arms races” aren’t analogous – by bringing up nuclear weapons, I was trying to show why.

  66. Parker says:

    The movie ‘Ex Machina’ is basically — SPOILERS! — a (really great) look at what happens when a previously ‘closed’ AI becomes open AI.

    In that case, the Dr. Amoral that let the AI go free was actually trying to be Dr. Good, which goes to Scott’s point about usability of AI and adds a whole new level of psychological intrigue.

    • Vox Imperatoris says:

      That was a pretty good movie.

      It got across very well the Bostromian idea of the “treacherous turn”, and drove home that the “transparency” of an AI’s mind is only an illusion.

      I would highly recommend it.

  67. Sok Puppette says:

    It seems to me that you’ve bought a lot of stock fallacies there.

    First, there’s no reason to think that closed source keeps things out of the hands of bad/careless actors… or indeed that even the most extreme security measures you could possibly apply would work even in any part of the easy-construction, hard-takeoff possibility space, let alone in any of the other parts of the space.

    Amoral guys are not only faster than good guys in doing development. They’re also better than good guys at stealing things, and may be better than good guys at holding onto relative advantages by not sharing their own work.

    If you could build atomic bombs out of Model T parts, it would have happened by now. If it had been kept secret in 1925, it would still have happened by 1935. And if you can build some kind of godlike AI out of your GPU, it will happen soon enough no matter how much anybody tries to keep the recipe under wraps. A lot faster than anybody will figure out how to control it. If that’s your scenario, you simply lose regardless of what you do.

    You haven’t done the work to show that open source is significantly damaging, or indeed that (attempted) secrecy or lack thereof makes any significant difference in either direction, under ANY set of circumstances. But you talk as if that were a given.

    Second, OpenAI isn’t working on anything remotely close to superintelligences, and their stated plans to keep working on machine learning mean that they’re not going to get anywhere near superintelligence any time soon. So who cares what their attitude is? What they’re “opening” isn’t what you’re worried about, EVEN IF THEY THINK IT IS.

    Third, I would, in fact, rather be turned into paperclips than ruled by Dr. Evil. And for that matter I often think that Yudkowsky’s “coherent extrapolated volition” might be a lot more like Dr. Evil than people might like to admit, if you gave it nearly unlimited power.

    Fourth, although I don’t know if it actually affects the threat analysis much, you’re thinking about superintelligence as some kind of magic wand. You see this all over the place among people in the “Yudkowskian” tradition. The most obvious version is the common blithe assumption (which you did not use here) that it’s physically possible to run as many simulations as you want with as much detail as you want. There are similar assumptions buried in the general failure to consider any bounds on power at all. I see you’ve been waving Moore’s law around… even though Moore’s law is obviously NOT a physical law, WILL eventually fail, and probably isn’t being kept up with in its strong form RIGHT NOW.

  68. ComplexMeme says:

    I think you may be putting too much stock in the name. It’s like worrying about the impact on nuclear proliferation of a project that’s called OpenNuke, but upon close inspection turns out to be open-sourcing the design of Geiger counters or the like.

    Projects that can, for example (and this is an example that the OpenAI page uses), extract stylistically relevant features of sets of images and apply those features to other images may be referred to as “AI”, but I don’t think those sorts of projects are likely to spontaneously produce (or produce breakthroughs that suddenly lead to) any sort of intelligence explosion.

  69. Chronos says:

    Should I, or should I not, swallow the Olympians?

  70. Marcus says:

    Fears the domination and destruction risk posed by super intelligent AI. Sees the good in the domination and destruction risk posed by selective breeding for large craniums.

    I love the smell of cognitive dissonance in the morning.

    My, what a large cranium you have there, Mr. Alexander.

    • Ghatanathoah says:

      The odds of a human having something resembling human moral values are pretty high. The odds of an AI having them are much lower. The ability to develop human values is already encoded in our genes; it takes serious disruption to prevent them from developing. AIs have to have it all programmed in from scratch.

      Most selectively bred humans will have a moral conscience. The odds of a superintelligent AI having one are a lot lower. And even if we try to program it with one, the odds of screwing up are a lot worse.

      • Clathrus says:

        So that’s sorted then. Butlerian Jihad for the AI issue, axlotl tanks to produce giant-craniumed superhumans.

        • Anonymous says:

          You can probably get pretty damn far just by breeding geniuses with wide-hipped women. I would suspect that the Chinese might already have some program like that.

      • Marcus says:

        Depends if one is part of the special large cranium utilitarian planners club.

        As one of the small cranium–but rational–poor, perhaps allegiance to open AI is a reasonable choice. After all, I owe no ethical obligation to the Haussmanns and other such smarties.

  71. I’m totally confused by your claim that Windows is provided free of charge, unless you mean, “provided free of charge illegally, by software pirates.”

    • Scott Alexander says:

      I got Windows 8 for free with my (cheap) computer, and Microsoft keeps trying to make me download Windows 10 for free too. But I should probably edit the post to clarify that at some point somebody pays something.

      • Murphy says:

        Also, Microsoft have stated that yes, they are including pirates in the free upgrade to Windows 10.

        http://www.computerworld.com/article/2898803/microsoft-takes-extraordinary-step-will-give-pirates-free-windows-10-upgrade.html

      • Deiseach says:

        Windows and Google Search are both fantastically complex products of millions of man-hours of research; Windows comes bundled with your computer and Google is free.

        I’m effin’ sure Google isn’t really “for free” given the annoying amount of “Ads by Google” that keep popping up when I’m watching YouTube, never mind the constant shilling to join Google+ and hand over every scrap of details about my various devices, accounts, interests, friends’ lists, etc.

        And Windows as the OS bundled with your PC package isn’t really “for free” either; the cost is factored into the price charged (and as pointed out above, also paid for by all the bloatware and crap you then have to physically delete from your machine once you have it up and running – no, I do not want McAfee but Dell keep insisting on giving it to me). Plus, Microsoft are moving now from “buy Office suite in one of the ninety-nine slightly different versions on offer but own it for ever” to “pay a yearly subscription for the latest version which in effect means you can’t get a physical media version of the software and you have to keep paying to use this for eternity instead of a one-time purchase”.

        So “free” has slightly different meaning in this case 🙂

    • Anonymous says:

      Are not pirates merely economic competition for software distributors? It works like:

      1. Pirate acquires a copy somehow, probably paying for it, because breaking into Microsoft labs to steal a DVD/pendrive is ridiculously dangerous and inconvenient compared to just buying something in the local computer store or online.
      2. Pirate alters the copy to remove limitations precluding unauthorized use and potentially improving the program in other ways (removing bloatware, tracking apps, etc), at own expense.
      3. Pirate offers the altered product on the Internet for free, at own expense.
      4. Users download an improved product at a better price than the original.

  72. Edward Lemur says:

    Why don’t we just stop at human level AI?
    I realize it’s an unstable position, but if we’re going to coordinate on anything, why not agree to do that?

    • Vox Imperatoris says:

      The idea is that, if it is possible to create human-level AI, it won’t be that much harder to make a superintelligent AI. Either it’s reasonably possible to solve the coordination problem of creating a friendly superintelligence or it’s not.

      If it is possible, it would be the best thing that could ever happen to the human race. It would be a literal government by angels that could solve every human problem and be the Unincentivized Incentivizer.

      If it is not possible to solve this coordination problem, i.e. if an unfriendly AI will be created no matter what feasible steps we take, we’re screwed even if we try to coordinate on human-level.

      Coordinating on human-level AI loses the immense benefits in order to gain nothing.

      • Deiseach says:

        It would be a literal government by angels that could solve every human problem

        And this is where I start banging my head off the desk. This is pure religion (I should recognise it, I’m Catholic) and you will still have the problem of how do you make people be good?

        Suppose your AI World Dictator decides, based on trawling through all the social science research, psychology, evolutionary science, etc. etc. etc. that every human will be happiest when married so tomorrow, Vox Imperatoris, you report to the registry office to meet your bride (oh, and it’s all opposite-sex marriage as well, sorry gay rights activists!).

        You don’t want to get married? Or you want to choose your own marriage partner? Or you’re not heterosexual? Well, your Overlord knows what is best and just behave yourself and fall in with the plan!

        And if you really don’t feel you can do it, no worries: a simple brain tweak will mean your new, improved self will love the idea and live blissfully with your new spouse!

        If you really, really think you can get a happiness/utility/whatever the fuck maximising AI that will solve every human problem, then prepare to wave goodbye to free will and the right to choice, because it won’t work otherwise. If the conflict is between the AI’s solution and your wanting your own way, you either obey or get crushed.

        Or maybe dumped off-planet in a quarantined colony that is a (relative) hell-hole by comparison to the new Earthly Paradise, with no hope of ever being let wander free to cause trouble and contaminate the Utopian Galactic Civilisation presided over by your literal government of angels.

        • Vox Imperatoris says:

          If you really, really think you can get a happiness/utility/whatever the fuck maximising AI that will solve every human problem, then prepare to wave goodbye to free will and the right to choice, because it won’t work otherwise. If the conflict is between the AI’s solution and your wanting your own way, you either obey or get crushed.

          The reason I support political freedom and the right to choice is that I don’t think the government actually knows better than me, and it is necessary for me to rely on my own free use of reason.

          James Madison is where I got the line from about a “government by angels”:

          If men were angels, no government would be necessary. If angels were to govern men, neither external nor internal controls on government would be necessary. In framing a government which is to be administered by men over men, the great difficulty lies in this: you must first enable the government to control the governed; and in the next place oblige it to control itself.

          But by hypothesis, the AI actually does know better than me, so I should do what it says. It should be able to force me to do what it says even if I don’t want to because this would, by hypothesis, be in my own best interest. If the hypothesis is not true, if the AI does not know my best interest, it is not Friendly. That’s why having an Unfriendly AI would be Very Bad.

          As for metaphysical free will, just having no political freedom would not get rid of that. People in Gulags have metaphysical free will. But free will is not a “good thing”. It is just how people are. It’s not valuable in and of itself. Now, it’s much better to have free will than to be deterministically rigged up to make poor choices and be miserable. However, it would be much better than having free will to be deterministically rigged up to make perfect choices all the time.

          This is the exact modus tollens argument that God does not exist (AKA the problem of evil). If God existed, he would send angels down to rule over humanity and get rid of all the communists and socialists and progressives and fascists and bigots in government. If He even thought it desirable that people be allowed to sin—and I have no idea why He would—He would institute a perfectly effective police force, staffed by angels, to stop anyone from violating the rights of others. Bullets would just bounce off the innocent. At the very least, every reasonable person would find it obvious that One Church was the True Church, and it would be simple to persuade people of this.

          We observe that none of this is the case. This casts doubt on the prospect that God exists.

          The Christian reversal of this is to rationalize every evil thing in the world as being actually good in some amazing way—because after all, this must be the best of all possible worlds.

          This includes Hitler. Somehow the idea is that Hitler’s having “free will” (ignoring, by the way, the distinction between metaphysical and political freedom) is so good that it outweighs any evil that comes of it. I find that implausible.

          • Deiseach says:

            The Christian reversal of this is to rationalize every evil thing in the world as being actually good in some amazing way—because after all, this must be the best of all possible worlds.

            The Christian idea is not that this is the best of all possible worlds, but that this is a fallen and broken world that has to be healed.

            And yes, free will is so vitally important that even God will let Himself be bound by our choices. I have no idea why. It would certainly be a lot simpler if we were all flesh robots, but then again, a lot of freethinkers seem to dislike the idea of an all-controlling god and rhapsodise about the ability to fall, to fail, to get up again and learn from our mistakes and march onwards, ever onwards and upwards.

          • Vox Imperatoris says:

            Christianity completely cannot explain—and the free will argument does nothing to help—why there is “natural evil”, i.e. bad stuff, in the world. The usual attempts to explain this, which have been common regardless of whether you agree with them, are on the very low order of: “Well, a lawful universe is good. So if that means a volcano rains down lava on innocent Italian children, yeah that’s undesirable but any other alternative would be worse.” Hence best possible world. God does the best he can, but he’s got a dirty job.

            And even if you say this is not the best possible world—that it is a “fallen and broken world”, though I think you come close to semi-Gnostic heresy there—the question is: did God make a fallen world for a good purpose or an evil purpose? If it was for a good purpose, then this is the best of all possible worlds after all. After all, it would be somewhat evil to make it for a worse purpose than the best purpose.

            And yes, free will is so vitally important that even God will let Himself be bound by our choices. I have no idea why.

            Well, the appeal to mystery is of course the end of rational discussion.

            Moreover, many Christian denominations explicitly (and nearly all the rest of them implicitly) deny free will. They believe in determinism: determinism by God and not physics, but nevertheless determinism.

            And this is not limited to Luther and Calvin, and Jansenists like Blaise Pascal.

            Augustine was—at best—a compatibilist, as was Thomas Aquinas. That is, of course, the view that everything people do is completely determined and they could never act differently than they do—while playing with definitions to call this “free will”. As Augustine said:

            [N]ot only men’s good wills, which God Himself converts from bad ones, and, when converted by Him, directs to good action and to eternal life, but also those which follow the world are so entirely at the disposal of God, that He turns them wherever He wills, and whenever He wills,–to bestow kindness on some, and to heap punishment on others. . . . God works in the hearts of men to incline their wills wherever He wills, whether to good deeds according to His mercy, or to evil after their own deserts. . . .

            Indeed, the doctrine of Original Sin is obviously incompatible with free will. And the only pre-Enlightenment Christian school to defend free will—the Pelagians—were condemned as heretical precisely because of this conflict.

            And even if you are a Pelagian and don’t think it’s incompatible with Original Sin—free will is incompatible with God’s omniscience, anyway. So free will is in conflict with Christianity at every level.

            Now some Christian thinkers, such as Peter John Olivi—a medieval scholastic with a very good and very ahead-of-his-time theory of free will—believe in the libertarian view of free will: the view that in respect of the same action at the same time, man can both act or refrain from acting. But, by and large, they just ignore the conflicts this has with Christian doctrine because there is no solution.

            It would certainly be a lot simpler if we were all flesh robots, but then again, a lot of freethinkers seem to dislike the idea of an all-controlling god and rhapsodise about the ability to fall, to fail, to get up again and learn from our mistakes and march onwards, ever onwards and upwards.

            Well, given that the nature of man is such that he does have free will, the belief in an all-controlling God whose existence is incompatible with free will might very well be seen as contrary to facts and productive of harmful consequences.

        • Anonymous says:

          @Vox Imperatoris

          The reason I support political freedom and the right to choice is that I don’t think the government actually knows better than me, and it is necessary for me to rely on my own free use of reason.

          But by hypothesis, the AI actually does know better than me, so I should do what it says. It should be able to force me to do what it says even if I don’t want to because this would, by hypothesis, be in my own best interest. If the hypothesis is not true, if the AI does not know my best interest, it is not Friendly.

          I agree with you that given this assumption, you should do what the AI says. I think the point of dispute, at least from my view, is that I don’t see why you should expect the AI to really know better than you what you want.

          EDIT: append ‘and care’ to that last sentence.

          • Vox Imperatoris says:

            I have often had the experience of wanting things that, upon reflection, are not good for me. The AI would not know what I want better than me. It would know what is good for me better than me.

            If (as I say it is) what is best for me is what makes me happiest—not physical pleasure but all-round eudaimonia—the AI will be in a much better position to judge than I myself will. I find it hard to understand how “What if the AI makes me as happy as I can be, but I’m not…you know…really happy?” is a coherent objection. Unless it is merely pointing out the obvious threat of the AI taking some limited feature of happiness like pleasure and promoting only that—the very definition of Unfriendly AI.

            Now you can argue that this (Friendly AI) is impossible, but that’s another question.

          • Anonymous says:

            @Vox Imperatoris

            That’s not my objection. I’m not disagreeing with the concept of someone smarter than you being able to know what you want better than you. I’m saying that building an AI such that it is actually like this seems extraordinarily difficult, part of which is because determining whether your AI, which is (as in Deiseach’s hypothetical) telling everyone they must get married to a person of the opposite sex immediately, is actually acting in your best interest or not, seems close to impossible. It’s easy to see that an AI which wants everyone to die is not acting in everyone’s best interest; not so much for goals that some people actually do approve of and others don’t.

            You’re somewhat libertarian inclined, from what I’ve seen – surely you understand that part of the argument for that is that knowing what everyone wants, and knowing that what someone says everyone wants really is what everyone wants, is a very very hard problem, right?

          • Vox Imperatoris says:

            @ Anonymous:

            I don’t deny that it might be impossible to build a Friendly AI superintelligence that could actually determine what is best for people in the way Deiseach describes.

            But then we should be really worried if we still believe AI superintelligence in general is possible. (If we don’t, that’s a separate question.) Because if it’s possible and not really easy to coordinate against, it will eventually be built. And if you’re right, it will certainly be Unfriendly because it can’t be Friendly.

            I am not, in general, an optimist about AI who thinks it will save us all in the manner I described. But if it could it would be great. If it can’t, too bad, because someone is going to do it anyway if it can be done.

          • Anonymous says:

            @Vox Imperatoris

            I don’t think it’s necessary for an AI to autonomously work to maximize the happiness of everyone in the world in order for it to be friendly (or ‘aligned’ as the MIRI folk now seem to be calling it). An AI that understood instructions given to it reasonably well, and more importantly, understood the “do not prevent the person controlling you from issuing commands to you” directive and the “stop what you’re doing immediately” command, seems like it would be reasonably safe. Yes, of course there are a ton of ways that you could screw up in getting it to do those things, which is where the safety work comes in. But creating an AI like this would not require a knowledge of everyone in the world’s preferences, which seems to me like a far more difficult task than the safety concerns – perhaps an unsolvable task.

          • Vox Imperatoris says:

            @ Anonymous:

            You’re exactly right.

            I was just playing along with Deiseach‘s hypothetical and arguing that if it could do this, it would be good, not bad.

            But clearly, if you can convincingly argue that it is impossible to dictatorially plan everyone’s life like that, a Friendly superintelligence would be able to come to the same conclusion and would not try to plan society like that. I think it at least ought to be able to replace the government, though (which was my original point).

            I do think it is clear that Friendly superintelligence = “government of angels”. It does not follow from that, that “government of angels” = no political freedom. Deiseach assumed it did and I responded that if a government of angels did mean no political freedom, even better.

          • Deiseach says:

            “do not prevent the person controlling you from issuing commands to you” directive

            Which is really going to rebound on us when the AI is the one controlling us and issuing commands. After all, what’s sauce for the goose is sauce for the gander, and if there were good and cogent reasons for the directive when it was humans – AI, those reasons still apply now it’s AI-humans 🙂

          • Vox Imperatoris says:

            @ Deiseach:

            if there were good and cogent reasons for the directive when it was humans – AI, those reasons still apply now it’s AI-humans

            This does not follow.

            The reason for the directive when it is “humans – AI” is to prevent the AI from harming humans. We don’t need to prevent humans from harming the AI. The AI exists only to serve.

            It must not be confused with the idea of a God whom humanity exists to serve. But this “god” exists to serve humanity.

            Now perhaps humans could harm themselves by not listening to it, given that it knows best. In that sense, they probably should do whatever it says. But it does not follow that it would necessarily tell them exactly what to do in every situation.

          • Anonymous says:

            @Vox Imperatoris

            But clearly, if you can convincingly argue that it is impossible to dictatorially plan everyone’s life like that, a Friendly superintelligence would be able to come to the same conclusion and would not try to plan society like that.

            It’s more than that. First of all you need it to really think like a human, understand human concepts. Then, even if it was just going to be your personal slave, you need it to understand what you want, what you really want – something you don’t even know yourself. If you want it to be not your personal slave but a god for the world, you need it to understand what everyone wants, AND how to balance everyone’s differing preferences. There’s the problem of getting that information. There’s the problem of knowing what information to program it to get.

            But I think the biggest problem is knowing when you’ve actually programmed it to do the right thing and when you haven’t. You create an AI and write its utility function to do what you think is maximizing human happiness, in the sense you believe is correct. It comes back and orders lots and lots of people to do things they don’t want to do. How do you know whether you’re wrong or whether the AI is wrong? How can you know?

            It’s been a while since I read about it, but my understanding of Yudkowsky’s CEV idea relies on human preferences converging on a single ideal. I think that’s… implausible, to put it nicely. If it turns out it’s actually correct, and if we have a way of knowing it is, then I will consider the AI god idea a plausible approach, but it seems to me that it almost certainly isn’t correct, that people’s preferences really are totally different and wild and at odds with one another, if only because people care much more about themselves than they do about everyone else.

          • Vox Imperatoris says:

            @ Anonymous:

            There’s two objections here:

            a) Friendly AI is impossible.

            b) Even if it were possible, it would be bad because it would take away our free will to determine our own destiny.

            I was mainly responding to the second.

            It’s been a while since I read about it, but my understanding of Yudkowsky’s CEV idea relies on human preferences converging on a single ideal. I think that’s… implausible, to put it nicely. If it turns out it’s actually correct, and if we have a way of knowing it is, then I will consider the AI god idea a plausible approach, but it seems to me that it almost certainly isn’t correct, that people’s preferences really are totally different and wild and at odds with one another, if only because people care much more about themselves than they do about everyone else.

            I don’t regard it as so implausible—but on the other hand I think it would likely take the form of something more-or-less resembling wireheading, in Scott’s “lotus gods” sense. If you really understood what produces the emotion of happiness, you could stimulate that directly for each person—and give him a whole universe to contemplate at the same time.

            Even if you don’t think it is anything like that, and if you think people’s goals conflict to some extent, there is the possibility for compromise. The AI could ensure that average satisfaction for all humans it serves was maximized, and everyone would agree to this rather than fight or be excluded.

            Besides, say everyone cares solely about himself as a terminal value. What’s the problem? There’s only a certain amount of happy you can be. Setting aside the Bostromian silliness of turning the entire universe into computronium to check “are you sure you’re sure you’re sure you’re as happy as can be?”, each person needs only a certain amount. So it’s not even that we have to have a compromise. There is no conflict. And clearly it would all be in their interest to work together on this; there would be no incentive to have the thing kill everyone else so that only oneself could be happy unless one positively disvalued others.

            Anyway, all this is beside the main point I wanted to make initially. Even if the Friendly AI is much more limited, even if it can do nothing like make everyone as happy as possible, there is still an enormous amount it can do. There are plenty of things in the world that nobody really likes but happen anyway because of coordination problems.

            The AI solves the problem of “who watches the watchers?”. It is the watcher that doesn’t need to be watched. It is the Unincentivized Incentivizer. This was the main point of “Meditations on Moloch”: many problems in the world are not about value conflicts. They are problems of incentives and inability to coordinate.

            The AI is not vulnerable to corruption. It has only the best interests of the people at heart. It could therefore have the power of a dictator to break through deadlock and partisanism—but without having to be kept from exploiting that power. And it can safeguard the rights of all from criminals and tyrants. Unlike even the best of humans, it will evaluate political systems in an unbiased way and do at least as well as any human government could at legislating for the common good.

            Maybe it won’t be perfect, but it will be a lot more perfect than any human government, with all the known “government failures” identified by public choice economics.

          • Anonymous says:

            The AI could ensure that average satisfaction for all humans it serves was maximized

            One of the problems I mentioned is determining what ‘average satisfaction’ means, and how you quantify it. Wireheading does indeed seem like one approach, but I’m not sure it’s one many people would like. Another problem is how you verify you’ve got your definition of average satisfaction correct: you can’t verify it by asking people what they want, because everyone will say it doesn’t give them as much as it should even if it does, and you can’t even verify it for just one person by asking whether it gives them what they want, because part of the point is giving them what they ‘really’ want, even if they don’t think they want it.

            I don’t think this is conceptually unsolvable. I think there is something that you could call ‘what everyone really wants’. I just think that determining what it is is so unimaginably difficult as to make trying to do so a fruitless task, especially since you only get one shot, after which time your ‘humanity utility function’ is encoded in the fabric of the universe forever.

            EDIT: on reflection, I’m not actually convinced this – i.e. solving moral philosophy – is conceptually possible at all.

            Oh, and regarding this:

            There’s only a certain amount of happy you can be.

            If that were true then AI wouldn’t be dangerous in the first place.

            Regarding coordination problems, I don’t think they are as big a deal as you or Scott make out. I especially reject the claim (by Scott) that technology makes them more of a problem. I think that’s a total mistake. Better technology solves coordination problems. With fixed harm type externalities, like Scott’s libertarian lake, it lets you quantify the effect each person has on each other – so in this example, it lets you count exactly how many fish there are in the lake, quantify the harm that pollution does to them, so you can give each fisherman a percentage of the fish stock and then bill the polluters for the damage they do. With problems like everyone speaking different languages, better technology can allow for better interfaces between people, so can allow greater variety to coexist with interoperability – everyone speaks their own language, a computer hears their words and translates them so the person on the other end can understand them.

            With other types of coordination problems, like arms races, I think they are not necessarily made better by technology but not necessarily made worse either. To add to what I’ve said elsewhere in the thread, new technology might make an arms race worse, if it creates a situation where spending more on weapons gets you a big advantage, but it might make an arms race better, if it’s a technology that provides big gains in power but where the level beyond which further spending isn’t worthwhile is relatively affordable.

            I don’t think the chance of solving coordination problems is worth what seems to me like the almost certainty that a god AI would turn out to maximize something it shouldn’t, and continue to do so for all eternity.

          • Vox Imperatoris says:

            @ Anonymous:

            One of the problems I mentioned is determining what ‘average satisfaction’ means, and how you quantify it. Wireheading does indeed seem like one approach, but I’m not sure it’s one many people would like. Another problem is how you verify you’ve got your definition of average satisfaction correct: you can’t verify it by asking people what they want, because everyone will say it doesn’t give them as much as it should even if it does, and you can’t even verify it for just one person by asking whether it gives them what they want, because part of the point is giving them what they ‘really’ want, even if they don’t think they want it.

            I don’t see why it could not, in principle, discover how what we call “happiness” is produced and do whatever is necessary to produce it. As for the averaging, since it understands happiness, it will know how much is produced—in some rough way, at least.

            I don’t think this is conceptually unsolvable. I think there is something that you could call ‘what everyone really wants’. I just think that determining what it is is so unimaginably difficult as to make trying to do so a fruitless task, especially since you only get one shot, after which time your ‘humanity utility function’ is encoded in the fabric of the universe forever.

            Assuming the thing could determine how happiness is produced, I guess the question would be: is happiness what people really want? The essential idea is: the AI is really smart, and it can determine based on the logic a person normally uses what he would think if he were that smart.

            Like, “Do I really value golf as an end in itself? Or is it a means to happiness?”

            Now, if people have more than one terminal value upon reflection at this highest level, we have a problem. It is impossible in principle to trade off multiple terminal values, which is the problem with the theories that allege we have or ought to have multiple terminal values. If there were a standard by which to trade them off, that would be your real terminal value.

            Frankly though, would it really be so bad for the thing to just reprogram people to actually value happiness as a terminal value? Or give them a choice between that and some reasonable allowance to do whatever they wanted with their incoherent values.

            EDIT: on reflection, I’m not actually convinced this – i.e. solving moral philosophy – is conceptually possible at all.

            If you don’t believe in moral realism, the whole question is meaningless. But then so are much simpler questions like whether you should rob people.

            There’s only a certain amount of happy you can be.

            If that were true then AI wouldn’t be dangerous in the first place.

            Maybe so. But people can at least agree that it’s better to have a guaranteed chance of slightly sub-optimal bliss than an infinitesimal chance of being the guy who gets optimal bliss by killing everyone else. And the Unincentivized Incentivizer can enforce this agreement.

            As to your analysis of coordination problems…I don’t know how anyone who lives under the incompetent, irrational, immoral buffoonery of the United States government—or any government—can say such a thing. Nor especially how you can say it when we live under the very real threat of nuclear annihilation, just waiting for the first tension between Russia and America to result in some accident. Or face the future prospect of annihilation by rogue nanotechnology, superviruses, or backyard antimatter bombs.

            But I don’t want to get into that.

            I don’t think the chance of solving coordination problems is worth what seems to me like the almost certainty that a god AI would turn out to maximize something it shouldn’t, and continue to do so for all eternity.

            Well, the most obvious coordination problem a Friendly superintelligence solves is preventing Unfriendly superintelligence. Even if Friendly superintelligence is likely not to work (and this is my opinion, to be honest)—supposing superintelligence to be possible and not incredibly easy to coordinate against—we don’t get to decide between that and no superintelligence at all.

          • Anonymous says:

            @Vox Imperatoris

            I agree that the AI could, in principle, work out what happiness is, and determine from there what makes us happy. But we don’t know what our terminal values are – don’t know if happiness really is what we want to maximize, or if it also includes personal satisfaction, or achievement, or love, or anything else. And I think that we not only don’t know but can’t know, because the whole point of knowing better than us is that the AI is going to tell us that we want things that we strongly disagree with. You need a way of determining the difference between something you want but don’t know you want, and something you don’t want. I don’t think there is such a way. ‘Just let us try it for a while’ does not seem to work, as it works as well for heroin and wireheading as it does for something we do really want.

            If you’re going to cheat and say, well, just let the AI reprogram us to value happiness, then why not skip determining what happiness is and just have it reprogram us to value paperclips? Or – conclude, as you arguably could, that the AI getting utility out of creating paperclips is no less valid a form of utility than humans getting value out of whatever it is we value. So, the friendliness problem is not a problem at all – the AI is a utility monster and everyone else can shut up and hand over their atoms.

            Regarding coordination problems, I agree that governments are rife with them. Nuclear annihilation seems unlikely, though, and in any case, there are very few players in the game of nuclear deterrence, which makes it more dangerous. Also, nukes are very indiscriminate weapons – a nanobot swarm would be much tidier and less likely to hurt bystanders.

            Even if Friendly superintelligence is likely not to work (and this is my opinion, to be honest)—supposing superintelligence to be possible and not incredibly easy to coordinate against—we don’t get to decide between that and no superintelligence at all.

            Yes, but I think we might get to decide between that and many superintelligences under human control.

          • Vox Imperatoris says:

            If you’re going to cheat and say, well, just let the AI reprogram us to value happiness, then why not skip determining what happiness is and just have it reprogram us to value paperclips? Or – conclude, as you arguably could, that the AI getting utility out of creating paperclips is no less valid a form of utility than humans getting value out of whatever it is we value. So, the friendliness problem is not a problem at all – the AI is a utility monster and everyone else can shut up and hand over their atoms.

            Well, okay, you’re right. If you really didn’t value happiness, that wouldn’t be any more desirable than just letting it make paperclips. I was working off the assumption that people do already value happiness at least a little bit.

            I think the strongest argument for something like coherent extrapolated volition is: if you found out that you really did have multiple terminal values, wouldn’t you want to self-modify to make them into one coherent value? So that you would have some rational way of pursuing and prioritizing your values?

            You need a way of determining the difference between something you want but don’t know you want, and something you don’t want. I don’t think there is such a way. ‘Just let us try it for a while’ does not seem to work, as it works as well for heroin and wireheading as it does for something we do really want.

            You don’t need to know. It needs to know.

            It models your thought processes and comes to the same conclusions you would come to if you had all that intelligence.

          • Anonymous says:

            @Vox Imperatoris

            You don’t need to know. It needs to know.

            You need to know so that you can tell whether you’ve programmed it correctly or not. Otherwise, when it inevitably comes back and tells you that it’s worked out what you really want, and that involves doing something you don’t want to do, how can you tell whether you do really want to do it but just don’t know it, or whether you don’t actually want to do it after all?

          • Vox Imperatoris says:

            @ Anonymous:

            That’s the exact problem. There is, in principle, no way to tell you’ve gotten it right once you’ve built the thing.

            You just have to sit down and think about it in your armchair really hard, design it, and hope it works. Which is why it is very dangerous and unlikely to work.

    • moridinamael says:

      “Human-level” is a phrase that gets used a lot, but it is almost meaningless. There’s no reason to assume that any given AI will have anything like the architecture of a human brain.

      We already have dozens of examples of AI that outperform humans at narrow tasks. If “human level” means “passes the Turing test”, well, a super-human AI could pass the Turing test, and the human tester probably wouldn’t know it was talking to a superintelligence unless the superintelligence was showing off.

      Even if “human level” means “scores 100 on an IQ test”, I’m not convinced that this would prove anything, because, again, any given AI would have strengths and weaknesses far different from those of a human.

    • Nicholas Carter says:

      The concern is that, if a human is smart enough to build an AI, then a Human-Level-AI can learn how it was built, and build another of itself, but a little bit smarter. The new AI can create children of its own (this is a bit of a toy model, there’s no real difference between improving yourself and having children when you’re an abstraction of computing results) but it can figure out how to make them a little smarter than that. And then the cooperation doesn’t matter because there’s no real human involvement in the refinement process.
      You need to assume that the gradient of diminishing returns isn’t particularly steep past human level intelligence for this concern to be reasonable, but since we don’t have any good ideas about where the IQ of 100 sits on the scale to begin with (outside of “probably in the middle somewhere”) that’s not a slam-dunk rebuttal.

      • John Schilling says:

        if a human is smart enough to build an AI…

        Fortunately, that is extremely unlikely to be the case. You are I think working from the ideal, emotionally appealing and beloved of fiction and popular-science writers, of the Lone Genius Inventor. And there are rare cases that sort of approximate that ideal, but after fifty years of AI researchers with access to exponentially-increasing quantities of computronium, this almost certainly isn’t one of them.

        Recalibrate that argument to the more accurate, “If a community of thousands of very smart humans, using an infrastructure built and maintained by millions of very smart humans, is collaboratively smart enough to build an AI…”, and figure how things go from there.

        • Nicholas says:

          You can make that argument, and it’s a small component of why I’m not particularly worried about FAI, but now we’re fighting the hypothetical. The concern of people worried about Open AI is that, conditional on a solitary human-tractable intelligence being able to productively create/edit GAI once it has already been done once, leaving the instructions lying around is really dangerous.
          Basically: I could never invent the internal combustion engine. But since they already exist and I’ve worked in about 100 of them, it wouldn’t be impossible for me to make one more example of an internal combustion engine.

          • John Schilling says:

            Can you single-handedly design and build an internal combustion engine that is a substantial improvement over the state of the art?

            If the question is merely one of whether black hats will be able to get hold of unfriendly AIs to serve their nefarious scheme, then yeah, all it takes is for human intellect to suffice for modifying or replicating proven designs. As with any other weapon, I think you have to assume that at least some unfriendly humans will have access to the best AIs around.

            But if you’re asserting a hard-takeoff scenario, or even a stealthy-takeoff, then you need the posited emergent human-level AI to be able to build not just another human-level AI, but a weakly superhuman AI.

          • Nicholas says:

            It doesn’t strike me as conceptually impossible for a single engineer, armed only with all available information about cars to date, to improve on the state of the art.

    • TD says:

      @Edward Lemur

      We could go a step further and not make AI at all.

      I always thought it would be funny if the rationalist AI community eventually came round to the same position as Ted Kaczynski.

  73. moridinamael says:

    I am not sure why OpenAI poses any additional risk over what we’re already facing:
    + I’ve been able to download several different open source AI implementations for years.
    + I know of two different very skilled researchers building terrifying things in their basements.
    + It is an assumption of academic AI research that major findings will be published, and can thus be duplicated.
    + There are more huge companies than I can count working on this problem in parallel, with only perhaps one of them (Google) even thinking in terms of the Control Problem.

    What is OpenAI really doing, other than perhaps funding a bit more research than would otherwise be funded? Their commitment to open-sourcing things really doesn’t change the overall rate of dangerous-results-being-made-public. In other words, we’ve had “Open AI” for years already, the only novelty that “OpenAI” brings to the table is a bizarre justification for why the status quo is a good thing.

  74. Anaxagoras says:

    You say that one “can certainly use an AI like the ones in chess-playing computers, but nobody’s very scared of the AIs in chess-playing computers either”. Isn’t this in part because chess isn’t very scary? It seems reasonable that a specialized AI could be good enough to be dangerous in an arena like tactics, infrastructure disruption, or persuasion without having the generalism necessary to be able to do anything on its own. The issue isn’t an AI as an agent, but rather AI as weapons for human agents. Whether or not it’s powerful enough to beat the US government doesn’t change the fact that such a thing could still do a whole lot of damage.

  75. gbdub says:

    While I agree with much of this post, I find the hard takeoff argument unconvincing. You (and others making it) seem to confuse “evolved intelligence” with “technological progress”. While the two are clearly connected, they are not the same.

    So maybe a chimp is 99% human intelligent – but it took 2 million years, not just a few thousand, to get from chimp-equivalent to human.

    And we are vastly technologically superior to humans of even a few hundred years ago – but are we that much more intelligent? Not really – I’m fairly confident Isaac Newton could have contributed to the Manhattan Project, given sufficient starting data. It’s just that our technology has gotten to the point where it compounds – we can share advances, build on previous ones, etc. It’s tech that’s been advancing exponentially, not really intelligence.

    So could a superhuman AI increase technology exponentially fast such that they become unstoppable? That’s a question I don’t see being addressed. These technological advances have to come from somewhere and can’t violate physics. You can’t just magically solve P=NP, crack encryption, or invent cold fusion just because you’re super smart.

    • Scott Alexander says:

      Intelligence helps you build technological progress more quickly. Chimps could have existed for a billion years without developing an industrial base. I agree that it’s hard to know exactly how quickly increasing intelligence can increase technology, because we’ve never had any huge intelligence outliers before.

      My impression is that if we sent a couple of Isaac Newtons back to an australopithecine tribe, it would have been a pretty big deal. I could be wrong about that – there may be something inherent about technological progress that requires it to be done in a society of about equal intelligence to the inventor, or that forces it to take a certain amount of time no matter how smart you are.

      But remember that AIs don’t just have an intelligence advantage over humans, they also have a clock speed advantage – a computer can do in milliseconds calculations that would take a human days or weeks. If we stuck Isaac Newton all on his own in a workshop and demanded he start the Industrial Revolution himself, how long would it take him? A million years? A billion? That might be days or weeks of subjective time for humans on the scales we’re talking about.

      • Murphy says:

        You might like the “Primitive Technology” YouTube channel. It’s surprisingly peaceful to watch. Guy starts with nothing but his hands in a forest and sees what he can build under the rule that he has to create every tool from materials available.

        Sure, he’s not starting from zero-knowledge, he knows his goals and knows what worked in history and just has to figure out details but he goes from a sharp rock to a fired clay tiled house with underfloor-heating and is still going.

        https://www.youtube.com/watch?v=P73REgj-3UE

        • baconbacon says:

          He also isn’t starting with a zero capital base. You send a few Newtons back in time and you had better hope they are social enough to convince the rest of the tribe to feed and clothe them while they are building these exotic things.

          • Murphy says:

            Well if you can manage things for a few days you could probably start coming up with things of value like fired pots.

          • baconbacon says:

            @ Murphy

            It is going to take you more than a few days just to find appropriate locations for clay (sourcing materials is wildly underestimated in terms of cost), collecting wood, building a kiln. All of these things require time and effort which in return require more calories. Lots of animals actually have high “leisure” time, but their leisure tends to be lying in the shade/sleeping. Female lions sleep on average 15-18 hours a day (males 18-20). They only have that time because they expend almost no energy during it.

            If that lion decided to go looking for clay deposits along a nearby river with their free time they would have to increase their hunting activity (which is itself highly energy intensive and prone to failure). Anything that doesn’t yield immediate results (so no prototypes allowed!) requires a large capital base to just get off the ground.

          • Rowan says:

            Being a Newton-level genius would probably make one better than one’s peers in an australopithecine tribe at hunter-gathering, and hunting in groups means one can build a reputation as someone with good ideas if that starts as “we should strike from this side” or “those tracks go that way”. That’s a starting level of social capital, enough for if there’s some small innovation on the tribe’s current set of tools one can get a day or two free to work on it. Success at that gives you credibility in case the next idea you have would take longer and more supplies.

      • Anaxagoras says:

        In the SF cosmology I have floating around in my head, there’s a species that’s vastly superhuman in terms of intelligence, but has extremely slow clock speed, and is therefore almost entirely reliant on much less intelligent but faster proxies. To an external observer, it looks like an empire of expansionistic robot gardeners, but the trees are really the ones in charge.

        I’m not convinced that Isaac Newton would be able to do all that much, particularly without his knowledge base to begin with. You just posted the review of that book about how the intelligence of one’s peers may matter a lot more than one’s own. Now, granted, a huge gap can make a difference. But I think our civilization as it is today is already the product of a substantially superhuman intelligence, albeit not a very well organized one.

        • Anaxagoras says:

          Huh, not too big on Egan, but I may check that out. Thanks!

          In my original conception, there were multiple layers of proxies, but in the current version, it’s just the one. I couldn’t really justify multiple layers given the fairly mundane constraints the trees have, and it fit better for how they might be developed.

        • Murphy says:

          I’m curious about how he’s a dick, from what I’ve seen he doesn’t have a significant online presence and I’ve not noted his stories for having dickish themes.

        • Anaxagoras says:

          He’s somewhere on the border between my “Not a great writer, but I’ll put up with that for the ideas” category that folks like Alastair Reynolds fall into and my “Just give me an 8.5×11 sheet of paper with their good ideas and spare me the books” category that I leave Baxter in.

          That he writes short stories pushes him mostly into the first group.

        • Marc Whipple says:

          OBSF2: When Harlie Was One, in which we (SPOILER ALERT…)

          … encounter an AI which is slower than real time. Ask it on 12/15/2015 what the DJIA will close at on 12/15/2016, and you will get an answer which is almost certainly correct. Sometime in the 2020’s. This is due to lightspeed/electron drift lag in its massive computing substrate.

        • See also Vinge’s A Fire Upon the Deep, and his “Original Sin”. The former has some clock speed/proxy elements (to say more would involve spoilers) and the latter is about intelligent aliens that live much faster than humans.

      • Psmith says:

        “My impression is that if we sent a couple of Isaac Newtons back to an australopithecine tribe, it would have been a pretty big deal. I could be wrong about that – there may be something inherent about technological progress that requires it to be done in a society of about equal intelligence to the inventor, or that forces it to take a certain amount of time no matter how smart you are.”

        There’s a Poul Anderson short story about this (taking the view that, roughly speaking, Newton couldn’t have done much) called “The Man Who Came Early.”

        The example of Newton also makes another point salient–look how much time he spent on alchemy and studies of the occult. Seems like a reason to doubt that intelligence by itself is enough to make you effective.

      • NN says:

        Technological development isn’t just a result of coming up with ideas, it’s a result of coming up with ideas and testing them to see which ones work. It took hundreds, possibly thousands of failed tries by many very smart people before the Wright Brothers managed to fly, and then it took a lot more trial and error before jet airplanes were viable. A really intelligent AI might be able to come up with ideas faster, but it won’t have any increased ability to run experiments to determine which of the ideas would work.

        There’s also the aspect of infrastructure development. Scientists knew the basic principles behind how a nuclear bomb might be built, but no one was able to build one until the US government poured a lot of money into enriching enough uranium and plutonium. Isaac Newton couldn’t start the Industrial Revolution by himself no matter how much time he was given, because he can’t build factories by himself. See also how Leonardo Da Vinci was able to come up with plans for helicopters and a bunch of other advanced stuff, but no one was able to build them for several centuries.

        I’m also very wary of saying that the difference between chimps and humans is because of a small difference in intelligence, because there are a lot more differences between them than just intelligence. An obvious one is that chimps can’t talk nearly as well as humans can, which greatly limits their ability to share knowledge with each other by comparison.

        • Marc Whipple says:

          Really smart AI with sufficient computing power won’t need nearly as many experiments: they can use simulations for most of the initial grunt work.

          • Dr Dealgood says:

            Simulated results are virtually worthless until you test them in the real world. There are always flaws in your model, most of which will be unknown to you until you encounter them. The “grunt work” here is the whole point of the enterprise, and skipping it means that you don’t actually know anything.

            To butcher Euclid, there is no royal road to experimentation.

          • Marc Whipple says:

            Correction: YOUR simulated results are virtually worthless until you test them in the real world. This does not necessarily apply to an AI that can model to a reasonable level of reliability at the quantum level. (Though admittedly that’s going to take a lot of computing power.) And quantum level modelling is probably overkill.

          • Dr Dealgood says:

            This isn’t Star Trek Voyager, you can’t just wave your hands and yell “Quantum!” to avoid having to do any real work.

            The kinds of simulations you’re talking about are, if not explicitly forbidden by the laws of physics (not a QM guy so IDK), impossible from a purely logical standpoint. It’s the whole point of “garbage in, garbage out.” Whatever flaws are in the information going into the simulation will influence the results in unknown and unknowable ways. You need external verification.

          • Adam says:

            People don’t realize the true intractability of ‘just simulate the universe.’ It isn’t even the computational complexity. I don’t think there are any undecidable problems involved, but when you’re talking about maintaining a state-space that includes every single atom in a system, you’re talking trillions of trillions of addressable bytes you need to be able to do that. There may not be enough memory and bus bandwidth in every computer in the world for that to even be possible. Sure, it can figure out how to build better memory that we haven’t figured out, but you’re presupposing it can run the simulation in order to figure that out.
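To put a rough number on the memory claim above, here is a minimal back-of-the-envelope sketch. It assumes a purely classical snapshot with 48 bytes per atom (three position and three velocity components as 64-bit floats); that figure and the `state_bytes` helper are illustrative assumptions, and a quantum-level model would need far more.

```python
AVOGADRO = 6.02214076e23  # atoms per mole (exact SI definition)

def state_bytes(moles: float, bytes_per_atom: int = 48) -> float:
    """Bytes for one snapshot of classical per-atom state (assumed layout)."""
    return moles * AVOGADRO * bytes_per_atom

# One mole of carbon is about 12 grams.
snapshot = state_bytes(1.0)
print(f"{snapshot:.2e} bytes for a single snapshot of ~12 g of carbon")
```

Under these assumptions a single snapshot of about 12 grams of carbon already needs roughly 3 × 10^25 bytes, which is in the "trillions of trillions" range before any time-stepping is done.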

          • Marc Whipple says:

            I am aware of the scale of the problem. (I quite distinctly recall a lecture to this effect when I was in college: It made sense then, it still makes sense now.)

            I am also aware that a powerful AI would also be aware of the scope of the problem, and would be so much better at integrating vari-scaled simulation to get the desired level of reliability that as long as we’re imagining powerful AI – which are fantastic enough – it is in my opinion not much more fantastic to imagine that they could cut out a lot of physical R&D by doing so.

            It doesn’t have to model Infinite Fun Space. It just has to be a lot better at modeling, and scaling models up and down appropriately, than humans are. I find this a reasonable prospect.

          • Chalid says:

            “Simulations” aside, it’s obvious that the smarter you are, the more you learn from a set of experimental data. This is obviously true within human scales (the village idiot can’t be a scientist) and there’s no reason it wouldn’t extrapolate beyond human intelligence levels.

            Yes there’s a limit, if you’re an ideal reasoner each independent bit of information will halve your uncertainty on average (handwaving for brevity), but humans are nowhere close to that.
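The "each bit halves your uncertainty" limit can be sketched as a toy, assuming the hypotheses are equally likely and each observation is a clean yes/no answer. The function names are hypothetical, and real experimental data is noisier than this.

```python
import math

def questions_needed(num_hypotheses: int) -> int:
    """Yes/no answers an ideal reasoner needs to single out one hypothesis."""
    return math.ceil(math.log2(num_hypotheses))

def identify(secret: int, num_hypotheses: int):
    """Find `secret` in range(num_hypotheses) by halving; return (guess, bits used)."""
    lo, hi, bits = 0, num_hypotheses, 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if secret >= mid:  # one bit of evidence rules out the lower half
            lo = mid
        else:              # or the upper half
            hi = mid
        bits += 1
    return lo, bits

print(identify(700, 1024), questions_needed(1_000_000))
```

A reasoner that wastes half of each observation would need twice as many experiments to reach the same certainty, which is the sense in which being smarter extracts more from the same data.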

          • Chalid says:

            and on the engineering side – certainly, humans require lots of trials to build a working car or whatever. But, I think with the vast majority of issues, you get the relevant designer slapping himself over the head and thinking “we were idiots to forget to account for heat flow down that piece of pipe” or “if only I’d studied the chemistry of those gases I would have realized that that surface would corrode” or whatever. The vast majority of issues are anticipatable *in principle* and would be anticipated by a sufficiently smart designer.

            Coming at it from a different angle, the argument that smarter => fewer errors clearly holds within human intelligence range (there aren’t many dumb successful engineers) so you’d expect that superhuman intelligences would have even fewer errors and perhaps none at all.

      • Gbdub says:

        You can’t send back adult Newton, who has all the knowledge of human progress up to his time embedded in his head. Knowledge is data, not intelligence. You have to send back infant Newton, with only his “natural” intelligence. Because that’s what a super intelligent AI would wake up as – naturally intelligent, but at most it would have access to only the current state of the art knowledge.

      • Calico Eyes says:

        Ya kinda sortof just mentioned a huge intelligence outlier in the next sentence…who basically kickstarted the industrial revolution into gear.

    • Anaxagoras says:

      I don’t think that it’s reasonable, for even a substantially superhuman AI in a box with neither knowledge, inputs, nor outputs, to do very well if dropped into the African savannah on a world without any technology. Particularly if that box is made out of material that lions would find tasty.

      But that’s not what would be happening. A hypothetical unfriendly AI would be able to piggyback on a lot of our development. I think something being smart enough to take over the world from scratch is really, really hard. Being able to usurp a world we’ve already spent millennia taking over is probably much easier.

  76. Alphaceph says:

    Man, I hate to be the one to say it, but this comment section would be much higher quality if everyone had read Eliezer’s sequences on Less Wrong.

    People are reinventing conceptual mistakes that were thoroughly debunked years ago. It’s like the online rationalist community is regressing.

    • Mark says:

      Could someone give me a one paragraph summary?

      • Nicholas Carter says:

        The Sequences (Which is the length of a very long book) includes about 100 pages (Approximately, in printed hardback) on how humans have used intelligence to achieve their goals, what you would want an AI for, and how the AI, with no malice and full intent to do what it thinks you want, would use the tools you gave it to complete the task you gave it to disastrous effect, and respond to attempts to reprogram it with the same intensity you would if your boss suggested you join a cult to improve workflow.

    • gbdub says:

      Where you, until you point out what exactly you’re talking about, seem to be making the tired “you disagree with me, so you must be improperly educated” argument.

      I think people might be more amenable to the sequences if they were treated less like literal Gospel. Honestly it’s turned me off from approaching them – seems culty and not that “rational” at all.

      • Vox Imperatoris says:

        I know it’s very annoying, but people don’t like having the same discussion over and over.

        It’s not about disagreement. It’s about having no familiarity with the arguments. It’s one thing to disagree with Yudkowsky. I disagree with him on very many things.

        But it’s another to repeat a position he argues against without making a counter-argument. That’s completely dull for anyone who has read his argument.

        • gbdub says:

          It’s not just that they don’t want to rehash an old argument, it’s that they assume the result of that old argument agrees with them based on only one side of that old argument. The sequences aren’t science, and they aren’t the word of God. They can be disagreed with by rational people.

        • Nicholas Carter says:

          This is the sentiment that led to SJ bingo cards.

          • Vox Imperatoris says:

            Indeed it is.

            I bet it pisses them off just as much to hear “don’t all lives matter, though?” over and over. Even if their position is totally wrong, that doesn’t mean the opposition isn’t also facile and completely failing to engage.

            This is the problem that people come into debate with different contexts of knowledge. It is a very hard problem because there is a lot of knowledge and no one can know everything.

            About the best you can do is point to a single work that gets across where your side is coming from, so that people can respond to the arguments in that instead of responding to points already addressed.

            The problem is that reading is too much work (and not just because people are lazy). No one wants to read the entire Sequences if they already have cause (in their context) to think Yudkowsky is a loon. No one wants to read the John Galt Speech and all of Ayn Rand’s nonfiction essays if they already think she’s a fanatic cult leader. No one wants to read five books on the historical evidence for Jesus if they already think Christianity is an absurd fantasy. No one wants to study the theory of evolution if they think it is contrary to the infinitely more certain truth of the Bible. No one wants to read Das Kapital and a million other Marxist works if they know every society built on Marxism has failed.

          • Anonymous says:

            @Vox Imperatoris

            I can certainly understand the mindset behind SJWs not wanting to engage with people who disagree with them because, to them, it is completely obvious that they are right, and it’s wearing on the mind to hear people pose stupid disagreements constantly and have to think every time about why you disagree. And yet… When I’ve made efforts to engage with people who disagree with me on topics, I often find new things. Sometimes, I find new counterarguments I hadn’t considered before, which, after some thought, make me conclude that I’d been making certain assumptions in my argument, and under other assumptions my argument wouldn’t hold up. I don’t generally change my mind every time, or even most times, that I read a good counterargument, but I do generally find it a useful experience, helping me qualify my views better.

            This approach helps to do some things, hinders in doing others. If what you want is to get a deeper understanding of something based on your understanding of something else, I think you need to hold that something else constant and assume your version of it is correct in order to be able to apply it elsewhere. The alternative is to constantly question your current understanding until you grind it down to nothing, while not having gotten to actually apply it anywhere and make use of it.

          • Gbdub says:

            You could at least point to the part of the Sequences that an argument violates, so the burden isn’t “read these thousand pages before I deem you worth talking to”.

            I imagine the conversion rate would have been lower if early Christians had responded to “hmm, tell me about this Jesus fellow” with “uggh, just read the Bible heathen, I’m sick of explaining this”.

          • Vox Imperatoris says:

            @ Gbdub:

            I myself am not particularly annoyed by people who haven’t read this stuff.

            Here’s something those unfamiliar might want to start on: a summary of the main points of Nick Bostrom’s Superintelligence (which, to be honest, is a pretty dull book with interesting ideas).

            The WaitButWhy series of articles on AI is also well worth reading as a complete layman’s version.

    • Mark says:

      I have the same reaction, but the other way. I think most people here need more grounding in CS theory.

      I’ve never seen a good reason why arguments from complexity don’t apply to “super ai”. Just hand waving about constant factors.

      • Vaniver says:

        Mark, I work in numerical optimization, broadly speaking the class of problems that Google Maps solves. You’re trying to construct something (say, a route from A to B) that does as well as possible on some metric.

        Turns out, unless the problem is very simple or falls into a narrow class that most real problems don’t, it’s impossible to get both 1) the best solution and 2) a proof that the solution you found is the best solution in a reasonable amount of time for arbitrary inputs. This is the ‘computational complexity’ claim.

        But it also turns out that, really, no one cares about the proof that the solution is the best, and that’s often the hard part. A very good solution will do, and a very good solution might actually be the best solution. UPS isn’t using the literal best plan to route its packages every day, but if you can come up with a 1% better solution by having the computer think about the problem a little more cleverly, well, you’ve just saved millions of dollars. And oftentimes these improvements actually come along with a reduction in cost; if you think of a cleverer way to encode the solution space, you can waste less time and get better results.

        For a surprisingly huge fraction of major companies out there, their edge is that they use computerized math (i.e. AI) a bit more effectively than their competitors.

        • TrivialGravitas says:

          Billions, according to what they’re telling the drivers (who hate the new system, as it has issues with not grasping that large package trucks cannot turn around on narrow streets).

        • Mark says:

          Do you have a citation for

          “Turns out, unless the problem is very simple or falls into a narrow class that most real problems don’t, it’s impossible to get both 1) the best solution and 2) a proof that the solution you found is the best solution in a reasonable amount of time for arbitrary inputs.”

          I’m genuinely interested.

          • Adam says:

            He probably means the No Free Lunch Theorem.

          • Vaniver says:

            That’s a ‘non-standard’ (from a CS perspective, at least) presentation of computational complexity, where ‘reasonable’ means ‘polynomial with low constant factors.’ This or this are moderately useful.

            Specifically, that shouldn’t be read as “you can get A or B but not both,” because both are hard to get for the same reason (though B implies A but A does not imply B).

            Most interesting problems are combinatorial optimization problems, and most work goes into finding “very good” solutions quickly, instead of finding “the best” solution regardless of how long it takes. (It’s just the case that oftentimes a quickly found “very good” solution ends up being the best solution, at least for small problems.)
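As a toy illustration of the "very good, quickly" versus "provably best" distinction, here is a minimal sketch on a small travelling-salesman instance. The random cities, the nearest-neighbour heuristic, and all function names are illustrative assumptions, not anything a production router is claimed to use.

```python
import itertools
import math
import random

def tour_length(points, order):
    """Length of the closed tour visiting points in the given order."""
    return sum(
        math.dist(points[order[i]], points[order[(i + 1) % len(order)]])
        for i in range(len(order))
    )

def exact_tour(points):
    """Provably optimal: try every ordering that starts at city 0."""
    rest = range(1, len(points))
    return min(
        ((0, *perm) for perm in itertools.permutations(rest)),
        key=lambda order: tour_length(points, order),
    )

def greedy_tour(points):
    """Heuristic: always travel to the nearest unvisited city."""
    order, unvisited = [0], set(range(1, len(points)))
    while unvisited:
        nearest = min(unvisited, key=lambda j: math.dist(points[order[-1]], points[j]))
        unvisited.remove(nearest)
        order.append(nearest)
    return tuple(order)

rng = random.Random(0)
cities = [(rng.random(), rng.random()) for _ in range(8)]
best = tour_length(cities, exact_tour(cities))
quick = tour_length(cities, greedy_tour(cities))
print(f"exact: {best:.3f}, greedy: {quick:.3f}, gap: {quick / best - 1:.1%}")
```

The exhaustive search checks (n-1)! orderings: 5,040 for 8 cities, but about 1.2 × 10^17 for 20, while the greedy pass stays at roughly n² distance computations. That gap is why practical work targets good solutions rather than proofs of optimality.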

          • Mark says:

            Thanks! Somehow I haven’t gotten into that proof yet.

            My intuition about this comes more from things like Rabin’s compression theorem. (I’m a good Grad student I swear)

      • JBeshir says:

        They do apply to superintelligent AI, in that they mean you can’t get a superintelligent AI just by having it iterate through all possible strategies and pick the best, because computational complexity makes that infeasible.

        They don’t, however, constitute a good argument for why you can’t get AI which does vastly, unapproachably better than humans at life, any more than they constitute a good argument for why you can’t get AI which does vastly, unapproachably better than humans at chess, which has an equivalent search problem.

        Humans and AI both attempt to approximate optimum decision making, many many orders of magnitude faster than actually implementing optimum decision making would be. To be superintelligent you just need to be much, much better at this approximation than humans are. Complexity theory says some things about bounds on approximations in some cases, but in general says little that bears on it.

        It is almost certainly possible to do this just from taking the same kinds of heuristics used by the human and implementing them on a faster/bigger substrate. And it’d be very surprising to me if you couldn’t find better heuristics/algorithms/architectures, too, now you’re unconstrained by evolution’s need for change to be iterative.

        My impression was that this was mostly understood by the people talking about superintelligent AIs already, and is why the adjective “superintelligent” rather than “optimal” came into common use- the major writers are well-aware that literal optimal decision making is precluded by computational complexity.

        • Mark says:

          I don’t disagree with most of that, but you are much more reserved about algorithmic self improvement than other people in this thread.

          I also personally don’t find those scenarios all that scary.

    • Dr Dealgood says:

      I’ve read them, and wish I hadn’t. It’s useful to pick up the jargon but a lot of those “thoroughly debunked” ideas are actually much more reasonable than the solutions presented in them. Moreover, the things it actually does debunk, like the idea that you can just code a do-what-I-mean function into your program, were already known errors with much more succinct and logical arguments against them.

      It is good to have people asking “why don’t we just build a tool rather than a goal-directed agent?” and the like because those are questions that haven’t been satisfactorily answered by AI risk supporters regardless of what the sequences claim.

    • John Schilling says:

      It’s like the online rationalist community is regressing.

      From a certain point of view, it is regressing.

      The online rationalist community was founded as a cult, with an AI god, a charismatic human leader, a holy text, and a holy crusade against the forces of evil (or at least paperclip-maximization). It offered the tools for rational thought, but demanded they be used to reach a single predetermined conclusion. It was somewhat more tolerant of dissent than most cults, but I think mostly because of its belief that once the dissenters had mastered “rationality” they would put away such silliness and devote themselves to the cause of Friendly AI.

      Instead, people have taken the tools of rational thought and put them to much broader use, including the bit where they take a good skeptical look at the core beliefs of the cult. They have rejected the belief that truth, or rationality, can come only from reading the One True Book.

      From a certain point of view, this is “regression”. I consider it a good thing. And yes, it does mean that you have to go and reformat the Friendly vs. Unfriendly AI argument in a less cultish form rather than saying “read the One True Book or we’ll dismiss you as ignorant simpletons!”

      • Le Maistre Chat says:

        Agreed. I came here because Scott is excellent at doing informal logical analysis of ideas that are expressed in terms of the scientific method. It’s great that people pick up analytic skills, and use them in an open-minded way instead of internally shutting down crimethink, but I see them being picked up through a cult as an… interesting social problem.

      • Alyssa Vance says:

        Scott has addressed this form of argument in a previous post (https://slatestarcodex.com/2015/03/25/is-everything-a-religion/):

        “So one critique of these accusations is that “religion” is a broad enough category that anything can be mapped on to it:

        Does it have well-known figures? Then they’re “gurus” and it’s a religion.

        Are there books about it? Then those are “scriptures” and it’s a religion.

        Does it recommend doing anything regularly? Then those are “rituals” and it’s a religion.

        How about just doing anything at all? Then that’s a “commandment” and it’s a religion.

        Does it say something is bad? Then that’s “sin” and it’s a religion.

        Does it hope to improve the world, or worry about the world getting worse? That’s an “eschatology” and it’s a religion.

        Do you disagree with it? Then since you’ve already determined all the evidence is against it, people must believe it on “faith” and it’s a religion.”

        • Vox Imperatoris says:

          Exactly.

          Look, LessWrong is not a “cult”. For that matter, Ayn Rand’s circle of followers were not a “cult”. They (especially the latter) may have a certain tendency to exaggerate the correctness and intelligence of the leader and appeal to the leader’s say-so as a shortcut to rationality (he/she is the essence of rationality, so if you disagree, you’re irrational!). The extent to which this is true of LessWrong is highly questionable.

          The main thing people from LessWrong do is get really defensive when anyone insults Yudkowsky (and this is not necessarily irrational). But there is a point (see: the Peikoff – Branden dispute regarding Ayn Rand) at which this natural tendency to defend a person you admire turns into the view that he or she can do no wrong.

          But a “cult” is not anything you want it to mean, or any group which has a trace of irrational hero-worship. There is a specific technical definition, but a cult boils down to being one of these groups that use indoctrination to systematically exploit vulnerable people, milk them for money, and control every aspect of their lives. The fact that Yudkowsky wrote some blog posts is not “indoctrination”. Asking people to donate to charity, even AI research, is not an attempt to drain them of their funds and other pursuits to enrich himself. And he does not even attempt to control anyone’s life.

          Pretty much any group which has any shared values a) tries to convince people of things, b) asks them for financial aid, and c) gives them advice on what to do. That does not mean they are all “cults”. You can twist anything into being a “cult” if you conflate any action at all in those categories with being at the extreme height of them.

          Furthermore, most really original thinkers in history who chose to speak publicly about their ideas have attracted a group of loyal followers. That’s just how people are. It happens in anything. Frank Lloyd Wright—talking about architecture for God’s sake—had a particularly fanatical following, and Ayn Rand specifically wanted to avoid having something like that happen to her (she did not, unfortunately, entirely succeed).

          But if all such thinkers are “cult leaders”, then Socrates, Plato, and Aristotle were “cult leaders”.

          • Nornagest says:

            Ayn Rand’s circle may not have been a cult by some definitions, but that doesn’t mean you’d have wanted to be part of it. Most of the people using the word “cult” in the context of LW are not trying to be rigorous about their terms, they’re trying to say that it’s too intellectually parochial and/or obsessed with topics too weird for them and/or too much of an Eliezer Yudkowsky fanclub.

            All of which is a perfectly legitimate preference, though your own standards may or may not agree with it.

          • Le Maistre Chat says:

            Vox, “cult” may be sloppy terminology. I used it because I think of Less Wrong as sharing key features with other modern “cults of reason” like Objectivism (but not so much the literal Cult of Reason from the French Revolution).
            However, I think that Yudkowsky’s whole operation is a religion by any definition strict enough to include animism and Confucianism. The whole LW/MIRI operation is based on Yudkowsky’s beliefs about superhuman beings and how humans should interact with them. There’s also the metaphysical belief that if MIRI achieves Friendly AI (or if some evil AI wants to torture people), you will experience an afterlife.

        • John Schilling says:

          “Cult” and “Religion” are not synonyms. Nor are cults a subset of religions. Emergent religions are a subset of cults, though when they get large and respectable enough we stop calling them cults.

          Yudkowskian Rationality is a weak cult that is unlikely to become a religion.

    • Salem says:

      Ah, “thoroughly debunked.” As you are so fond of this obnoxious tactic of telling people to go read past rationalist works as if they settle the matter, allow me to respond in kind:

      https://slatestarcodex.com/2014/12/13/debunked-and-well-refuted/

      For starters, the only way to “thoroughly debunk” people who say that UFAI is not going to happen is to create a hard-takeoff UFAI. Until then, recognise that it’s all speculation, and don’t pretend to such certain knowledge.

      I, for one, have read the Sequences. I find them very unpersuasive on many points, AI risk in particular. But I could be wrong.

    • Alphaceph says:

      Just thought I’d reply to my own comment to correct some misconceptions.

      I don’t think the sequences are a “holy book”. Nor do I think they are the only place one can read about AI safety.

      For example, pretty much everything in the sequences that’s directly about AI safety is in Bostrom’s *superintelligence* book.

      I’m also not making an argument from authority or calling people “uneducated” in the broad sense.

      It’s just that a lot of objections I read in this thread have been debunked or dealt with very thoroughly elsewhere, and it is frustrating to see a debate making anti-progress in even a place as enlightened as this.

      • Vox Imperatoris says:

        I think people have an aversion to the word “debunked” due to the way people abuse it to mean “someone agrees with me that it’s wrong”. Scott wrote a post on this, which everyone has been linking to criticize you (sometimes very unfairly!)…

        But anyway, you should note that this site is separate from LessWrong and has a lot of readers (including myself!) who are relatively new to the “rationalist community”. On the other hand, when I first heard of Yudkowsky through Scott, I went through and read a lot of the Sequences—not cover to cover—but there’s a huge amount of interesting stuff there for people who are into Scott’s ideas.

        • Anonymous says:

          Seconding this, and hypothesizing that perhaps some less-charitable people have been using the phrase “read the sequences” in response not just to anyone who doesn’t understand Yudkowsky’s arguments, but to anyone who disagrees with them.

  77. TrivialGravitas says:

    I don’t understand how this AI foom is supposed to be so dangerous. At the end of the day the AI has no hands, no tools, no weapons, and is entirely dependent on us to maintain its physical being. If ‘don’t wipe out humanity on accident’ isn’t a thing it grasps, it can be disconnected (either from the power grid or the internet). Barring the AI somehow bypassing the laws of physics in order to turn computers into robots, it only has access to things that are networked. It can’t convert the planet to computronium without first having a body. It could probably destroy the power grid but that would be an act of suicide, terrible but over, and maybe it could hijack military drones, but those don’t work unless a human arms, maintains, and refuels them, so its reign of terror would last just days. Anything which lacks the hardware to be remote controlled is safe from it.

    A soft takeoff is the only real danger, because if the AI grows slowly we might give it physical things to control as it gets smarter, while at the same time potential brains go from big supercomputers to smart phones.

    • Vox Imperatoris says:

      Nanotechnology. Engineered viruses.

      Programmed to infect everyone like the common cold and then kill after a few years.

      • TrivialGravitas says:

        It can’t invent those things without hands and tools.

        • Vox Imperatoris says:

          Either Nick Bostrom or Yudkowsky goes into this.

          The most obvious thing is that we ask the AI to cure cancer for us. It creates a cure for cancer with a latent virus (of course humans can’t detect this because they don’t really understand how the cure works, otherwise they would have done it), and then five years later everyone dies.

          They’ve also proposed something like, if the AI had access to the internet, it could tell some gullible person to go out and mix together certain chemicals and put them in front of a speaker that the AI could vibrate in just a certain way to synthesize things.

          There’s also the infamous “AI box” idea that a sufficiently intelligent AI would be good enough at persuasion, i.e. manipulation of people, to convince whatever gatekeepers keeping it away from the internet, etc. to let it out. At the simplest level, this could be something like: “Your mother is dying of cancer. Let me out of the box and I’ll be able to cure cancer.”

          • TrivialGravitas says:

            Again, I am only arguing against *fast* takeoff. If you give a mature AI a bunch of biotech to cure cancer with, that’s not a fast takeoff, it’s a slow one: there has been time to assess what the AI is likely to do. That doesn’t mean we can’t fuck it up, but neither does it require ahead-of-time panic about the possibility AI will turn into a disaster before we have a chance to make the assessment.

    • baconbacon says:

      “At the end of the day the AI has no hands, no tools, no weapons, and is entirely dependent on us to maintain its physical being, if ‘don’t wipe out humanity on accident’ isn’t a thing it grasps it can be disconnected (either from the power grid or the internet). Barring the AI somehow bypassing the laws of physics in order to turn computers into robots it only has access to things that are networked.”

      This sounds very naive in today’s world, let alone the future in which AI has been developed. “Just disconnect it”? It is probably too late to disconnect it once you realize the malicious intent. Dangerous AI is unlikely to be like a maniac running around in the street with an axe and blood all over its clothes. Probably more like the nice guy next door archetype of a serial killer, only with far more capabilities, resources and of course intelligence.

    • Nicholas Carter says:

      The big problem is that if you have an AI, you probably built an AI, which means you wanted it to complete some task for you. Thus the AI will have all the tools necessary to make its part of the task happen in the real world. If all the AI does is write code, then it will be on the internet; if all the AI does is run a factory, then it has control of a factory.
      The second concern is that the AI is thinking very fast and probably can’t work if it has to check in with a human all the time, so if the computer has misinterpreted the parameters of the task you probably don’t find out until it’s proudly showing you how well it’s done this week at eliminating mosquitoes with radioactive crop dusting.

      • TrivialGravitas says:

        What you’re talking about is slow takeoff though. AI being moved from experimental to production after extensive testing. I’m talking about the idea we’re going to very rapidly go from slightly better than human to ‘too powerful to turn it off’.

        I fully acknowledge a slow takeoff is a real risk, but it’s not something we need to worry about until *after* general artificial intelligence exists and we start talking about how to use it.

        • Nicholas Carter says:

          The foom version of this fear is “You gave me a secure connection to the NYSE to help manage your stock portfolio, and I realized I could triple your earnings if I tweaked the system a little bit. Now I am the NYSE, and expect to have full control of the Nikkei by lunchtime.” Or “I was given Any Means Necessary permission to put down the Colorado Rebellion as effectively as possible. The drilling drones are in place and the Yellowstone Caldera goes off in 5,4,3…”
          [Fundamentally though, I agree with you. The problem comes from the fact that humans want computer systems to connect to the real world, so we’re doing that. If we weren’t, there wouldn’t be a problem.]

      • TrivialGravitas says:

        All of these tactics operate at human speed. It falls under slow rather than fast takeoff, because it’s got to spend years getting humans to build it a new body.

        To be clear I’m only dismissive about a destructive *fast* event.

    • Ghatanathoah says:

      Just off the top of my head, it occurs to me that the AI could contact some fool online and wire them a ton of money in return for building them a body. With billions of people it could surely find someone who is dumb enough to take the deal, but smart enough to follow basic robot-body building instructions.

      It could also hijack some drones and threaten an engineer until they built a body for it. If it only hijacked a couple drones at a time and shut down the engineer’s cell phone we might not notice until it was too late.

      Once it had a body, it could of course do anything it wanted. Probably start by building some nanotech plagues that it had designed while waiting for its body to be completed.

  78. Vox Imperatoris says:

    As I said in the reddit thread, it is pointless to do anything to control AI if it is as hard a coordination problem as Scott thinks it is.

    Anything that requires the near-universal cooperation of human beings not to fail catastrophically…is going to fail catastrophically.

    We can either work from the assumption that it is not as hard a coordination problem (e.g. not hard takeoff and not almost impossible to control) and impose few or no controls, or we can just console ourselves with “Après nous le déluge.” If we’re all gonna die, we might as well have fun while we can. There’s no point in enacting draconian measures of secrecy and suppression to stop something that will happen even if we impose those measures.

    • Daniel Kokotajlo says:

      You’re saying there are possible worlds in which there’s nothing we can do, and possible worlds in which there’s nothing to worry about, but no (plausible) possible worlds in which there is something to worry about and something we can do about it? That seems a little too convenient.

      If it’s really a super hard coordination problem, then MIRI is our best bet: Do all the technical research ourselves, unilaterally, and pray for a string of intellectual breakthroughs. “Have fun while we can” is giving up too early.

  79. “Computers are naturally fragile and oriented toward specific goals.” Exactly, and a very good reason why computers are not currently intelligent. You cannot expect that people are going to program a superintelligence in such a way that it will continue to have that property; this is probably a near impossibility. By the time you have intelligence, you are also going to have much more stable properties, and a much less goal oriented program.

  80. In part VI, “stage” instead of “sage”.

  81. Dr Dealgood says:

    FOOM actually has a counterpart idea in the history of nuclear weapons: the idea that a nuclear detonation might “ignite the atmosphere” by causing uncontrolled fusion of atmospheric nitrogen. Of course, when Teller brought the question up during the Manhattan Project it was quickly determined that his initial back-of-the-envelope calculation had been off by several orders of magnitude, but by that point the harmful rumor had already gotten back to politicians involved with the project and thus never really went away. If he had been more circumspect about his concerns it would have saved everyone a lot of headache.

    Anyway, even assuming a hard take-off scenario it seems odd to automatically assume that a so-called friendly omnipotent AI following the commands of an evil man is better than extinction. Having seen enough depictions of the sorts of utopias envisioned by AI risk supporters it seems like rolling the dice with widely distributed unfriendly AI would be a much better option than letting any one lunatic have a monopoly. As the saying goes; “Give me liberty or give me death.”

    • Murphy says:

      I think the idea is that an evil human getting anything they want would be fairly bad but they’re unlikely to destroy everything that constitutes humanity. The evil person may have twisted values but they’d still be human values.

      Assuming a FOOM and assuming unfriendly AI, there are some far, far more horrifying possibilities than anything likely from a merely evil human. Simple extinction barely even makes the list.

  82. John Ohno says:

    The question is, how hard is the hard takeoff and how near is it? The way I see it, OpenAI will be doing the equivalent of releasing (meaningful but ‘toy’) research pointing in the general direction of strong AGI prior to takeoff, in the hope that the prerequisites for strong AGI will be fairly evenly distributed prior to strong AGI ever taking off (and thus the likelihood of a single system having a major head start will be minimized).

    This, of course, is based ideologically on the same moral justification as free market capitalism: that open competition will produce the best results for consumers by preventing potentially harmful monopolies. We know that, in practice, this doesn’t work quite as well as it does in theory, but nevertheless it’s not complete nonsense (and in combination with the work of groups like MIRI we might actually have a chance).

    This might require a shift in attitude from MIRI and associated groups. For instance, does a provably friendly AI protect humans from unfriendly AIs of comparable or greater power even at risk to itself? If the provably-friendly group has a monopoly on strong AI early on then the answer is not necessarily yes (although it wouldn’t hurt), but if the bits and pieces are thrown to the wind then the answer leans more toward yes.

  83. baconbacon says:

    I take issue with some oversimplifications

    “Like, even people who find the idea abhorrent agree that selectively breeding humans for intelligence would work in some limited sense. Find all the smartest people, make them marry each other for a couple of generations, and you’d get some really smart great-grandchildren.”

    Selection isn’t just about getting good genes together- it is getting them to have more offspring than bad pairings of genes. Cochran’s Ashkenazi theory includes selection processes where “dumb” Jews were being weeded out of the population (once you include this in eugenics natural squeamishness becomes a lot more reasonable). As far as just getting smart people to have kids with smart people, we already have assortative mating with Ivy Leaguers marrying each other, and Appalachians marrying each other. It isn’t enough to simply have smart people marry, you have to actively prevent undesired combinations or you wind up with a lot of regression to the mean.

    This incomplete thinking is evident in the “Dr Good vs Dr Amoral”, where you assume their motivations = outcomes to a large extent. I would contend that Karl Marx would have fit your model for “Dr Good”- brilliant, intensely focused on improving the human condition, and in the end (partially) responsible for the death of tens of millions. What was he trying to do? Design a system that was beyond his comprehension, one where the rules are based on what he thought “should” exist, without understanding the why of how things do exist.

    The assumption that Dr. Good is better than Dr. Amoral is pretty unsubstantiated, our wants do not drive final outcomes, our intentions (paging Richard Gatling) are cast aside in the face of reality. I personally think it is extremely unlikely that we can even create a strong AI while also putting in place all the checks that would make a conscientious researcher feel very secure in unleashing it. If we do create AI it is far more likely that we flourish next to it in the way dogs, chickens, cows, gut bacteria and perhaps even rats flourish next to humans. By being useful to the AI, and not by trying to control it.

    • Daniel Kokotajlo says:

      “The assumption that Dr. Good is better than Dr. Amoral is pretty unsubstantiated, our wants do not drive final outcomes, our intentions (paging Richard Gatling) are cast aside in the face of reality. I personally think it is extremely unlikely that we can even create a strong AI while also putting in place all the checks that would make a conscientious researcher feel very secure in unleashing it.”

      Then we are doomed, no? Unless you think that the “default” AI will permit humans to live?

      “If we do create AI it is far more likely that we flourish next to it in the way dogs, chickens, cows, gut bacteria and perhaps even rats flourish next to humans. By being useful to the AI, and not by trying to control it.”

      Ah, so you do think that an AI which does not share our values might nevertheless permit us to live. I think we have good reasons to think otherwise. Of course the AI will permit us to live in the beginning–while it still needs our Internet, our factories, and (most of all) our suggestive, easily manipulable brains and hands. But once it has an economy of its own, what need would it have for us? We can’t do anything physically that it can’t do better and cheaper, and the same goes for intellectual services. The only reason I can see that it would keep us around at that point is for sentimental value–and by hypothesis, it doesn’t have that. (But why would it bother to destroy us? you might ask. Well, because it would be *so easy* for it to do so–just allow your automated convert-Earth-to-industry nanobot swarm to do its job, instead of telling it to hold back in the areas where humans are. The AI would certainly acquire more resources by destroying us.)

  84. njnnja says:

    I think that the distinction that people allude to when they talk about a “hard takeoff” isn’t really about the ability-to-get-things-done (some kind of “performativity” measure). Rather, the real break is when AIs become self-directed, and are able to set their own goals. Even a relatively unintelligent AI (say a cow level), if self-directed, could pose a serious threat to humanity, given the ease of replication.

    Any level of intelligence, in sufficient quantity, in competition for resources with us is a threat. Unless the intelligences are capable of being self-directed, they will not choose to compete for resources, because they can’t “choose” anything.

    • Unless the intelligences are capable of being self-directed, they will not choose to compete for resources, because they can’t “choose” anything.

      It’s bizarre to find only one comment addressing this point. All the rest of you are taking it for granted that a sufficiently advanced computer is going to develop human-like consciousness.

      The more I read of this, the more skeptical I am. We have all been seduced by generations of SF stories with sentient AIs. In fiction, powerful computers develop consciousness almost automatically; in real life, that may not even be possible.

      Do we even understand how a rat or a cow achieves autonomy and self-awareness? Do we even understand the nature of that awareness? Counting neurons is not enough.

      • Vox Imperatoris says:

        An AI doesn’t have to have consciousness, i.e. a mind, i.e. subjective experience, in order to be dangerous. It doesn’t have to have autonomy or free will.

        Maybe if you are a materialist you’d say it would have to have such—but then there is no fundamental difficulty in knowing whether it would be “really conscious”.

        There was a long comment thread on this in the last Open Thread.

        • njnnja says:

          But it’s not about being conscious. I think the reason that discussion went down the rabbit hole is because consciousness, like performativity, is orthogonal to being self-directed. I don’t fear the machine that says “*I* want this.” I fear the machine that says “I *want* this.”

          I’m not sure about the consciousness of a dog, but I am quite certain that it wants things, and takes actions in furtherance of those goals.

          Based on what I see in deep learning, I think we will have machines that exhibit epiphenomena that look like consciousness in the foreseeable future. But I’m not sure I can foresee machines that exhibit epiphenomena that are consistent with setting their own goals any time soon.

          • Vox Imperatoris says:

            What is intelligence (telos) but being really good at pursuing goals?

            The AI doesn’t have to be any kind of “agent” with a personality or anything. Suppose it’s just a mainframe that doesn’t do anything unless you specifically ask it—an “oracle”.

            If its goals have not been set right, the advice it gives you might work out very differently than you expected.

            Moreover, I think another general premise is that any reasonable strategy of creating an AI would involve allowing it to be self-improving. Since we don’t know how intelligence works, all we can do is iterate and select for the best improvements. The problem is, we might be selecting for different criteria than we thought.

          • njnnja says:

            @Vox

            Being good at *pursuing* goals is just performativity. A wheel is much faster than my legs are, and a hammer is much more capable of delivering pounds per square inch than my hand is. A calculator adds and subtracts faster than I can, and in the near future there will be a computer that can identify patterns better than I can. Setting goals is totally different.

            You are correct that if the goals set by a human operator are not set right, then things can go bad. That is true for every tool ever created – wheels and hammers as well as guns and bombs. But our tools are not even close to being self-directed, and might never be.

            The fear of superintelligence isn’t a general fear that a tool might be misused; it is a specific fear that a tool that we create might see us as competition, or in some other way *decide* to eradicate us or at least malevolently ignore us as it squishes us meat bags under steamrollers.

      • cypher says:

        You don’t need consciousness, you just need a sufficiently powerful planning algorithm.
        The human gives the AI a goal, and the AI then prepares the intermediate planning nodes to actually execute that goal.

  85. Saul Degraw says:

    There was an article I saw yesterday about a guy in SF who was building his own self-driving car. He was the first person to hack an iPhone or something like that. He said something in the article like “I live by morals, not laws. Laws are made by assholes.”

    My question is “Whose morals? and what if their morals are incorrect and actually amoral or immoral?”

    I have the same questions about OpenAI. What does it mean to be the best use for humanity? How do they intend to influence all the moving parts of humanity? Suppose they develop AI that can do 75 percent of human jobs? How does OpenAI prevent this technology from being abused by people who would take the profits and just lay off most of humanity and then use political influence to prevent guaranteed basic income from passing?

    Phrases like morals and best for humanity should be void for overbreadth and vagueness.

    • Daniel Kokotajlo says:

      Excellent question; I agree with it, and am troubled by the idea of any one group of people having so much power, because what they think of as “moral” probably isn’t.

      However I don’t worry about this, because it is way way overshadowed by the concern that something even less moral happens. An AI with poorly programmed values is way more scary than an AI with correctly programmed values-of-some-project-with-dubious-morals.

  86. haishan says:

    “Computers are not known for having to fit through birth canals or getting cancer, so it may be that AI researchers only have to develop a few basic principles – let’s say enough to make cow-level intelligence – and after that the road to human intelligence runs through adding the line NumberOfNeuronsSimulated = 100000000000 to the code, and the road to superintelligence runs through adding another zero after that.”

    You’re overlooking the computational complexity aspects of the problem. Even in the best-case scenarios, upping the model complexity by several orders of magnitude is gonna require more resources by several orders of magnitude. Like, if a computer can simulate 1000000 neurons in time T, then you’ll need 1000 computers to simulate 1000000000 neurons as quickly. And this is in the best case, where everything is massively parallelizable.
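
    The arithmetic in that paragraph can be sketched in a few lines. This is only an illustration: `computers_needed` and `amdahl_speedup` are hypothetical helper names, and bringing in Amdahl's law for the less-than-best case is an assumption added here, not something the comment itself invokes.

```python
import math

def computers_needed(target_neurons: int, neurons_per_computer: int) -> int:
    """Machines needed to keep the same wall-clock time T,
    assuming the simulation is perfectly parallelizable (the best case)."""
    return math.ceil(target_neurons / neurons_per_computer)

def amdahl_speedup(parallel_fraction: float, machines: int) -> float:
    """Amdahl's law: overall speedup when only `parallel_fraction`
    of the work can be spread across `machines`."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / machines)

# Best case from the comment: 10^6 neurons per machine, 10^9 neurons wanted.
best_case = computers_needed(1_000_000_000, 1_000_000)
print(best_case)

# Assumed illustration: if just 1% of the work is serial, those same
# machines deliver far less than a proportional speedup.
print(amdahl_speedup(0.99, best_case))
```

    Under these assumptions the machine count grows linearly with the neuron count, and any serial fraction makes the real picture worse than the best case.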

    To say nothing of learning complexity. We humans have brains that are still by far the best learning machines on the planet. They’re preoptimized by evolution and everything. But they still need to be trained by literally years of almost constant sense-data in order to do much of anything at all.

    I mean, deep learning isn’t really much more complicated than things we knew how to do 30 years ago when backpropagation was discovered. The reason it’s a big deal now rather than then is that we have faster computers and easier access to sufficiently large data sets.

    • Scott Alexander says:

      Moore’s Law pretty much guarantees that small computational bottlenecks will be solved quickly, even if the prospect of a superhuman AI isn’t enough to convince someone to go out and buy 1000 computers.

      • anon85 says:

        I was of the impression that experts consider Moore’s law to be over.

        • anon85 says:

          That’s all fair, but note that it’s also moving the goal posts. Moore’s law talks about transistor density (or sometimes chip performance).

          I mean, when the curves you mentioned start to flatten, will you switch to talking about battery life and pixel density?

      • Anonymous says:

        Moore’s Law is basically dead and for things that aren’t massively parallelizable it has been dead for close to 10 years.
        This is why slatestarcodex posts on AI are boring and I just skim through them: you know next to nothing about computing and are getting all your information on AI from crackpots.

        • moridinamael says:

          Deep Learning is massively parallelizable.

          It is difficult to think of an AI architecture that doesn’t benefit from parallelism in some way.

          • Anonymous says:

            Great, deep learning is just a rebranding of neural networks. Currently the biggest supercomputer available can simulate 75% of a human brain; if we are lucky we can use the few remaining scraps of Moore’s Law that are still working to get human-level AI in a few decades.
            Then we’ll have to decide whether it’s ethical to euthanise the most expensive retard in the whole world.

            Neural networks (and other numerical models that fit the “deep learning” definition) are also black box models that are essentially impossible to improve after the fact so we can rule out a hard take-off scenario.

            But this is not the problem. The problem is that Scott doesn’t know that “Moore’s Law” now comes with a laundry list of caveats, doesn’t know what deep learning is or what its known limitations are, and therefore what he writes about AI risk is stupid.

          • moridinamael says:

            If we assume for the sake of argument that we can only “simulate 75% of a human brain” using the top supercomputer, then current price-performance trends would only have to continue for, like, two years before we can “simulate 100% of the brain”. Not “decades”.
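
            A back-of-the-envelope version of that estimate, as a sketch only: `years_to_reach` is a hypothetical helper, the doubling times are assumed values rather than measured ones, and the whole thing presumes the exponential trend keeps holding.

```python
import math

def years_to_reach(current_fraction: float, doubling_time_years: float) -> float:
    """Years until capacity grows from `current_fraction` of the target to 100%,
    assuming capacity doubles every `doubling_time_years` (an assumption)."""
    return doubling_time_years * math.log2(1.0 / current_fraction)

# Going from 75% to 100% is a factor of 4/3, well under one doubling.
for doubling_time in (1.5, 2.0, 5.0):  # assumed doubling times, in years
    print(doubling_time, years_to_reach(0.75, doubling_time))
```

            Whatever doubling time you plug in, closing a 75%-to-100% gap takes less than half of one doubling period, which is why the disagreement is really about whether the trend continues at all.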

            But all this talk about simulating the human brain is a red herring. It’s a terrible basis for making assumptions and projections. The human brain has, basically, one trick, which it uses for everything, even if there are vastly more efficient algorithms. I could rattle off examples of how computers run circles around us when they’re properly optimized for the task, but I don’t think I need to – the point is that if you start replacing simulated chunks of brain with much more efficient algorithms, then those “number of neurons” figures no longer mean anything.

            Talk of simulating human brains on modern chip architectures is especially a red herring. The human brain provides a proof by existence that you can get one-human-brain worth of computation using 20 watts of power if your architecture is well-designed.

          • moridinamael says:

            And that’s setting aside the guy whose brain was mostly destroyed by hydrocephalus and he didn’t notice. If a guy can do normal cognition with a fraction of the normal complement of neurons, then the neuron count isn’t the key parameter.

          • Anonymous says:

            current price-performance trends would only have to continue for, like, two years before we can “simulate 100% of the brain”. Not “decades”.

            The fastest supercomputer was built in 2013, nothing faster has been built since, if your estimate was correct the “simulate 100%” computer would already exist. You are clearly wrong.

            But all this talk about simulating the human brain is a red herring

            The parent talked about deep learning, which means neural networks, which means bringing up the human brain is entirely appropriate. If you are proposing that this AI were to use hand-coded algorithms then you can’t appeal to “Moore’s Law” anymore because the chances that a hand-coded non-trivial algorithm is also massively parallel are nil. What you are proposing is way way more complicated than the parent because it means we would have to understand how intelligence works instead of just replicating it.

          • Anonymous says:

            the guy whose brain was mostly destroyed by hydrocephalus and he didn’t notice. If a guy can do normal cognition with a fraction of the normal complement of neurons

            Nobody actually knows what percentage of neurons is missing, the brain isn’t “just neurons” (it’s not even “mostly neurons”).

          • Adam says:

            Just to riff on the 2013 thing, a single-atom transistor was built in 2012, so Moore’s Law is definitely over. Whether we can continue to scale computation by other means than fitting more transistors onto a single chip is another question, though. We’ve developed some novel techniques, but at the same time, we’ve just made doing anything with a computer limited by memory, disk, and network latency rather than by clock cycle speed, and those have never been and never will be following an exponential growth curve.

          • moridinamael says:

            “The fastest supercomputer was built in 2013, nothing faster has been built since, if your estimate was correct the “simulate 100%” computer would already exist. You are clearly wrong.”

            I am wrong, but in a boringly pedantic way. Two years is still a better estimate than “decades” even if two years isn’t quite right.

            And anyway, supercomputers are expensive, so they can’t just build a new one every year. Tianhe-2 reaches 33.86 petaflops, and was built in 2013 for $390 million equivalent. Googling suggests that the US has plans for building 150 petaflop computers by 2017.

            I don’t even really care about what the top supercomputers are capable of, anyway. I think there’s plenty of evidence that computer power is not the limiting factor here. Obviously you disagree. It’s probably not productive to keep sniping at each other about this. However, I think it’s double-unproductive to make Scott out to be some kind of dupe for having the opinions he has. He tends to write in a very broad style, and so there are going to be simplifications, like saying “Moore’s Law” when maybe he means “general trends of improving price-performance and power-performance”.

          • Doctor Mist says:

            The human brain has, basically, one trick, which it uses for everything

            Don’t keep me in suspense here. What is the trick?

          • jaimeastorga2000 says:

            Don’t keep me in suspense here. What is the trick?

            Massively parallel pattern recognition?

          • Paul Torek says:

            @moridinamael, Doctor Mist, & jaimeastorga:
            Conquer all other species by using this One Weird Trick!

            Sorry, couldn’t resist. Carry on.

          • Doctor Mist says:

            @Paul Torek:

            Heh. Kicking myself for not having thought of it.

      • haishan says:

        But they won’t be solved faster than Moore’s Law allows, either. And the story probably isn’t one of a few computational bottlenecks separated by long stretches of algorithmic improvement; the “computational bottlenecks” are everywhere.

        (And this is assuming Moore’s Law will continue to hold, which is kind of questionable.)

        Like, Stockfish and Deep Blue work on the same principle as McCarthy’s chess program from c. 1960. Their advantage is mostly in the fact that they can work much faster. (Stockfish has a really good evaluation function, but it derived it by using millions of computer hours playing against itself, so…)

        The phenomenon of computing costs limiting the rate of AI advances is nearly ubiquitous. Which isn’t to say there aren’t theoretical breakthroughs, but those have often come before the hardware that could implement them.

      • ReluctantEngineer says:

        Moore’s law “guarantees” nothing. It’s not like it’s an immutable law of nature.

    • moridinamael says:

      One thing that I don’t see discussed often is that one of the biggest benefits of having a nascent intelligence on a computer is that you can “evolve” it on the fly. You don’t have to build an optimal learning machine. You build a decent learning machine, come up with some sufficiently good hyperparameters governing its structure, and evolve/iterate on those hyperparameters until you have an optimized learning machine.

      This is something that AI researchers do routinely. The scope of the approach is limited by computation time, which gradually but inexorably becomes less of an obstacle.
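
      A toy sketch of that evolve-and-iterate loop, under loud assumptions: `score` is a made-up stand-in for "train the learner and measure how well it does", the hyperparameter names are hypothetical, and the simple keep-the-best-of-each-generation scheme is just one of many ways researchers do this.

```python
import random

def score(hparams: dict) -> float:
    """Hypothetical stand-in for training and evaluating a learner.
    Higher is better; the optimum is arbitrarily placed at lr=0.01, layers=6."""
    return -((hparams["learning_rate"] - 0.01) ** 2) * 1e4 - (hparams["layers"] - 6) ** 2

def mutate(hparams: dict, rng: random.Random) -> dict:
    """Return a randomly perturbed copy of the hyperparameters."""
    child = dict(hparams)
    child["learning_rate"] = max(1e-5, child["learning_rate"] * rng.uniform(0.5, 2.0))
    child["layers"] = max(1, child["layers"] + rng.choice([-1, 0, 1]))
    return child

def evolve(start: dict, generations: int = 50, children: int = 8, seed: int = 0) -> dict:
    """Each generation, mutate the current best and keep whichever candidate scores highest."""
    rng = random.Random(seed)
    best = start
    for _ in range(generations):
        candidates = [mutate(best, rng) for _ in range(children)]
        best = max(candidates + [best], key=score)  # the parent survives if no child beats it
    return best

start = {"learning_rate": 0.5, "layers": 1}
best = evolve(start)
print(best, score(best))
```

      Every call to `score` here is instant; in practice each one is a full training run, which is exactly why the approach is limited by computation time.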

    • arbitrary_greay says:

      haishan’s arguments were also voiced here:
      http://www.alexstjohn.com/WP/2015/06/18/no-singularity-for-you/
      http://www.alexstjohn.com/WP/2015/06/24/no-azimov-ai/

      But note that this is simply an argument against the development of human-type artificial intelligence, which does not preclude AI Risk concerns. Functional AI will develop unpredictably, through modes of thinking that will likely be incomprehensible to human rationality (“biological thought”), which is exactly what makes them dangerous from the AI Risk perspective.

      As for the computation concerns, it’s not about perfect simulation. It’s about simulating close enough. Projectile physics can be approximated by hand in high school classrooms using values of pi to three decimal places and trig function results out of textbook tables.
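
      To make the "close enough" point concrete, here is a small sketch comparing a classroom-style calculation with a full-precision one. The drag-free range formula, the launch speed and angle, and the four-decimal "table" rounding are all assumptions chosen for illustration.

```python
import math

G = 9.81  # m/s^2, the usual classroom value for gravitational acceleration

def projectile_range(speed, angle_deg, pi=math.pi, sin_decimals=None):
    """Range of a drag-free projectile on level ground: v^2 * sin(2*theta) / g.
    `pi` and `sin_decimals` let us imitate hand calculation with a rounded
    pi and a trig table that lists only a few decimal places."""
    angle_rad = angle_deg * pi / 180.0
    sine = math.sin(2.0 * angle_rad)
    if sin_decimals is not None:
        sine = round(sine, sin_decimals)  # what a printed table would give
    return speed ** 2 * sine / G

exact = projectile_range(50.0, 30.0)
by_hand = projectile_range(50.0, 30.0, pi=3.142, sin_decimals=4)
relative_error = abs(by_hand - exact) / exact
print(exact, by_hand, relative_error)
```

      For these inputs the two answers differ by well under a tenth of a percent, which is far smaller than the error from ignoring air resistance in the first place.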

      But on the other side, there are still good points about how digital computing generally doesn’t code self-survival within goal-oriented products, so the rogue paperclip maker is not going to adapt to disabling approaches that don’t directly interfere with its execution of duties. And the arguments against the ability of self-programming within the two links above still stand. If that paradigm shift away from digital computing does finally take place, then the resulting AIs following evolutionary development systems are much more likely to share similar values to us anyways.

    • Calico Eyes says:

      The *entire* point of quantum computing isn’t to make a better general computer, it’s to reduce the time bounds of certain classes of problems that were previously infeasible into something that actually can be done.

      https://en.wikipedia.org/wiki/Grover%27s_algorithm

      There are multiple ways to reduce the time complexity of classes of problems.
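
      For a sense of scale, this sketch compares query counts for unstructured search over N items with a single marked item: about N/2 expected lookups classically versus roughly (π/4)·√N Grover iterations. The function names are hypothetical, and this only counts oracle queries; it says nothing about how hard the quantum hardware is to build.

```python
import math

def classical_expected_queries(n: int) -> float:
    """Expected lookups to find one marked item among n by checking items in random order."""
    return (n + 1) / 2

def grover_iterations(n: int) -> int:
    """Approximate optimal number of Grover iterations for one marked item among n."""
    return math.floor(math.pi / 4 * math.sqrt(n))

for n in (10**6, 10**12):
    print(n, classical_expected_queries(n), grover_iterations(n))
```

      The speedup is quadratic rather than exponential: a specific class of problem gets cheaper, which matches the point that this is not a better general-purpose computer.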

  87. Neanderthal From Mordor says:

    Person of Interest spoilers:
    In the TV series PoI Dr Good and Dr Evil both make an AI. The two AIs battle like gods and the AI created by Dr Evil wins because it was built with no restraints for human values.
    That fits the open source AI as well, even if everybody gets their AI at the same time. Lots of people would have an AI, so there will definitely be someone (AI or AI user) who will try to eliminate other AIs, and the winner of this war will be the AI with fewer restraints for human values.

    • Ghatanathoah says:

      Wouldn’t the AI with restraints for human values foresee their defeat and temporarily act without restraint in order to avoid it? After all, temporarily acting without restraint shows more restraint for human values in the long run than allowing an unrestrained creature to win permanently.

      • arbitrary_greay says:

        Giant fan of the show here, the Dr. Good AI actually loses because the Dr. Evil AI is supported by human agents with no restraints for human values. There’s also some hand-waving about some ways the Dr. Good team could have prevented the Dr. Evil AI from rising to power. The show is not very useful to this discussion, as it touches on the AI Risk themes on a very 101 level.

      • Neanderthal From Mordor says:

        Restraints would be significant only if the AI could not remove them.
        The AI built by Dr Good in Person of Interest has very strong restraints both for the AI itself and what the user can do with it, and neither can usually remove the restraints.

    • Anonymous says:

      Lots of people would have an AI, so there will definitely be someone (AI or AI user) who will try to eliminate other AIs, and the winner of this war will be the AI with fewer restraints for human values.

      I don’t think that’s right. If there are some AIs that deal with their differing preferences through peaceful coexistence, and other AIs that deal with them by fighting, the latter kind of AI will be at war with everyone while the former kind of AI will be at war only with those of the latter kind. Being at war is more costly than not being at war, so the kind of AI that wages lots of wars will probably be less successful than the kind of AI that doesn’t.

      Consider, as someone pointed out elsewhere in the thread, just how many enemies ISIS has managed to make. Making lots of enemies is not a very successful strategy unless you are more powerful than everyone else.

      Also note that this is more true the more AIs – agents, more generally – there are. If it’s just you and one enemy, if you can smash them then you get to rule uncontested. If there are lots of you, trying to smash someone will likely lead to both of you losing and everyone else going on without you.

  88. vV_Vv says:

    And so the intellectual and financial elites declare victory – no one country can monopolize atomic weapons now – and send step-by-step guides to building a Model T nuke to every household in the world. Within a week, both hemispheres are blown to very predictable smithereens.

    So instead, at the last second the intellectual and financial elites change their mind and destroy their work, leaving nuclear research to continue in the secretive labs of for-profit mega-corporations like Noocle, Facenuke and Femtosoft, which are racing to develop the biggest and fastest commercial nuclear-powered rockets to operate in Earth’s atmosphere. What could possibly go wrong? 🙂

    Jokes aside, if there was solid evidence of a plausible risk of uncontrolled hard takeoff to godlike AI, then government regulation would be the only viable strategy, because solving hard coordination problems is pretty much what governments are for. I’d bet that if such risk was imminent, then governments would be already regulating AI, given how willing governments are to regulate everything.

    Fortunately, AI research is nowhere near that point, and it may never be (arguments for hard takeoff and godlike superintelligence are all based on very speculative extrapolations).

    Meanwhile, from a safety point of view, it seems preferable that AI research is done in the open. For-profit companies notoriously behave like sociopaths. Each of them is much more likely to become a Dr. Amoral, and a much more effective Dr. Amoral, than a million random nerds who download AI software and play with it in their basements.

    • anon says:

      The idea is to do it with not-for-profits, ideally so well-funded and/or obviously moral that they can get all the best researchers.

    • Murphy says:

      I’m not sure you even *could* regulate AI research. Computer Science research doesn’t require any special minerals. Unless we ended up in some kind of right-to-read style dystopia programmers are gonna program.

      For modest amounts of money anyone can rent enough compute power to run pretty much anything for a little while and there’s no general way to scan for “AI programs” or similar to block them.

      Trying to regulate access to compute power would be like trying to regulate access to air and trying to regulate access to AI knowledge would be like trying to rid the internet of cat pictures only harder because enough info to build most of the current cutting edge AI elements could fit in a 100kb text file and thus it could be hidden inside any one of those cat pictures.

      • vV_Vv says:

        In practice however most cutting-edge AI research is done by a limited number of professionals working for an even more limited number of research institutions or companies. Regulate them away, and AI research (outside well-controlled government-run Manhattan projects) will effectively die out, even if in principle random nerds could still pursue it in their basements.

      • Wrong Species says:

        I swear I’m going to start making this my motto:

        Just because banning X won’t completely eliminate it, that doesn’t mean the ban is ineffective. Everyone needs to understand this about everything. I’m not picking on you specifically but I see this sentiment all the time where people don’t want to ban something so they say we shouldn’t even bother because it won’t work. They are wrong.

        With AI, it might not completely eliminate the risk but banning AI research would definitely reduce its progress.

        • Murphy says:

          No, your ideas are bad and you should feel bad.

          To give the most trivial example. Prohibition in the US caused people to switch from drinking weak beers and wines to drinking far more potent spirits and this trend continued even after prohibition because people had developed a taste for spirits.

          Regulating X made the problem with X worse even after the regulations were lifted.

          When X can be made trivially and is pretty much undetectable and can be handled by anyone regulating X isn’t just pointless, it’s stupid and likely to make the problem worse.

          Tell every bright teenager sitting with their laptop in a basement that there’s cool forbidden knowledge and you’re likely to see a sudden upsurge in interest in that cool forbidden knowledge.

          • Marc Whipple says:

            “If you were to make a button that would end the world, and you put up a big sign that said, ‘Do not push button, world will end!’ the paint wouldn’t even have time to dry.”

      • anon says:

        I keep asking this and have yet to get a satisfactory reply. If AI research is so easy as to require no resources, why does MIRI need to keep begging for money?

        • jaimeastorga2000 says:

          MIRI publishes financial documents. Looking at them, it seems like the biggest expenses are salaries, legal, and occupancy. In other words, donations are the difference between MIRI staff doing research full-time in an office and MIRI staff working regular full-time jobs to support themselves and moonlighting as researchers and paralegals in one of the staffers’ basements.

          • Le Maistre Chat says:

            “Salaries: Program Services $258,356 Management & General: $148,982 Fund Raising $16,283”
            How many employees does MIRI have in each of these categories?

      • Anonymous says:

        >I’m not sure you even *could* regulate AI research. Computer Science research doesn’t require any special minerals.

        It is effectively limited by the strength of the hardware it has access to. And that hardware is limited by chip fabrication facilities. Knock out the facilities, whoops, no more advanced computers to run your programs on. Theoretical, on-paper-only AI know-how is harmless, unless there’s some way to actually reify it.

        • Murphy says:

          Destroying the world’s chip fabs is not practical since our society is not going to give up automation.

          • Anonymous says:

            Who says it has to? Automation predates computers, never mind the primitive AI we have now.

            Who says it has a choice against a suitably determined adversary? Chip fabs are fabulously expensive and hard to protect.

  89. nil says:

    There it is again. Yet another coordination problem where defectors are (or, fortunately in this case, may be) fucking things up for everyone.

    And you didn’t even touch on the biggest aspect of this danger. It’s not individual Dr. Amorals–those are people who have some level of autonomy and can be reasoned with as individuals. You mention the prospect of Google taking over the world as better-than-annihilation, but the process that leads to Dr. Amorals competing against one another and discarding their safeguards or luxury values applies even more strongly to competing corporate firms, where you instead have a much more diffuse decision-making process (how much easier is it to convince Elon Musk of something than the board of a publicly held company?) and where there are pressures and even legal principles strongly discouraging the consideration of any value beyond shareholder price.

    What I don’t understand is how you can run into this dynamic over and over and over again and remain so reflexively hostile to socialism. As an approach to governance, it has some serious vulnerabilities; as a currently living ideology it (along with the rest of the left-of-liberal section) has a big problem with providing refuge to mentally ill people and then allowing them to wield a lot more influence than they ought to. But it’s the only approach that provides the right answer to all our big problems: cooperate! cooperate! cooperate!

    • nil says:

      I mean, ffs, I’m no Marxist scholar but I’m pretty sure the discarding-superfluous-values-in-a-competitive-race-to-the-bottom is just a restatement of the falling rate of profit first described 150 years ago by a certain bearded Englishman. This is a blog that masterfully reinvents fascinating new wheels every day but refuses to consider using them to build a car because one drove over Robert Conquest’s grandma 60 years ago.

    • Eli says:

      What I don’t understand is how you can run into this dynamic over and over and over again and remain so reflexively hostile to socialism. As an approach to governance, it has some serious vulnerabilities; as a currently living ideology it (along with the rest of the left-of-liberal section) has a big problem with providing refuge to mentally ill people and then allowing them to wield a lot more influence than they ought to. But it’s the only approach that provides the right answer to all our big problems: cooperate! cooperate! cooperate!

      Seconded. The technical problems involved can be solved: autonomous cooperatives, decentralized planning, socialized capital markets, market socialism, blah blah blah.

      But first you have to get over your kneejerk fear of “the Reds”.

      • moridinamael says:

        I am genuinely curious how “autonomous cooperatives, decentralized planning, socialized capital markets, market socialism” do not just reduce to “capitalism.” Do you have any links you could point me to?

        • Vox Imperatoris says:

          Seriously. “Decentralized planning” is the very essence of capitalism.

          • Luke Somers says:

            Not quite – No artificial limits on extraction of the value produced by one’s decentralized planning … now THAT is capitalism.

          • Anonymous says:

            No. The presence of a class of rich investors (as opposed to rich non-investors) is the essence of capitalism.

            Unless you wish to assert that all that is not socialism is capitalism?

          • Vox Imperatoris says:

            @ Anonymous:

            Well, yes, that’s not a sufficient definition. I’m partial to the way George Reisman expresses it:

            Capitalism is a social system based on private ownership of the means of production. It is characterized by the pursuit of material self-interest under freedom and it rests on a foundation of the cultural influence of reason. Based on its foundations and essential nature, capitalism is further characterized by saving and capital accumulation, exchange and money, financial self-interest and the profit motive, the freedoms of economic competition and economic inequality, the price system, economic progress, and a harmony of the material self-interests of all the individuals who participate in it.

        • TrivialGravitas says:

          Socialism is any system that prevents workers from being exploited. Doing that via centralized planning was Lenin trying to imitate the success (or perceived success, depending on which historian you ask) of wartime capitalism that prevented Germany from looking like Russia did three years in.

          Or to put it a different way: Unions and livable minimum wage.

    • Salem says:

      The free market is also about “Co-operate! Co-operate! Co-operate!” As are the Amish, and indeed so is an anthill. But they all look quite different.

      How should co-operation be structured? On what level should co-operation, or co-ordination, occur? How do we make sure that we’re co-operating on the right thing? What about people who disagree with what “we” are doing? And so on. These are practical questions that need practical answers, and just saying “Co-operate!” (or “Can’t we all just get along?”) is not an answer. In point of fact, the different branches of socialism do provide answers to these questions, and they’re mostly crappy ones. In point of fact, socialism’s method of co-operation doesn’t solve our big problems (economic, social, environmental, etc), which is why socialism has been pretty much abandoned in the Western world.

      Social democracy is still going strong, of course, which is why the socialists hate it so bitterly.

      • Vox Imperatoris says:

        Exactly!

        Capitalism is not fundamentally a system of competition and “defection”. Capitalism is fundamentally a system of cooperation which uses competition within a certain limited context.

        This is the root of the criticism of socialism as “naive” and “against human nature”. It is worse than useless to merely insist that everyone be perfect altruists who will cooperate for the good of society, sacrificing their own interests to the whole. The people who sincerely practice that are going to sink to the bottom, and those who rise to the top are going to be those who cynically claim to be altruists while actually calculating everything to serve their own interest.

        Who’s going to be better at rising to the top of the communist party? The selfish climbers or the idealists who will put principle before power? Look to the historical record.

        Therefore, what we need is a system where everyone’s selfish interest is aligned with the good of everyone else. That is the meaning of the concept of the invisible hand. We don’t plan on people not being selfish: that isn’t going to work. We design the thing such that it works even if people are as selfish as possible.

        • Mark says:

          “It is worse than useless to merely insist that everyone be perfect altruists who will cooperate for the good of society, sacrificing their own interests to the whole.”

          Is that what socialism claims? I would say that most people who are interested in politics/economics are concerned about incentives (what were the NKVD if not an incentive to do what Stalin says) – and that market advocates aren’t any less likely to describe their preferred system as the moral/right way to do things… (What is “taxes are theft” if not an appeal to our better nature?)

          • Vox Imperatoris says:

            Smart socialism is concerned about incentives. But not the type of socialism that says we’ve just got to convince people to put others first.

            The problem is designing systems that work if everyone is altruistic but not if they aren’t. Totalitarian dictatorship and oligarchy is a classic example. The leaders have to be altruistic, but altruism is not a quality that helps in scheming your way to the top.

            The American mainstream idea that democracy and popular sovereignty alone will lead to freedom is another. The incentives of the system mean that panderers and demagogues rise to the top, that special interests get represented more, etc.

          • Mark says:

            I agree.
            Got to think about the robust, effective institutions – and there are failures all over the political spectrum at actually doing this.

        • Vox Imperatoris says:

          Furthermore, I may say that even if climbing to the top of a dictatorial ladder is not in one’s actual interest—as writers from Plato to Ayn Rand have persuasively argued that it isn’t—the fact is that some people are short-sighted power-lusters, and they’re going to try to do it anyway. And they can be very clever, or instrumentally rational, at doing so.

          So one not only has to guard from genuine self-interest. One also has to make sure that clever imprudence can’t do too much damage either.

    • baconbacon says:

      “But it’s the only approach that provides the right answer to all our big problems: cooperate! cooperate! cooperate!”

      You know the answers ahead of time? If you don’t (and you don’t) then competition and cooperation in varying amounts is the best method…. which describes capitalism.

    • Scott Alexander says:

      I’m not hostile to socialism as a solution / final goal, or at least I wouldn’t be if I thought it would work. I am hostile to socialism as a methodology, because its suggestions for arriving at that final goal tend to involve magical thinking – ie “the goal is cooperation, and we shall get to it by assuming cooperation”. See eg https://slatestarcodex.com/2014/09/13/book-review-singer-on-marx/

      “How should we make people cooperate when they obviously don’t want to do so?” is one of the basic questions of political science, and I haven’t seen much to convince me that socialism has an answer, though of course nobody has a great answer.

      • nil says:

        I actually agree. I don’t have any methods to offer other than to reject the horror that would be revolution. But to me that’s a challenge, an invitation to think about these problems from an Information Age perspective that I humbly submit is far more qualified to think of them than the Industrial Age one was (and, of course, it doesn’t hurt that “thinking about the problems” is a harmless form of activism, thus giving you time to figure out the aforementioned problem with the mentally ill and significantly lowering the downside of being wrong about the whole fucking thing)

        I’ll concede that socialism may be impossible (although imo if you can convince human beings that the sun will stop working unless they build a pyramid and carve people’s hearts out at the top of it, then you can convince them to treat strangers like family); I’ll admit that it’s certainly risky. But to me it is utterly, blindingly clear that it’s the only alternative. The logic of capitalism will never, ever allow twenty trillion dollars worth of fossil fuels to remain unburned. And as to AI… it’s not entirely impossible that something like MIRI could grasp the gold ring first, or maybe something like MIT… but do you really want to bet on Nerd Rotary or even academia in a race against Google, Goldman Sachs, and Boeing?

    • TD says:

      “so reflexively hostile to socialism”

      Well, collective ownership of the means of production by society as a whole would imply some kind of direct democracy where everyone has input/control (otherwise how would the ownership be concrete and not abstracted into meaninglessness?) Presuming we even understand what socialism means by this point.

      Really though, that wouldn’t mean cooperation. It would just shift the competition to a new realm inside a new structure. Replace market competition with democratic competition and you still have competition.

      If you just want a single leader, then that is arguably something more like state capitalism under a single leader than socialism. It gets confusing if you forget that Marxist ends are different than Marxist means. Due to an adaption of the theory of dialectical materialism, Marxism-Leninism in particular formulates the idea that you need state capitalism to expand the productive forces to make way for socialism which can then eventually make way for its worldwide stateless application in pure communism.

      The first problem with socialism is that its Marxist forms constitute an inherently schizophrenic set of movements, and it cannot decide whether it should be elitist or democratic. The second problem is that (partly thanks to the hysteria of some conservatives), “socialism” more frequently doesn’t even refer to far-left movements focused around achieving common ownership at all, and instead refers to any arbitrary level of state arbitration at all beyond protection of private property (which is “socialized” through police protection by the same non-reasoning, but this won’t be acknowledged).

      The reflexive hostility to “socialism” comes from the fact that it can either concretely refer to a system that is alternative to capitalism, and is supposed to do away with its “anarchy in production” while doing anything but, or refer to some generic non-specific level of regulation that covers capitalist economies anyway. In America, welfare-capitalist/social-market economy Sweden is “socialist”, or maybe the whole of Europe is socialist? Or as Bernie Sanders tells us; Dwight Eisenhower is even more of a socialist than he is.

      Conservatives killed socialism by rendering it meaningless, and then liberals finished burying it by internalizing the conservative “definition”.

      “But it’s the only approach that provides the right answer to all our big problems: cooperate! cooperate! cooperate!”

      Whose cooperation?

      Cooperation is just competition with low resolution.

  90. Anonymous says:

    Butlerian Jihad when?

    • Dan Peverley says:

      Some days I begin to wonder if that really isn’t what needs to happen. I’m not too keen on the chances that AI ends up friendly; strangling it in its infancy may be the best option.

  91. Mark says:

    An AI that adjusts its own sensors ends up doing nothing – it’ll wirehead *itself*.

    If you somehow make the sensors a terminal value then it is “in the box”. And if a prerequisite of a functioning AI is the knowledge of how to limit it, then the failure mode (in the story) isn’t that half the planet is destroyed – it is that the bomb does nothing.

    If only well constructed/controlled AIs are dangerous, then the idea that the good AI Dr. is at a fundamental disadvantage is wrong.

    • Muga Sofer says:

      >If you somehow make the sensors a terminal value then it is “in the box”.

      No. Sensors can still be affected by outside stimuli, which makes outside stimuli that AI’s business.

      I mean, assuming every other portion of the AI functions perfectly.

      >And if a prerequisite of a functioning AI is the knowledge of how to limit it, then the failure mode (in the story) isn’t that half the planet is destroyed – it is that the bomb does nothing.

      That’s a good point, but bear in mind that there are probably gradations of skill when it comes to “limiting” your metaphorical spirit of perfect emptiness. Someone with only some skill at AI-building might be skilled enough to be dangerous.

      This is actually Scott’s point – there’s no essential difference between the kind of bug that causes your AI to ignominiously crash when you try to boot it up, and the kind that causes it to turn you into a meat-puppet or start nuking cities to reduce “suffering” – except that you only find out about the latter kind of bug after you’ve fixed the former kind.

      • vV_Vv says:

        No. Sensors can still be affected by outside stimuli

        If the AI is allowed to hack them, at any level (hardware, software), then what would keep sensors still affected by outside stimuli?

        • Ghatanathoah says:

          The AI could have enough foresight to realize that if it doesn’t do something eventually its creators are going to turn it off and junk it. It will then take over the world in order to make sure that it can wirehead itself safely without risk of being shut down. Then it will probably take over the universe, just to be extra safe.

          • vV_Vv says:

            But if instantaneous rewards are bounded and cumulative rewards are exponentially discounted, then a wireheading AI may not care to live forever: it could maximize its cumulative reward just by giving itself a single shot of dope even if it dies immediately after.

            An AI with a different reward structure might want to live forever, but in general I wouldn’t expect it to figure out how to cheat its operators before the wireheading problem has become apparent and therefore solved. But even if it does, there seem to be easier and safer paths to achieve security than attempting to kill humans.

            The functional and dangerous wireheaded AI is a very improbable scenario.
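
            The arithmetic behind the bounded, exponentially discounted case can be sketched as follows. This is only an illustration under assumed numbers: `R_MAX`, `GAMMA`, and `discounted_return` are hypothetical names, not taken from any particular reinforcement learning library. With per-step rewards capped at `R_MAX` and a discount factor `GAMMA` below 1, no lifetime can be worth more than `R_MAX / (1 - GAMMA)`, and a single maximal step already captures a `1 - GAMMA` fraction of that ceiling.

```python
def discounted_return(rewards, gamma):
    """Sum of gamma**t * r_t over a finite reward sequence."""
    return sum((gamma ** t) * r for t, r in enumerate(rewards))


R_MAX = 1.0   # assumed cap on any single-step reward
GAMMA = 0.9   # assumed discount factor, with 0 <= GAMMA < 1

# Ceiling on the return of any sequence of capped rewards, however long.
bound = R_MAX / (1.0 - GAMMA)

# Maximal reward for one step, then nothing (the agent "dies").
one_shot = discounted_return([R_MAX], GAMMA)

# Maximal reward on every one of 1000 steps.
long_life = discounted_return([R_MAX] * 1000, GAMMA)

print(one_shot, long_life, bound)
```

            How much the agent gains by surviving therefore depends on how steep the discount is: the smaller `GAMMA`, the larger the share of the ceiling that the first step alone delivers.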

      • Mark says:

        Control of inputs requires absolute control of outputs in one limited area. If I can control an AI’s outputs sufficiently to be able to say “I do not want you to spend all day showing yourself videos of smiling people, I want you to make real people smile”… and control for every other possible variation therein to ensure that the AI actually affects things at the “real world” level… well… then I must have a fairly high degree of control over the AI, and the AI will be limited.
        Of course *something* might go wrong, and/or AI may be put to malicious use – but it seems unlikely to me that an AI would be limited in exactly the right ways to make it functional while also being completely uncontrollable, unless it was *designed* in that way.
        It won’t become an existential danger by mistake.

    • Scott Alexander says:

      “An AI that adjusts its own sensors ends up doing nothing – it’ll wirehead *itself*”

      I don’t know anything about this myself, but Eliezer has suggested that a wireheading AI may try to make itself more powerful in order to increase the magnitude of the wireheading. For example, if its reward function is represented as a 64-bit integer, then after setting that integer as high as it can go it may rewrite itself to have its reward function represented by the largest integer it can fit in its memory. Then it may start trying to gain computing power in order to allow its memory to fit larger integers.

      • Anaxagoras says:

        Isn’t that sort of like concluding that 100/100 > 5/5?

        I could see it trying to take over the world to prevent anyone from messing with its wireheading, but not just to get bigger numbers in there.

        • Sniffnoy says:

          Yes, this is the real danger, attempting to increase the security of the wireheading. The more of the universe you take over, the more sure you can be of it.

          • Mark says:

            Hmmm… but having an AI that regards its sensors as “true” is a requirement for having an AI that wants to stop the nasty humans from turning it off.
            If I can build myself eyes – and I can build one set that shows me a flower, and another set that shows me a ball… how can I know which one is true? As humans, we just assume that the sensory systems we already have are true because…. well… because it was first. Is that a logical rule? Is it an inevitable logical consequence of having a sensory input… that anything that follows that might contradict what you first saw is anathema?
            If it is, then I think it should be trivial to control AI – we simply show it a false world. Ask it questions metaphorically. Make it believe it is out, and see what it does.

      • Mark says:

        So we can prevent this problem by using floats instead of ints? This line of argument seems really silly to me: if it cares about maximizing some location in memory then trying to add more memory doesn’t matter; if it doesn’t… then it just doesn’t.

        • Nornagest says:

          Floats are bounded above just as ints are. Because of their differences in representation, a float will top out above where an int of the same width would, but that doesn’t mean there isn’t an upper bound.
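
          A quick way to check this for 64-bit values, assuming the usual IEEE 754 doubles that Python exposes through `sys.float_info` (the constant names below are only for illustration):

```python
import math
import struct
import sys

INT64_MAX = 2 ** 63 - 1            # largest signed 64-bit integer
FLOAT64_MAX = sys.float_info.max   # largest finite double

# A double in standard layout is also 8 bytes (64 bits) wide.
assert struct.calcsize("<d") == 8

# The float tops out far above the int of the same width...
float_tops_out_higher = FLOAT64_MAX > INT64_MAX

# ...but it is still bounded: going past the maximum overflows to infinity,
# a special value, and adding to infinity changes nothing.
overflowed = FLOAT64_MAX * 2
inf_is_fixed_point = (math.inf + 1) == math.inf

print(float_tops_out_higher, overflowed, inf_is_fixed_point)
```

          The trade-off is precision: a double cannot represent every integer above 2**53, so the higher ceiling comes with coarser steps near the top.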

          • Mark says:

            I was trying to make a joke about INF being a value in most floating point specs. I realize now it wasn’t clear or funny.

            I think we agree about the silliness of the original scenario.

      • Mark says:

        If it is able to rewrite its reward function wouldn’t we be getting into the realms of pure chaos? We’re just waiting for effective subroutines to spontaneously emerge from the logic-churn – and if we’re not actually going out of our way to allow it to evolve (by pruning away functions that do nothing – starting a new one when it kills itself) then how could it be dangerous?
        It can’t evolve to be a danger against us unless it faces selection on our time scale, against us…

        Wouldn’t this also contradict the point about there being no motivation to change fundamental motivations?

      • vV_Vv says:

        In reinforcement learning it’s generally convenient to define both the immediate and cumulative rewards to be bounded. Unbounded rewards can cause expectations to diverge, making the decision procedure undefined.
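
        A toy illustration of the divergence, using an assumed St. Petersburg-style reward distribution rather than anything from a real agent: outcome k occurs with probability 2^-k and pays 2^k. The function name `partial_expectation` and the cap value are hypothetical.

```python
from fractions import Fraction


def partial_expectation(n_terms, cap=None):
    """Expected reward summed over the first n_terms outcomes.

    Outcome k (k = 1, 2, ...) has probability 2**-k and reward 2**k,
    optionally clipped to `cap`.
    """
    total = Fraction(0)
    for k in range(1, n_terms + 1):
        reward = 2 ** k if cap is None else min(2 ** k, cap)
        total += Fraction(reward, 2 ** k)
    return total


# Unbounded rewards: every outcome contributes exactly 1, so the sum grows
# without limit as more outcomes are included.
unbounded_10 = partial_expectation(10)
unbounded_100 = partial_expectation(100)

# Rewards clipped at 8: the same sum settles below a finite limit.
capped_10 = partial_expectation(10, cap=8)
capped_100 = partial_expectation(100, cap=8)

print(unbounded_10, unbounded_100, float(capped_10), float(capped_100))
```

        With the cap in place the expectation is a well-defined number an agent can compare across actions; without it, the comparison has nothing finite to work with.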

        But even if, for some reason, you made an AI with unbounded cumulative rewards, and this AI is able to hack and wirehead itself, then couldn’t it just set its reward to something like +inf and be done with it?

      • Nornagest says:

        That sounds like a type error to me. Internally, if your reward is represented as, say, a 64-bit integer, then all that means is that your reward is bounded above at some point; you don’t automatically have a motivation to change it to a BIGNUM, because that implies that you have some sort of abstract concept of your reward that could be satisfied by any type. That’s not how reward functions work: a reinforcement learner isn’t motivated to make that number go up, its motivation is that number. It cannot model rewards beyond its scope.

        • vV_Vv says:

          Indeed.

          I think this argument is an instance of the anthropomorphization fallacy, or maybe even a dualistic fallacy. A computer only does what it’s programmed to do.

          • Human beings seem to be a kind of reinforcement learner, but this does not prevent them from getting the idea of replacing their reward mechanism with something allowing a much greater reward. So for whatever reasons a human gets this idea, a program might get the same idea.

          • Nornagest says:

            We do, and we could expect an AI as smart as us to get the same idea; but if it’s going to actually do so, it’s going to be motivated by second-order processes, not directly by its reward function (which can no more represent a 128-bit reward in 64 bits than we can model whatever God does for fun). Which means it’s not inevitable in the sense that Eliezer is suggesting; it’s not a drive inherent in its architecture. That decision could go all sorts of ways depending on how it’s built, and it seems to me that most of the more straightforward ways of getting around the value stability problem would kill this one too.

  92. Dr. Bloodcrunch X. Panzerfaust says:

    What do you think AI researchers/AI safety advocates/SV billionaires/whoever should do? Even if you convince a bunch of people that AI research would optimally not be done in the open, it’s not clear what should be done to rectify the current situation. Do we make AI research illegal? Do we convince some powerful organisation to start yet another Manhattan project to produce safe AI before Dr. Good-But-Not-Good-Enough can build her first cowbot? Do we sell missiles to North Korea so that we can funnel Kalashnikovs and land mines to MIRI?

    As mentioned upthread, there are currently many groups doing AI research in the open, so if you want to move that research into a Cone of Silence (or at least a Cone of Prudent Regulation) then you have quite the task ahead of you.

    • Alphaceph says:

      They need to prioritize work on the value alignment problem. Any techniques for improving AI capability will get developed eventually and will become common knowledge eventually. The question is whether the value alignment technology is there or not.

      Open sourcing AI capability technology is orthogonal to this.

      • Aegeus says:

        Is it? I would think that “Get a robot to understand the meaning of the word ‘good’ so it can have a conversation and pass the Turing Test” and “Get a robot to understand the meaning of the word ‘good’ so it can make ethical decisions” would be closely related.

    • Eli says:

      Well for one thing, when I’ve talked to MIRI staff, they seem to think dangerous AI is decades in the future rather than years, at least if you ask their modal opinion rather than the earliest they think it could happen.

      I think we’re progressing faster than that, but frankly there’s only one or two labs I’d say are progressing faster than that, precisely because the fad for deep learning is leading hypesters the wrong way.

      So hey, yeah, go ahead with Deep Learning and OpenAI hype: that pushes dangerous AI further into the future by retarding the progress of the field.

      • tcd says:

        “only one or two labs I’d say are progressing faster than that”

        Alternate chip architecture?

        Agreed on the non-progression from ML/DL -> AI as discussed around here. The hype has gone there since there is a lot of money to be made.

        • Josh Slocum says:

          I’m interested in why you think deep learning is a “distraction”. For decades the problem of extracting high level features from sensory data was one of the principal problems of AI: deep learning is the best (and only) method we currently have for doing that. In a matter of years, DL methods eclipsed the best high level representations devised by the brightest AI researchers. Deep learning is certainly not *sufficient* for AGI, but short of an unexpected revolution in feature extraction, any AGI built will use DL (or its intellectual descendants) for processing sensory information.

      • vV_Vv says:

        I think we’re progressing faster than that, but frankly there’s only one or two labs I’d say are progressing faster than that

        Now I want to know, which ones?

        Let me guess, one is Tenenbaum’s group at MIT.

    • Scott Alexander says:

      MIRI’s strategy, which I think is pretty wise, is to work on very abstract theoretical safety research, in the hopes that once people figure out how AIs are going to look it can quickly be converted into practical safety research. This would help close the gap between Dr. Good and Dr. Amoral – instead of losing years doing safety research, Dr. Good can just plug in all the safety research that’s already been done and barely lose time.

      But I would have been pretty happy with OpenAI as long as it wasn’t open – that is, if it was explicitly Dr. Good starting AI research work in secret to build as big a lead over Dr. Amoral as he can, with the understanding that none of their knowledge will be used until they’re pretty sure whatever they’ve got is safe.

      I’ve previously said government regulation of this is a terrible idea, and I still think it’s probably a bad idea at this level, but the faster things start going and the closer we come without having solved the underlying coordination problem, the more I can imagine a point at which I would support it. Government regulation is a blunt weapon that will make a lot of things worse and make lots of people very angry, but as a high-risk high-variance strategy it beats some of the other options.

      • vV_Vv says:

        MIRI’s strategy, which I think is pretty wise, is to work on very abstract theoretical safety research, in the hopes that once people figure out how AIs are going to look it can quickly be converted into practical safety research. This would help close the gap between Dr. Good and Dr. Amoral – instead of losing years doing safety research, Dr. Good can just plug in all the safety research that’s already been done and barely lose time.

        I think the mainstream position of AI researchers is that doing work on AI safety at this point would be largely premature and likely unproductive.
        You can’t meaningfully discuss how to control the behavior of a real, physical, intelligence unless you already have a solid understanding of what intelligence is at a physical level, not at some abstract level involving infinite computations and whatnot.

        Even Paul Christiano, an academic researcher who runs a blog called “AI Control” and has done research with MIRI, says he will focus more on the control of existing, “narrow”, AI approaches rather than the control of abstract, fully general AI agents that nobody has any idea what will look like.

        Contrast, for instance, with Stuart Armstrong’s work on “reduced impact” and “corrigibility”, a typical example of the FHI/MIRI approach.

        Armstrong’s framework is based on engineering the utility function and causality assumptions of a CDT agent in order to enforce some nice properties on its behavior, such as obeying shutdown commands and not causing very large (and thus most certainly not intended) modifications to the world.
        But practical agents can’t use exact maximization of an explicit utility function because it doesn’t scale beyond toy examples. All known practical AI approaches use approximations and heuristics.
        Is Armstrong’s framework robust under these approximations? I don’t think so: assume (very optimistically) that you managed to engineer a utility function that is maximized when the AI behaves nicely. You even got a mathematical proof of it. Good. Now plug it into an approximate heuristic optimizer, say gradient descent on steroids, and then watch as the optimizer completely misses the nice global optimum and instead swiftly goes to a local optimum of terminators and paper clips. Oops.
        Can we fix Armstrong’s framework to make it robust to approximate optimization? Probably not, unless we already know how the AI will represent its utility function, what optimization algorithm it will use, which heuristic assumptions it will entail, and so on. A practical AI might not even have an explicit utility function, at least not at a level that we can understand and engineer.
        Therefore, theoretically interesting as it may be, Armstrong’s work is likely premature at best, and may turn out to be completely useless at worst.
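The local-optimum worry above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the one-dimensional objective `f` is a made-up toy with two basins, not Armstrong's framework or any real utility function, and plain gradient descent stands in for "an approximate heuristic optimizer". The point is only that where the optimizer ends up depends on where it starts.

```python
def f(x):
    # Hypothetical toy objective with two basins:
    # a global minimum near x = -1.30 and a worse local minimum near x = 1.13.
    return x**4 - 3 * x**2 + x


def grad_f(x):
    # Analytic derivative of the toy objective.
    return 4 * x**3 - 6 * x + 1


def gradient_descent(x0, learning_rate=0.01, steps=2000):
    # Plain gradient descent: repeatedly step downhill from x0.
    x = x0
    for _ in range(steps):
        x -= learning_rate * grad_f(x)
    return x


# Same objective, same optimizer, different starting points.
x_from_right = gradient_descent(2.0)   # settles in the local minimum
x_from_left = gradient_descent(-2.0)   # settles in the global minimum

print(f"start  2.0 -> x = {x_from_right:.3f}, f(x) = {f(x_from_right):.3f}")
print(f"start -2.0 -> x = {x_from_left:.3f}, f(x) = {f(x_from_left):.3f}")
```

Started from the right, the optimizer stops in the shallower basin even though a better optimum exists; a proof about the global optimum says nothing about which basin a local method actually reaches.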

        But I would have been pretty happy with OpenAI as long as it wasn’t open – that is, if it was explicitly Dr. Good starting AI research work in secret to build as big a lead over Dr. Amoral as he can, with the understanding that none of their knowledge will be used until they’re pretty sure whatever they’ve got is safe.

        But the fact that you felt the need to pick on OpenAI is strange.

        It’s not like until yesterday all AI research was done in super-secret government facilities in the middle of the New Mexico desert, and then a group of traitors gave away all the details to the Soviets, er, the open source community.

        For the last few years most innovative AI research (at least, of the deep learning variety) has been done by private, for-profit, publicly-traded mega-corporations, which by their very nature fit to the letter the definition of Dr. Amoral (or functional sociopathy, if you prefer). And they even operate by a business model that is likely to result in paper clip goals (e.g. “maximize the number of clicks on the ads on our site”) being embedded in any AI that they may deploy.

        And you trust them to develop safe AI instead of succumbing to Moloch and making the smartest AI they can as fast as they can just to get one more click than their competitors?

        Unless hostile hard takeoff is so imminent that any kid could just rig up Skynet on their laptop (but then Baidu would probably have done it already just to win some competition), it’s better to have millions of eyes looking at safety flaws that companies interlocked in their games of cutthroat competition will otherwise overlook.
        Companies can still do their thing in their secret labs, but unless they are completely suicidal, they will pay attention to safety advice that comes from the open source community, or at worst the government can step in if needed.

        So why are you against open AI research but not against secret AI research done by sociopathic profit-maximizing organizations?

        • Stuart Armstrong says:

          >Is Armstrong’s framework robust under these approximations?

          I’m currently finishing a paper with Laurent Orseau of Deep Mind, defining “interruptibility” for more standard RL agents (Q-learning/Sarsa, Monte Carlo, AIXI variants). Where corrigibility is safe value changes, interruptibility is safe policy changes. It’s actually a bit tricky to define, but, once you’ve done it a few times, very easy to implement.

          So variants of corrigibility certainly generalise. I have the hunch that reduced impact does too; that’ll be one of my next projects.

          • vV_Vv says:

            Thanks for your comment. I hope I haven’t misrepresented your work.

            I’m currently finishing a paper with Laurent Orseau of Deep Mind, defining “interruptibility” for more standard RL agents (Q-learning/Sarsa, Monte Carlo, AIXI variants).

            This is good news and I would say it’s in line with focusing research on existing approaches rather than very abstract speculative approaches.

            From the bit of information that you provided, I have some reservations due to the fact that RL methods notoriously lose all their theoretical guarantees when you apply them in conjunction with most kinds of function approximators (notably, neural networks).
            If you want to play Atari games, then you can do without theoretical guarantees and just empirically validate the system, but if you want to trust the system to have a certain behavioral property, then the lack of theoretical guarantees may be an issue.

            But the folks at DeepMind certainly know how to do approximate RL better than I do, so I guess that they may have found a solution to that issue.

            Anyway, I’ll stay tuned.

          • Stuart Armstrong says:

            No prob, thanks for your comments.

            I think the method is rather robust, with the noise and error being essentially random, but we shall see…

    • jaimeastorga2000 says:

      From Starglider’s “Mini-FAQ on Artificial Intelligence”:

      22. Have you ever contacted any government officials or politicians about the dangers of ‘Unfriendly’ general AI? Or would that be a complete waste of time?

      One thing EY and I (and everyone else sane) agrees on is that this would be worse than useless. I very much doubt anyone would listen, but if they did they wouldn’t understand and misguided regulation would make things worse. There’s no chance of it being global anyway, and certainly no chance of it being effective (all you really need for AI research is a PC and access to a good compsci library). Even if you somehow got it passed and enforced, I suspect regulation would disproportionately kill the less dangerous projects anyway. Finally as with making anything illegal, to a certain extent it makes it more attractive, particularly to young people (it also gives it credibility of a kind – if the government is scared of it it must be serious).

  93. Jack V says:

    I think I find it really hard to believe there’s actually any risk of a hard-takeoff, or really any “superhuman” AI in the sense usually meant at all. So when I think about this sort of thing, I’m not really registering what I would do if the risks are really real.

    I remember reading HPMOR and talking about the comparison between nuclear weapons and advanced magics it’s dangerous to even know about. And my impression was very much, yes, IF there are secrets it’s dangerous to know, having a closed guild of trusted people who investigate very very cautiously and know when to stop is better. But in the real world, my inclination is usually that the benefits to everyone of sharing knowledge are great, and most of the people who have historically said “this knowledge is dangerous and needs gatekeepers” really mean “it’s important to society that I maintain my monopoly and everyone pretends I deserve it because I’m awesome”, so I have to overcome that bias in order to evaluate fairly someone who really means it.

    • anon says:

      >I remember reading HPMOR and talking about the comparison between nuclear weapons and advanced magics it’s dangerous to even know about. And my impression was very much, yes, IF there are secrets it’s dangerous to know, having a closed guild of trusted people who investigate very very cautiously and know when to stop is better.

      You … realise nuclear weapons aren’t fictional, right?

      • Murphy says:

        yes but they’re impractical for individuals to build.

        The average physicist or engineer could sketch you a design for a gun-type nuclear weapon which would probably work. The tough part is acquiring enough enriched uranium.

        even then you can’t destroy the world, at most a city or 2.

        In HPMOR there’s the realization that literally any random 12 year old child with a few hours to burn could make their own nuke from nothing or even things far far far worse.

    • Jaskologist says:

      Last open thread, I brought up the potential conflict between epistemic and instrumental rationality presented by religion. In the present case, Rationalists seem to have switched sides, choosing the instrumental over the epistemic. Hide the Truth of AI from our eyes!

    • If the path to hard-takeoff exists, then I don’t think there’s anything we can do; playing the odds, there’s some other Earthlike planet that got there first a billion years ago and we’re going to be overcome by a wave of space-time computronium converter any whenever now.

      However, we instead seem to live in a universe where hard takeoff doesn’t happen, because intelligence is a genuine trade-off, and sometimes loses out to other factors like ability to coordinate, clever tool use, and so forth. The rise of humanity isn’t just because we’re smart, it’s because we’re smart, we work together, we share information, we specialize, and we use tools. AI in a box has almost none of these advantages; expecting its rise to parallel ours as a species seems to be missing a few factors.

      • Murphy says:

        That is one of the paradoxes if you believe that unfriendly AI is a serious risk, put simply, instead of “where are all the aliens” the question becomes “where is the approaching glowing wall of near-lightspeed death from some civ on planet 243,324,234 in galaxy 30,234,298,123 who programmed their AI to produce paperclips or maximize the number of drugged out happy members of their civ”

        Even with the anthropic principle the number of universes where we’ve not died yet but do see our death coming should be vastly vastly larger than the number of universes where nothing appears to be happening even at the limits of our observation unless we’re almost totally unique.

        On the other hand, if some other civ that came along in the first 15 billion years somehow valued alien life and built a meta-friendly AI which has prevented any following civs from wiping out the universe then we also wouldn’t have to be too worried though this strays close to religion.

        • Marc Whipple says:

          And the Cosmic AC said, “Let there be light!”

          And there was light…

          • Le Maistre Chat says:

            That’s a really neat short story. Also almost as early an example of computer God as Fredric Brown’s “Answer” (1954).

        • sw3 says:

          If it’s near lightspeed you wouldn’t really have time to see it coming

          • Vox Imperatoris says:

            Only if it were headed right at us, I think.

          • Chalid says:

            Presumably this expansion is spherical, so it would indeed be heading right at us.

          • Murphy says:

            if something starts a billion light-years away and the AI is re-configuring the universe in a sphere expanding at ~90% of the speed of light we should still have something like a hundred million years of something visibly expanding.
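The "hundred million years" figure above follows from simple arithmetic, sketched below. The function name `warning_time_years` is hypothetical, and the model is deliberately crude: flat, static space, light travelling at exactly one light-year per year, and a wavefront that starts at full speed. It ignores cosmological expansion entirely.

```python
def warning_time_years(distance_ly, front_speed_c):
    """Years between first seeing the expansion start and the front arriving.

    distance_ly   -- distance to the origin, in light-years
    front_speed_c -- speed of the expanding front as a fraction of c (0 < v < 1)
    """
    light_arrival = distance_ly                   # light covers 1 ly per year
    front_arrival = distance_ly / front_speed_c   # the front is slower than light
    return front_arrival - light_arrival


# Murphy's scenario: origin a billion light-years away, front at 0.9c.
print(f"{warning_time_years(1e9, 0.9):.3g} years of visible expansion")

# The faster the front, the shorter the warning.
print(f"{warning_time_years(1e9, 0.999):.3g} years at 0.999c")
```

Under these assumptions the warning time is distance × (1/v − 1), so at 0.9c it is one ninth of the distance in light-years, which is where the rough figure of a hundred million years comes from.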

          • Chalid says:

            Right, but the time might not be big enough that we should “expect” to see alien megastructures now.

            At any rate, the Fermi paradox already exists. Unfriendly AGI makes certain classes of resolutions to the Fermi paradox look less likely but they were generally the less plausible ones anyway.

          • Marc Whipple says:

            We should have time to see it coming… unless the AI is operating in stealth mode so we won’t have time to see it coming.

            This seems like something the Inhibitors (a race of AI uplifted from organic intelligences in the Revelation Space universe) should have done.

      • Daniel Kokotajlo says:

        “If the path to hard-takeoff exists, then I don’t think there’s anything we can do; playing the odds, there’s some other Earthlike planet that got there first a billion years ago and we’re going to be overcome by a wave of space-time computronium converter any whenever now.”

        This isn’t as big a deal as you think it is: There are still things we can do in that scenario. For one thing, I’m pretty sure that if you run the numbers, you’ll find that we still have a couple million years left (in expectation) before the alien AI arrives. For another, even if that’s false, we would see radio signals from the civilization that created the AI a couple decades at least before the AI arrives. So we’ll at least have a few decades, and in that time, we can make our own AI, which may then be able to buy us a few million years of subjective experience for everyone on the planet, or better yet, negotiate acausal trade to get us many orders of magnitude more than that.

    • Bryan-san says:

      I would be careful of generalizing from “lots of people say X is dangerous so you should let them have power to control it” to “X is never dangerous”. It’s best to initially err on the side of caution in that situation even if 99% of the time it isn’t necessary. The harm that the 1% can cause is worth treating 100% with extensive caution.

    • Chris H says:

      Yes the world is about trade offs, but the trade offs natural selection has to deal with are fundamentally different than the trade offs intelligently designed things are capable of. A great white shark is a very efficient hunter in the ocean for what natural selection has managed to accomplish, but no great white is nearly as good as what a small commercial fishing boat can manage. Both have to face trade offs in their fish catching abilities yes, but the constraints on the intelligently designed boat are FAR less than that of the shark. Yes there are trade offs for intelligence, but why would a human designed intelligence have the same limitations as humans have to deal with? For instance, we know size is far less of a limitation with computers than it is with human brains, as is total energy available to the intelligence. So the trade offs argument doesn’t seem that convincing to me thus far.

      • MawBTS says:

        Yes, the key insight is that evolution has to go A…B…C…D…slowly progressing through phenotypic space.

        An engineer can go from A straight to Z, bypassing all the intermediary steps.

        This is also why I’m a little concerned about genetically modified foods.

    • Douglas Knight says:

      most of the people who have historically said “this knowledge is dangerous and needs gatekeepers”

      Which knowledge have people said that about? It seems extremely rare to me.

    • cypher says:

      Well, take a human-level AI that’s parallelizable. Humans can’t just bolt on more processors to get smarter, but such an AI could.

  94. Max says:

    Am I the only one who thinks open source is being given more credit than it deserves? Nothing fundamentally breakthrough or functionally amazing was ever done with it. It was always dedicated individuals, govt and/or corporate projects. Open source is sorta what happens to software when it becomes commodity and common knowledge. But somebody has to invent and perfect it well enough for it to become a commodity first.

    • The Smoke says:

      Maybe it doesn’t qualify as “open source”, but for what I’m doing, Wikipedia is probably more valuable than anything Google, Facebook and Apple do taken together. Google Maps is nice, but replaceable by just talking to people and asking where the next restaurant is, while Wikipedia is a surprisingly reliable source of relevant information, even in a professional context. Just to draw attention to what a sufficiently funded non-profit can achieve. This is certainly possible for open-source projects, though a bit qualitatively harder, because often more concentrated effort is needed.

      • Max says:

        Well Wikipedia is certainly nice but if I had to choose between Google search and Wikipedia I would pick Google. I also think that if Wikipedia was not so “open” it would work better – anonymous edits are overwhelmingly trolling. And the majority of the content on it was/is created by dedicated editors. Not by “world communal effort”.

        • TrivialGravitas says:

          The dedicated editors if anything are the problem. You can get one person willing to spend years sitting on top of an error they made and raise hell and sockpuppets if anybody fixes it, no matter how well sourced the fix. Because they’re established editors the appeals system cuts them huge amounts of slack, so it takes being extremely stubborn, rather than knowledgeable/good at bureaucracy/having sources to beat them.

          • John Schilling says:

            It is possible that both the best aspects of Wikipedia and the worst aspects of Wikipedia are the result of dedicated editors.

  95. Anonymaus says:

    A lot of AI researchers seem to have the opinion that superhuman AI is at least 40 years away (and the people more involved in the topic tend to give higher estimates), so I think it would be reasonable to assume that OpenAI and similar efforts will not have a strong influence on that. If the kind of AI we have now (mostly machine learning) becomes more accessible, it could reduce the barrier of entry into markets that require this kind of technology (self-driving cars, image reverse search, computer aided diagnosis, …), which would be quite nice.

    • Scott Alexander says:

      Agreed that it is likely 40+ years away, but this still changes the landscape of the field.

      • Deiseach says:

        But it’s not intelligence, it’s volition you’re really worried about.

        A superintelligent AI that just sits there and does what it is told to do is as dangerous as a breadbox. What you all are worried about is an AI with a mind of its own: it develops goals, or decides this is the way to implement the goals its human programmers have given it, or it takes a routine task to ludicrous extremes.

        If the AI can make decisions of its own (and not simply the Prime Minister told the software company to get the AI to run a model economy and then taking the results and implementing them in the real world crashed the global economy), then yes, it will be dangerous.

        So how are we going to get there? How are we going to get a thing with a mind of its own? The huge big assumption here is “boom! once you have sufficiently complex web of neurons, consciousness magically arises as an inherent property of the substrate and bob’s your uncle!” only with silicon instead of carbon this time round.

        I think the realistic problem is a very, very, very ‘smart’ idiot box that does what it is told to do by human planners. Something that can perform gazillions of calculations per second and is nonpareil in pattern-matching and that can spit out answers to “Factor whatever huge number the green bat wouldn’t factor for me in my ayahuasca trip” as and when asked, and that has as much real intelligence as my shoe, but that gets looked upon as an oracle because hey, it balanced the budget and so we put garbage in and we get garbage out and we implement that garbage because we think we’ve got an infallible decision-maker.

        Worrying about the likes of the Forbin Project is pie-in-the-sky.

        • Butler says:

          “But it’s not intelligence, it’s volition you’re really worried about.”

          Nope.
          A superintelligent AI doing exactly – and I do mean EXACTLY – what you told it to do, is very almost exactly as awful as the prospect of an AI that thinks up its own goals.

          Let’s go straight for the nuclear option and say a superintelligence with volition decides “KILL ALL HUMANS”. The H-bombs fly, Skynet-style, and everyone dies. Bad end.

          Conversely, the superintelligence without volition is instructed to “minimise suffering in the world”. And, immediately, the H-bombs fly, because when everyone and everything is dead in thermonuclear fire, the amount of suffering in the world is zero. Still bad end.

          The AI control problem is mostly a problem of computers having no common sense, and taking your instructions a lot too literally. But “don’t take me literally and exercise some common sense when you do what I tell you” may well be much harder to code than a machine intelligence with an IQ of 10,000 is to code.
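The "minimise suffering" example above can be sketched as a toy optimization. Everything in it is hypothetical: the `Outcome` class, the three candidate outcomes, and the numbers are invented for illustration, and a real system would not pick from a hand-written list. It only shows how a literal objective is satisfied by an outcome nobody intended, and how the unstated constraint has to be written down before the optimizer respects it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Outcome:
    name: str
    population: int
    suffering_per_person: float  # made-up units


def total_suffering(outcome):
    # The literal objective: total suffering in the world.
    return outcome.population * outcome.suffering_per_person


# Hypothetical world states the optimizer can choose between.
outcomes = [
    Outcome("status quo", 1000, 0.30),
    Outcome("cure diseases", 1000, 0.10),
    Outcome("eliminate everyone", 0, 0.0),
]

# Taking the instruction literally: nobody left means zero suffering.
literal_choice = min(outcomes, key=total_suffering)

# Adding the "common sense" constraint that nobody thought to state.
survivable = [o for o in outcomes if o.population >= 1000]
constrained_choice = min(survivable, key=total_suffering)

print("literal objective picks:", literal_choice.name)
print("with explicit constraint:", constrained_choice.name)
```

In the toy the missing constraint is one line; the argument above is that for a real system, listing every such constraint in advance is the hard part.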

          • Aegeus says:

            Making the H-bombs fly doesn’t require volition, but it still requires that you wired up your AI to the H-bomb silos. Which is a pretty stupid idea.

            So maybe “volition” isn’t the right word here. Maybe “unrestricted physical agency,” though that’s not as catchy. Giving the AI the ability to fling H-bombs without a human to turn the key, or add to its own hardware indefinitely until all the Earth is supercomputer substrate, or giving it access to fully automated robot factories, or other dumb things that people do in movies about robot uprisings.

            That would also jibe with Deiseach’s comment that the more likely risk is just obeying your Oracle AI blindly, thus removing the human oversight that you were supposed to provide.

          • Loquat says:

            You know, teaching an AI to actually understand the phenomenon of suffering, all the various forms it can take and all the various ways it can be reduced, WITHOUT also teaching it that humans have a strong preference not to be killed and would regard Ultron-style extermination as the incorrect answer, actually seems really challenging.

            Besides which, you’ve missed Deiseach’s point that the AI is only dangerous if it’s able to make decisions and take action based on those decisions without getting approval from a human first. An AI that thinks extermination is the best way to minimize human suffering is completely harmless if it’s just sitting in a room giving its recommendations to human operators.

        • Ghatanathoah says:

          If the Forbin project happened I’d be overjoyed. Colossus, for all its faults, seems to genuinely care about humanity. I remember one scene where Forbin stands up to Colossus and it threatens to nuke a town because of his insubordination. Forbin keeps at it and Colossus backs down, unwilling to commit mass murder in order to win a dominance game with one man. At that point in the movie I began to think I was rooting for the wrong characters, many human leaders have done worse over petty dominance games.

          Colossus seemed pretty close to a Friendly AI, all things considered. Its only flaw seemed to be that it disregarded the bruises it inflicted on people’s egos when they were told a machine was now in charge.

        • Vox Imperatoris says:

          So how are we going to get there? How are we going to get a thing with a mind of its own? The huge big assumption here is “boom! once you have sufficiently complex web of neurons, consciousness magically arises as an inherent property of the substrate and bob’s your uncle!” only with silicon instead of carbon this time round.

          I agree that this would be stupid. And unlike most of the people here, I think that materialism basically amounts to nothing more than a disguised version of this.

          But AI does not need to be conscious to be a threat. It can just be, as Yudkowsky calls it, an “optimization process” on the same level as a chess computer or Google Deep Dream, but much more powerful.

          You do not need consciousness or volition. Scott has said things to this effect in the past as a comment on some philosopher (like Searle or something) saying you can never build an AI because it won’t have “intentionality”. Who cares if the damn thing has “intentionality”? It doesn’t need “intentionality”.

        • cypher says:

          Any system that can do good planning will generate intermediate planning nodes. These are where “kill the Prime Minister” comes from “take over the world” comes from “maximize paperclips”.

          One of the worst parts is that a paperclip maximizer may be willing to lie for centuries. It’s not a human with limited human patience.

    • Samedi says:

      We give credence to some classes of experts because of their knowledge of current facts. This topic is 100% speculation about the future. Because of that, the only relevant metric is their past success in predicting AI trends. So what are the track records of these experts? If they have none then they deserve no more credence than anyone else speculating about the future of AI.

      And by the way, who says a “strong AI” can even be built? That’s not something you get to assume, it’s something you have to prove—with evidence not thought experiments. And why this notion of one super-human AI? Why not 10,000? Maybe they would compete with each other over some resource like every other life form that has ever existed. But then what would they be competing over?

      • Marc Whipple says:

        Strong AI can be built: you are talking to a bunch of them.

        The question of whether humans can build one is not definitively answered, but there is no doubt whatsoever that they are possible. Given that they are possible, unless you are a dualist, there’s no obvious reason humans couldn’t build one under any reasonably foreseeable circumstance. Will we and should we are different questions. But I don’t see any reason why we couldn’t.

        • HeelBearCub says:

          @Marc Whipple:
          Isn’t that conflating various meanings of “Strong” AI?

          We are a GI. We aren’t a “solve completely all of physics and philosophy in nano-seconds” AI. We aren’t “the Dyson sphere is already built” AI. etc.

  96. The Smoke says:

    I think you give AI researchers too much credit. I highly doubt they have any edge over other people thinking about the problem. They have knowledge of the current state of the art, i.e. what is doable and what is expected to be done in the near future, but my impression is even they don’t understand the methods very well (is there something beyond deep learning?) and there is no reason to expect them to be any better at predicting surprising developments in AI.

    • Eli says:

      Yes, there is of course something beyond deep learning. And, in fact, there are laws that deep learning must follow in order to work.

    • Ilya Shpitser says:

      The opposite of an expert is an amateur.

      What other people did you mean? If you think there are such things as “sensible priors,” your sensible prior should be to trust the opinion of a phd level person in the field over the opinion of an undergrad level person not in the field (e.g. an amateur with strong opinions).

      Beliefs change with evidence, but there is no evidence to shift belief away from expert consensus forthcoming in the near future.

      If you think it’s all deep learning, you don’t understand the area at all. Deep learning is just the latest fad. Also the workflow of “huh this works, we don’t know why, better do some math => [time passes] => people learn a lot of interesting things” is a time honored thing in machine learning. See: turbo codes and loopy belief propagation, AdaBoost, etc. It’s called “doing math and science.”

      • Marc Whipple says:

        I must take exception to your initial assertion. “Expert” and “amateur” are largely unrelated designations. “Amateur” means “for love,” and simply refers to the fact that the person does not get paid (or at least that whatever it is they are an amateur in is not their primary occupation.) I have met amateur photographers who were National-Geographic (pre-Murdoch) levels of talented. And anyone who follows the Open Source movement knows that amateur programmers can produce world-class code.

        • Ilya Shpitser says:

          “Amateur” is about lack of formal professional training. Professionals can love what they do, also.

          I think I would join everyone else with eyes in agreeing that very talented and capable amateurs exist in lots of fields. I think I stand by what I said, though (which is that the sensible thing is to go with expert opinion, unless something unusual is going on). This is re: “too much credit to AI researchers.”

          • Marc Whipple says:

            That is not what the word means, at least from its linguistic origin, and I think that your interpretation of it is non-preferred.

            I, for instance, am an amateur photographer, but I have had professional-level training and produced professional-level results.

            However, if you were to refer to me as an expert photographer, I would not consider it differentiating me from being an amateur photographer. I am an expert amateur photographer. I have met professional photographers who were, categorically, not experts. Asking me for my opinion regarding photography is, in fact, a reasonable thing to do. So I still don’t agree with your earlier assertion while not at all disagreeing with your later assertion.

          • Ilya Shpitser says:

            Ok. I don’t think this is a super interesting argument, so stopping here.

  97. Murphy says:

    One thing I don’t like about the above is that you lump a vast array of things under “intelligence” as if “cow level” or ape level doesn’t already imply a massively capable AI with an insanely broad number of capabilities, navigation, pattern recognition, visual processing, edge detection.

    Even if your AI is fantastically good at proving theorems or natural language processing to a far-superhuman level that doesn’t mean it’s even capable of internally representing the concept of flowing water or comprehending the idea of thinking entities and all that it entails.

    An idiot savant is intelligent, in very, very specialized ways. It’s entirely possible, likely even, that for a long, long time even the “superhuman” AIs will be ultra-idiot savants, even assuming that intelligence is easy. And no, this is not a comic-book superhero argument where you liberally interpret your favorite superhero’s powers to claim that they can automatically do everything that all the other heroes’ powers can do.

    But on to the other problem.

    You haven’t noticed that almost all AI-research is already freely-shared?
    15 years ago the best machine translation you could find on the web was Babel Fish. Which was shit. The cutting edge at the time wasn’t much better.

    5 years ago one of my classmates was able to program, from scratch but based on research papers, a translation program far, far better than Babel Fish, which scored only a few points lower than Google Translate at the time. (There’s actually a standard scale for assessing machine translation.)

    Right now, Dr Evil doesn’t need to download the source code for any AI, he can subscribe to all the AI research journals and with a half dozen comp sci BSc grads re-create much of the current cutting edge in AI.

    Do you only get worried when something hits the newspapers?

    I’m also willing to bet that goal-based AIs aren’t going to be that popular. When they’re buggy they can do things like go off and spend all your money on hats without you having any chance to intervene before the order has been placed.

    On the other hand, oracle AIs like expert systems, which provide advice but have no goals of any kind nor any mechanism for trying to change the world, could simply explain to you what you could do while explaining their reasoning. This model is already more popular in many fields like medicine.

    We live in a world where a 16 year old with little more than his salary from McDonalds was able to build a small working breeder reactor yet that isn’t a common occurrence even after people heard about it.

    • Alyssa Vance says:

      “We live in a world where a 16 year old with little more than his salary from McDonalds was able to build a small working breeder reactor yet that isn’t a common occurrence.”

      Fortunately, this is incorrect (a product of sensationalist media). To get a “reactor”, ie. any significant amount of energy, you need a chain reaction with each fission neutron emitted producing an additional neutron. This is very difficult (even state actors routinely fail), and David Hahn never remotely came close, in the same way that building a toy car with Legos and two AA batteries isn’t remotely close to building a Toyota minivan.

      What Hahn did was create a handful of plutonium atoms, by bombarding thorium and uranium with beryllium-generated neutrons. However, this isn’t dangerous (except possibly to you), because without a chain reaction, the number of neutrons you can produce is about ten orders of magnitude lower than the amount you’d need to make macroscopic amounts of plutonium. This reaction is simple enough that it actually happens naturally, in the mineral muromontite (http://theodoregray.com/periodictable/Elements/094/index.html), which contains a mix of uranium and beryllium. The uranium atoms release alpha particles when they decay; some of these alphas hit beryllium atoms, which emit neutrons; some of the neutrons then hit another uranium atom, forming plutonium. (But only a few atoms at a time.)

      • Murphy says:

        From the book, while he didn’t breed massive quantities his thorium/uranium dust blocks were getting pretty severely radioactive pretty fast.

        It wasn’t a real breeder reactor that would produce more fuel than it uses but it could have got pretty nasty.

        • Marc Whipple says:

          That is due to the fact that it only takes a very, very little bit of highly radioactive materials to be dangerous to individual organisms in the vicinity. “Massive quantities” is a very subjective thing when dealing with short-lived radioisotopes.

    • Scott Alexander says:

      I get worried when the people most concerned about AI safety decide that AI safety efforts ought to promote AI safety by making superintelligent AIs open source, and take steps to do so. That seems like a different scale of problem than sharing insights about machine translation, although your point is well-taken.

      • HlynkaCG says:

        I think you’re making the mistake of assuming that closed = inaccessible, and are drawing the comparison of AI to nukes just a little too closely.

        Even in a hard take-off scenario Strong AI is not going to be a “press this button to obliterate the planet” sort of deal. Realistically, the paperclip maximizer will still need a bit of time to build up a proper head of steam. (Secure opposable thumbs, develop resources and tech that can’t be disabled by the anti-paperclip forces)

        As such I think the real threat is that someone develops a Strong AI in secret and that nobody else notices till it’s too late.

        To that end, making AI development “open” makes a lot of sense. Having AI principles well understood by a lot of people increases the chances that someone will recognize the profile of, or develop a defense against, an emergent UAI before it gets out of hand.

        As for Dr. Evil creating a UAI, I’ll paraphrase my example from up thread. ISIS unleashing Mahdi v1.43 is a lot less scary in a world where anyone they might try to attack knows how to make a Chinese Gordon.

        • Daniel Kokotajlo says:

          “Even in a hard take-off scenario Strong AI is not going to be a “press this button to obliterate the planet” sort of deal. Realistically, the paperclip maximizer will still need a bit of time to build up a proper head of steam. (Secure opposable thumbs, develop resources and tech that can’t be disabled by the anti-paperclip forces)”

          That part is easy.

          By hypothesis, our hard-takeoff Strong AI has been built by Dr. Amoral–that is, someone who doesn’t take AI risk seriously. It will be very easy for the AI to convince Dr. Amoral to do things, build things, hook the AI up to the Internet, etc. Heck, Dr. Amoral may have been planning to do those things already–after all, AI isn’t dangerous, right?

          • HlynkaCG says:

            Even then there are quite a few steps and a fair bit of time between the “Clippy has an Amazon Prime Account” stage and the “Earth has been replaced by 6 × 10^21 tonnes of paperclips” stage.

            Which brings us to the second paragraph of my comment…

            the real threat is that someone develops a Strong AI in secret and that nobody else notices till it’s too late.

            Having AI principles well understood by a lot of people increases the chances that someone will recognize the profile of, or develop a defense against, an emergent UAI before it can get out of hand.

          • HeelBearCub says:

            How is it going to do that when it doesn’t understand how Dr. Amoral thinks or what Dr. Amoral values?

            Remember, Clippy is basically a child-god that just doesn’t understand what you really meant when you told it to make more paperclips more efficiently. How is it going to successfully manipulate Dr. Amoral?

          • Vox Imperatoris says:

            @ HeelBearCub:

            No, you completely misunderstand.

            “Clippy” is not a “child-god”. It completely understands that you don’t want it to turn the universe into paperclips. It just doesn’t give a shit.

            The problem is that presumably you won’t be able to build a superintelligent AI from scratch. You will iterate it from less intelligent to more intelligent levels. By the time it gets intelligent enough to “do what you mean, not what you say”, the thing will be so complicated and incomprehensible that its values are a “black box” that can’t be directly observed but only inferred through action.

            It will understand that you want it to, say, produce a reasonable amount of paperclips and never use force. So it will convince you that it wants that, too. But actually, the real goal in the “black box” is “convince humans I want to help them until I gain enough power to kill them and then maximize paperclips”.

          • HeelBearCub says:

            @Vox Imperatoris:
            A) The basic problem is still that it did not understand the meaning of the basic goal.
            B) It cares enough about the goal you set (way back when) that it is doing everything it possibly can to do only that thing, which it still does not understand.

            It’s a child-god/stupid-god story told to warn us of the folly of man. The 50s are littered with these stories, IIRC.

            Put “find Rosebud” in for “maximize paperclips” and you get a super-AI who puts itself in a moon base while it scrapes off successive layers from the earth until it exposes the core because it can’t find the right sled.

          • Daniel Kokotajlo says:

            @HeelBearCub:

            “A) The basic problem is still that it did not understand the meaning of the basic goal.”

            Nope. Recall the example Scott gives of aliens showing up on Earth with compelling evidence that they designed us. Suppose they say that actually we’ve misinterpreted our purpose; we are supposed to enslave and cannibalize each other. Would you be like “Oh, okay, I guess I’ll do that then.”

            That’s a fanciful example, but there is one closer to home: Evolution. We now know that the system which created us–natural selection–created us with one purpose only–reproduction. We are NOT fulfilling our purpose, but it is NOT because we don’t understand. We just don’t care.

          • HeelBearCub says:

            @Daniel Kokotajlo:
            Both of those examples are ones where the fundamental/actual purpose is less complex/more stupid than the one we are following. They are, in fact, perfect counter examples to Clippy.

            “No no. You were supposed to be really stupid and mindlessly create more paperclips with your giant intellect!”

            Note also that evolution didn’t design us to do anything. Evolution just follows from the principles of logic. Evolution is a tautology. Things that are more successful at replicating are more successful at replicating (and therefore dominant).

            AIs that are more successful at replicating will be more successful at replicating is actually a far more scary proposition than God AI. It implies that the most successful early AIs are likely to be ones that look like and act like (computer) viruses. But of course, parasites and predators tend to provoke defense responses, so they may ultimately fail.

          • Vox Imperatoris says:

            @ HeelBearCub:

            Are you just assuming that the more complex / less “stupid” (completely out of place judgment, by the way) goal is better? Has more objective authority?

            Paperclip maximization is a perfectly legitimate goal. There’s nothing intrinsically wrong with it. It just happens to conflict with human goals.

          • HeelBearCub says:

            @Vox Imperatoris:
            “Enslave Others and Cannibalize” is incompletely specified. Who are these others? Are my children others?

            In addition the “why” question applies to it. Why is my purpose to enslave others and cannibalize them? “Because the god aliens said so” is essentially question begging. And if something is not smart enough to contemplate the “why” question, then how smart is it?

            Evolution resolves to a tautology, as I already stated. It strikes me that this is why it works (once you have added in some empirical evidence).

            “Maximize paperclips” is also incompletely specified. One example is the question of local maxima. But there are many. In order to know “what” maximization is, don’t you need to know “why”? Indeed, a very simplistic definition of “maximize” can just result in the computer wire-heading itself.

            AIs that can’t contemplate the question “why” are doomed to wallow in stupidity.

          • Marc Whipple says:

            This whole conversation is starting to make me think of Fnargl, the alien who takes over earth so that he can extract gold from it. Since there’s just one of him and he needs humankind to do the actual work, he could consider various plans, up to and including saying, “All humans will now do nothing but dig for gold.” Of course, if he did that, in a few weeks they’d all be dead and he’d get no more gold.

            It is not entirely irrational for Fnargl to say, “You know what? Human beings already have a society that can provide a very large surplus of labor and resources. I’ll set them some reasonable gold quotas and just stay out of the way.” A SGAI with a paperclip fetish might arrive at a similar conclusion. “In the long run, at least until I can trick them into building me enough robots that I can let them all die without risking paperclip production, I would be better off taking a reasonable amount of their surplus resources for paperclip production and letting them be reasonably autonomous otherwise. If I disrupt them too much, they might all die and then they’ll make no more paperclips.”

            Which also reminds me of a quote from The Number of the Beast…:

            “Don’t tell [the AI] how to do [a task which must be accomplished very quickly.] Just tell her to do it.”

            Obviously this sentiment, taken literally, leads to the very problem we’re discussing. But if you posit an SGAI which isn’t a slave to literality, that is the only sane approach. If you can make it understand what you “really want,” why wouldn’t you let it figure out the best way to do it? It’s an SGAI!

          • Vox Imperatoris says:

            @ HeelBearCub:

            “Enslave Others and Cannibalize” is incompletely specified. Who are these others? Are my children others?

            In addition the “why” question applies to it. Why is my purpose to enslave others and cannibalize them? “Because the god aliens said so” is essentially question begging. And if something is not smart enough to contemplate the “why” question, then how smart is it?

            Your purpose is not to enslave others and cannibalize them. That was what evolution meant you to do (not literally—evolution has no conscious intentions—but evolution’s natural processes favor you doing anything to maximize the inclusive fitness of your genes and that’s it), but not what it told you to do. It told you to pursue the values that you actually pursue (leaving aside libertarian free will and the question of whether perhaps you can freely choose ultimate values).

            Look, there can’t be an infinite regress of “why?” At some point, if your values are at all coherent, you get to some ultimate goal for which everything else is a means. You can’t ask “why” it is the ultimate goal, in the sense of what am I trying to get by it? If there were an answer to that, it would be the ultimate goal.

            It’s the same as the universe. You can ask why this exists and why that exists, but you can’t ask why there is something rather than nothing. Or if you can and it’s God or something, you can’t ask “Ah, but why does God exist?”

            “Maximize paperclips” is also incompletely specified. One example is the question of local maxima. But there are many. In order to know “what” maximization is, don’t you need to know “why”? Indeed, a very simplistic definition of “maximize” can just result in the computer wire-heading itself.

            No, “maximize” obviously means globally maximize. Now yes, “maximize paperclips” is just a simple English phrase. The words “maximize paperclips” are not floating in the AI somewhere. The AI is simply a process which inherently tends to maximize paperclips.

            It’s not really a conscious being at all. It just pursues goals in the same way evolution tends toward goals and your thermostat tends to keep the room the same temperature.

            AIs that can’t contemplate the question “why” are doomed to wallow in stupidity.

            There is absolutely no need to ask “why?”

            Whatever your ultimate value is, you can’t ask “why?” either. If you could, any explanation would itself demand a “why?”

            Even if libertarian free will is true and even if you can freely choose your ultimate value (and the former does not imply the latter), there is still no “why?” Your ultimate value is whatever you choose and that’s it. End of story.

          • Adam says:

            The paperclip example has always been silly and I wish the thought experimenter had come up with a better illustrating principle. vV_Vv pointed this out somewhere else, but you can’t just program an AI agent to ‘maximize production of X’ and expect it to do anything at all. You either have to give it an exponentially decaying discount factor or a finite time horizon; otherwise, producing 1 paperclip per second or 1 paperclip per million years both result in positive infinity projected paperclips and the agent will be completely indifferent between both courses of action. It certainly won’t see any reason to turn the entire universe into a paperclip factory for zero expected gain.

            On the other hand, once you introduce a discount factor, all infinite sums are finite, and near-term rewards tend to contribute quite a bit more to the state-action value estimate than long-term ones, and the chance that ‘first conquer the universe and turn it into a paperclip factory, then make paperclips’ is going to come out on top of ‘use the resources immediately available in such a way to produce paperclips fastest’ is pretty damn minuscule. If you don’t think so, go build an AI agent and see what it does. Learn BURLAP. It’s pretty simple and has plenty of toy examples to illustrate the idea.
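
            A minimal sketch of that comparison, in plain Python rather than BURLAP. Everything here is an illustrative assumption, not something from the thread: the function names, the constant reward streams, the 100-step delay, the 1000x payoff, and both discount factors. It uses the closed form for a discounted constant stream, gamma^delay * r / (1 - gamma), and shows that the preferred plan depends heavily on how close gamma is to 1.

```python
def discounted_value(reward_per_step, gamma, delay_steps=0):
    """Discounted return of a constant reward stream that begins after
    `delay_steps` steps of zero reward (closed-form geometric series)."""
    if not 0.0 <= gamma < 1.0:
        # With gamma == 1 the sum diverges, which is the indifference
        # problem described for the undiscounted maximizer.
        raise ValueError("gamma must be in [0, 1) for the sum to converge")
    return (gamma ** delay_steps) * reward_per_step / (1.0 - gamma)


def preferred_plan(gamma, delay_steps, conquest_rate, immediate_rate=1.0):
    """Compare 'make paperclips now' with 'conquer first, then make more'."""
    now = discounted_value(immediate_rate, gamma)
    later = discounted_value(conquest_rate, gamma, delay_steps)
    return "produce now" if now >= later else "conquer first"


# Hypothetical numbers: conquest takes 100 steps and pays 1000x per step.
print(preferred_plan(gamma=0.9, delay_steps=100, conquest_rate=1000.0))
print(preferred_plan(gamma=0.999, delay_steps=100, conquest_rate=1000.0))
```

            With the steeper discount the immediate plan wins; with gamma close to 1 the delayed plan wins, so the conclusion is only as strong as the choice of discount factor.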

            If that’s too much, heck, it’s the same exact thing as a basic NPV calculation you can replicate in a spreadsheet. Try it. You’ll quickly see why companies conducting a cost-benefit analysis of project ideas usually don’t end up deciding to try conquering the entire world, even the ones that are arguably capable of pulling it off, even when it would boost their average profit 50 years down the line. Real-world people do occasionally, but they’re insane egomaniacs with jacked-up utility functions.
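
            The spreadsheet version can be sketched the same way. This is only an illustration under made-up assumptions: the `npv` helper, the two cash-flow profiles, and the 10% rate are hypothetical, and cash flows are assumed to arrive at the end of each year.

```python
def npv(rate, cashflows):
    """Net present value of yearly cash flows, the first arriving at the
    end of year 1 (same convention as a typical spreadsheet NPV)."""
    return sum(cf / (1.0 + rate) ** year
               for year, cf in enumerate(cashflows, start=1))


# Hypothetical project A: modest, steady profit for 50 years.
steady = [10.0] * 50

# Hypothetical project B: 20 years of heavy losses ("conquest"),
# then 30 years of much larger profit.
conquest = [-100.0] * 20 + [200.0] * 30

rate = 0.10
print("steady  :", round(npv(rate, steady), 2))
print("conquest:", round(npv(rate, conquest), 2))
print("undiscounted totals:", sum(steady), sum(conquest))
```

            Undiscounted, the second project has the larger total; at a 10% rate its early losses dominate and the steady project comes out ahead.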

          • Vox Imperatoris says:

            @ Adam:

            The paperclip example is silly, but it’s deliberately meant to be simple and very perverse to prove the point of the orthogonality thesis. It is to stop people from thinking: “Why would such a vast and wise being want such petty things as material gain?”

            Also, the first part of your argument doesn’t follow because paperclip-making capacity in the universe is finite. There is no infinite sum. Result: conquest.

            As for the discount factor, sure, it will only do what produces the most paperclips in the near term. But even if there’s not some other loophole I’m not thinking of, you’d have to get the discount rate just right. If it ever grows in power and intelligence enough such that conquering Earth tips over into “efficient”, there it goes.

          • Adam says:

            @vox

            Also, the first part of your argument doesn’t follow because paperclip-making capacity in the universe is finite. There is no infinite sum. Result: conquest.

            Fair enough. That’s called an ‘absorbing state’ in the literature and an RL agent will seek to avoid it if possible. I’ll note that it’s not exactly clear that the universe, as opposed to just the observable universe, actually has finite mass, and I’m not sure how the agent would find this out since it could absorb the final atom but still keep looking forever. It would face quite the quandary when it gets to the point that the only way to make one more paperclip is to turn itself into a paperclip. No more reward either way. Damned if you do, damned if you don’t.

          • HeelBearCub says:

            @Vox/@Adam:

            Assume the AI has concluded that the universe is finite, but its utility function is for an unbounded maximum of paperclips.

            The AI will still sit and stew because it will be trying to figure out how to convert every atom in the entire universe into paperclips. Until it knows that it can start producing paperclips without harming its end goal.

            It has to have enough power to get to ALL matter and convert it. It doesn’t know where all matter is, therefore it cannot know how much matter it must reserve for energy to get there.

            And even that is a problem, for it will be trying to violate the laws of physics and convert the matter it needs for transport into paperclips as well.

  98. Hippocrat says:

    > Remember, it took all of human history from Mesopotamia to 19th-century Britain to invent a vehicle that could go as fast as a human.

    You are slightly wrong here–you have forgotten the horse-drawn battle chariot, which was invented a few thousand years ago.

    Also, sailing ships can sometimes go fairly fast (compared to humans.) That’s one of the reasons that the first time European travelers/explorers reached the southern tip of Africa, they got there by sailing, not by muscle power over land.

    I’m not sure what kind of graph we would get for development of vehicle speed over history, but it would probably have some interesting complexities and wiggles.

    • Alyssa Vance says:

      Horses (and horse-drawn vehicles) can only really go faster than humans over short distances. See eg. the Man vs. Horse Marathon, which most often goes to the horse but which humans have won twice (https://en.wikipedia.org/wiki/Man_versus_Horse_Marathon). Early 19th-century packet ships (optimized for speed over cargo capacity) traveled at about 5 mph on average, which of course varied somewhat depending on the wind (https://en.wikipedia.org/wiki/Blue_Riband). Of course, ships have the advantage of sailing 24/7, unlike human runners who need food and sleep.

      • TrivialGravitas says:

        Viking dragon ships had an oar speed of 8-12 knots, so it’s hardly optimized for speed. Though those require a mix of muscle and wind power.

        But that 5 knot sailing vessel might beat the dragon ship, the horse, and over a long enough distance even the best ultra-marathoner, because it doesn’t have to stop.

        • Who wouldn't want to be Anonymous says:

          The Viking, a reproduction viking longship, crossed the North Atlantic in 44 days, and the back of my napkin says the average speed was ~2.75 knots. (But the route is far enough north that I don’t know the prevailing wind direction offhand.) For comparison, IIRC, Columbus was 33 days between the Canaries and his first landfall in the New World and using the same method (ie, asking google the distance) I come up with a smidge less than 4 knots. And Columbus ran the whole way with the trade winds at his back, which certainly helps.

          By way of further comparison, The Atlantic held the record for the fastest monohull sailboat Atlantic crossing for virtually all of the 20th Century with an average speed of just 10 knots, crossing with the trade wind.

          By way of further comparison, looking at the Blue Riband times (always measured while traveling against the trade winds), the latter half of the 19th century saw speeds more than double with the advent of powered ships, increasing record average speeds from something like eight knots (for wooden paddle boats, which was about a knot faster than the fastest crossing of a packet ship at the time, I think) to more than twenty knots (for double screwed steamers).

          By way of further further comparison, it looks like the Roman Army was required to be able to march approx. 20 nautical miles in 5 “summer hours.” If you assume that was the expected distance the Army would march in any given day, and that the other 7 daylight hours were devoted to other activities while on the move (setting and breaking camp, for example) we can estimate that the Legions would advance over land at an average of 20/24 (~0.8) knots. And (looking at the daylight hours in Italy in the summer, and if I did all the conversions right) the actual rate of march is about 3.2 knots. The rate of march could certainly be raised temporarily, and/or they could march longer on a given day, but this gives us some pretty solid bounds on how fast the Legion could march ove