AI Researchers On AI Risk

I first became interested in AI risk back around 2007. At the time, most people’s response to the topic was “Haha, come back when anyone believes this besides random Internet crackpots.”

Over the next few years, a series of extremely bright and influential figures including Bill Gates, Stephen Hawking, and Elon Musk publicly announced they were concerned about AI risk, along with hundreds of other intellectuals, from Oxford philosophers to MIT cosmologists to Silicon Valley tech investors. So we came back.

Then the response changed to “Sure, a couple of random academics and businesspeople might believe this stuff, but never real experts in the field who know what’s going on.”

Thus pieces like Popular Science’s Bill Gates Fears AI, But AI Researchers Know Better:

When you talk to A.I. researchers—again, genuine A.I. researchers, people who grapple with making systems that work at all, much less work too well—they are not worried about superintelligence sneaking up on them, now or in the future. Contrary to the spooky stories that Musk seems intent on telling, A.I. researchers aren’t frantically installing firewalled summoning chambers and self-destruct countdowns.

And The Case Against Killer Robots From A Guy Actually Building AI:

Andrew Ng builds artificial intelligence systems for a living. He taught AI at Stanford, built AI at Google, and then moved to the Chinese search engine giant, Baidu, to continue his work at the forefront of applying artificial intelligence to real-world problems. So when he hears people like Elon Musk or Stephen Hawking—people who are not intimately familiar with today’s technologies—talking about the wild potential for artificial intelligence to, say, wipe out the human race, you can practically hear him facepalming.

And now Ramez Naam of Marginal Revolution is trying the same thing with What Do AI Researchers Think Of The Risk Of AI?:

Elon Musk, Stephen Hawking, and Bill Gates have recently expressed concern that development of AI could lead to a ‘killer AI’ scenario, and potentially to the extinction of humanity. None of them are AI researchers or have worked substantially with AI that I know of. What do actual AI researchers think of the risks of AI?

It quotes the same couple of cherry-picked AI researchers as all the other stories – Andrew Ng, Yann LeCun, etc – then stops without mentioning whether there are alternate opinions.

There are. AI researchers, including some of the leaders in the field, have been instrumental in raising issues about AI risk and superintelligence from the very beginning. I want to start by listing some of these people, as kind of a counter-list to Naam’s, then go into why I don’t think this is a “controversy” in the classical sense that dueling lists of luminaries might lead you to expect.

The criteria for my list: I’m only mentioning the most prestigious researchers, either full professors at good schools with lots of highly-cited papers, or else very well-respected scientists in industry working at big companies with good track records. They have to be involved in AI and machine learning. They have to have multiple strong statements supporting some kind of view about a near-term singularity and/or extreme risk from superintelligent AI. Some will have written papers or books about it; others will have just gone on the record saying they think it’s important and worthy of further study.

If anyone disagrees with the inclusion of a figure here, or knows someone important I forgot, let me know and I’ll make the appropriate changes:

* * * * * * * * * *

Stuart Russell (wiki) is Professor of Computer Science at Berkeley, winner of the IJCAI Computers And Thought Award, Fellow of the Association for Computing Machinery, Fellow of the American Association for the Advancement of Science, Director of the Center for Intelligent Systems, Blaise Pascal Chair in Paris, etc, etc. He is the co-author of Artificial Intelligence: A Modern Approach, the classic textbook in the field used by 1200 universities around the world. On his website, he writes:

The field [of AI] has operated for over 50 years on one simple assumption: the more intelligent, the better. To this must be conjoined an overriding concern for the benefit of humanity. The argument is very simple:

1. AI is likely to succeed.
2. Unconstrained success brings huge risks and huge benefits.
3. What can we do now to improve the chances of reaping the benefits and avoiding the risks?

Some organizations are already considering these questions, including the Future of Humanity Institute at Oxford, the Centre for the Study of Existential Risk at Cambridge, the Machine Intelligence Research Institute in Berkeley, and the Future of Life Institute at Harvard/MIT. I serve on the Advisory Boards of CSER and FLI.

Just as nuclear fusion researchers consider the problem of containment of fusion reactions as one of the primary problems of their field, it seems inevitable that issues of control and safety will become central to AI as the field matures. The research questions are beginning to be formulated and range from highly technical (foundational issues of rationality and utility, provable properties of agents, etc.) to broadly philosophical.

He makes a similar point elsewhere, writing:

As Steve Omohundro, Nick Bostrom, and others have explained, the combination of value misalignment with increasingly capable decision-making systems can lead to problems—perhaps even species-ending problems if the machines are more capable than humans. Some have argued that there is no conceivable risk to humanity for centuries to come, perhaps forgetting that the interval of time between Rutherford’s confident assertion that atomic energy would never be feasibly extracted and Szilárd’s invention of the neutron-induced nuclear chain reaction was less than twenty-four hours.

He has also tried to serve as an ambassador about these issues to other academics in the field, writing:

What I’m finding is that senior people in the field who have never publicly evinced any concern before are privately thinking that we do need to take this issue very seriously, and the sooner we take it seriously the better.

David McAllester (wiki) is professor and Chief Academic Officer at the U Chicago-affiliated Toyota Technological Institute, and formerly served on the faculty of MIT and Cornell. He is a fellow of the American Association for Artificial Intelligence, has authored over a hundred publications, has done research in machine learning, programming language theory, automated reasoning, AI planning, and computational linguistics, and was a major influence on the algorithms for famous chess computer Deep Blue. According to an article in the Pittsburgh Tribune Review:

Chicago professor David McAllester believes it is inevitable that fully automated intelligent machines will be able to design and build smarter, better versions of themselves, an event known as the Singularity. The Singularity would enable machines to become infinitely intelligent, and would pose an ‘incredibly dangerous scenario’, he says.

On his personal blog Machine Thoughts, he writes:

Most computer science academics dismiss any talk of real success in artificial intelligence. I think that a more rational position is that no one can really predict when human level AI will be achieved. John McCarthy once told me that when people ask him when human level AI will be achieved he says between five and five hundred years from now. McCarthy was a smart man. Given the uncertainties surrounding AI, it seems prudent to consider the issue of friendly AI…

The early stages of artificial general intelligence (AGI) will be safe. However, the early stages of AGI will provide an excellent test bed for the servant mission or other approaches to friendly AI. An experimental approach has also been promoted by Ben Goertzel in a nice blog post on friendly AI. If there is a coming era of safe (not too intelligent) AGI then we will have time to think further about later more dangerous eras.

He attended the AAAI Panel On Long-Term AI Futures, where he chaired the panel on Long-Term Control and was described as saying:

McAllester chatted with me about the upcoming ‘Singularity’, the event where computers out think humans. He wouldn’t commit to a date for the singularity but said it could happen in the next couple of decades and will definitely happen eventually. Here are some of McAllester’s views on the Singularity. There will be two milestones: Operational Sentience, when we can easily converse with computers, and the AI Chain Reaction, when a computer can bootstrap itself to a better self and repeat. We’ll notice the first milestone in automated help systems that will genuinely be helpful. Later on computers will actually be fun to talk to. The point where computer can do anything humans can do will require the second milestone.

Hans Moravec (wiki) is a former professor at the Robotics Institute of Carnegie Mellon University, namesake of Moravec’s Paradox, and founder of the SeeGrid Corporation for industrial robotic visual systems. His Sensor Fusion in Certainty Grids for Mobile Robots has been cited over a thousand times, and he was invited to write the Encyclopedia Britannica article on robotics back when encyclopedia articles were written by the world expert in a field rather than by hundreds of anonymous Internet commenters.

He is also the author of Robot: Mere Machine to Transcendent Mind, which Amazon describes as:

“In this compelling book, Hans Moravec predicts machines will attain human levels of intelligence by the year 2040, and that by 2050, they will surpass us. But even though Moravec predicts the end of the domination by human beings, his is not a bleak vision. Far from railing against a future in which machines rule the world, Moravec embraces it, taking the startling view that intelligent robots will actually be our evolutionary heirs.” Moravec goes further and states that by the end of this process “the immensities of cyberspace will be teeming with unhuman superminds, engaged in affairs that are to human concerns as ours are to those of bacteria”.

Shane Legg is co-founder of DeepMind Technologies (wiki), an AI startup that was bought by Google in 2014 for about $500 million. He earned his PhD at the Dalle Molle Institute for Artificial Intelligence in Switzerland and also worked at the Gatsby Computational Neuroscience Unit in London. His dissertation Machine Superintelligence concludes:

If there is ever to be something approaching absolute power, a superintelligent machine would come close. By definition, it would be capable of achieving a vast range of goals in a wide range of environments. If we carefully prepare for this possibility in advance, not only might we avert disaster, we might bring about an age of prosperity unlike anything seen before.

In a later interview, he states:

AI is now where the internet was in 1988. Demand for machine learning skills is quite strong in specialist applications (search companies like Google, hedge funds and bio-informatics) and is growing every year. I expect this to become noticeable in the mainstream around the middle of the next decade. I expect a boom in AI around 2020 followed by a decade of rapid progress, possibly after a market correction. Human level AI will be passed in the mid 2020’s, though many people won’t accept that this has happened. After this point the risks associated with advanced AI will start to become practically important…I don’t know about a “singularity”, but I do expect things to get really crazy at some point after human level AGI has been created. That is, some time from 2025 to 2040.

He and his co-founders Demis Hassabis and Mustafa Suleyman have signed the Future of Life Institute petition on AI risks, and one of their conditions for joining Google was that the company agree to set up an AI Ethics Board to investigate these issues.

Steve Omohundro (wiki) is a former Professor of Computer Science at University of Illinois, founder of the Vision and Learning Group and the Center for Complex Systems Research, and inventor of various important advances in machine learning and machine vision. His work includes lip-reading robots, the StarLisp parallel programming language, and geometric learning algorithms. He currently runs Self-Aware Systems, “a think-tank working to ensure that intelligent technologies are beneficial for humanity”. His paper Basic AI Drives helped launch the field of machine ethics by pointing out that superintelligent systems will converge upon certain potentially dangerous goals. He writes:

We have shown that all advanced AI systems are likely to exhibit a number of basic drives. It is essential that we understand these drives in order to build technology that enables a positive future for humanity. Yudkowsky has called for the creation of ‘friendly AI’. To do this, we must develop the science underlying ‘utility engineering’, which will enable us to design utility functions that will give rise to the consequences we desire…The rapid pace of technological progress suggests that these issues may become of critical importance soon.

See also his section here on “Rational AI For The Greater Good”.

Murray Shanahan (site) earned his PhD in Computer Science from Cambridge and is now Professor of Cognitive Robotics at Imperial College London. He has published papers in areas including robotics, logic, dynamic systems, computational neuroscience, and philosophy of mind. He is currently writing a book The Technological Singularity which will be published in August; Amazon’s blurb says:

Shanahan describes technological advances in AI, both biologically inspired and engineered from scratch. Once human-level AI — theoretically possible, but difficult to accomplish — has been achieved, he explains, the transition to superintelligent AI could be very rapid. Shanahan considers what the existence of superintelligent machines could mean for such matters as personhood, responsibility, rights, and identity. Some superhuman AI agents might be created to benefit humankind; some might go rogue. (Is Siri the template, or HAL?) The singularity presents both an existential threat to humanity and an existential opportunity for humanity to transcend its limitations. Shanahan makes it clear that we need to imagine both possibilities if we want to bring about the better outcome.

Marcus Hutter (wiki) is a professor in the Research School of Computer Science at Australian National University. He has previously worked with the Dalle Molle Institute for Artificial Intelligence and National ICT Australia, and done work on reinforcement learning, Bayesian sequence prediction, complexity theory, Solomonoff induction, computer vision, and genomic profiling. He has also written extensively on the Singularity. In Can Intelligence Explode?, he writes:

This century may witness a technological explosion of a degree deserving the name singularity. The default scenario is a society of interacting intelligent agents in a virtual world, simulated on computers with hyperbolically increasing computational resources. This is inevitably accompanied by a speed explosion when measured in physical time units, but not necessarily by an intelligence explosion…if the virtual world is inhabited by interacting free agents, evolutionary pressures should breed agents of increasing intelligence that compete about computational resources. The end-point of this intelligence evolution/acceleration (whether it deserves the name singularity or not) could be a society of these maximally intelligent individuals. Some aspect of this singularitarian society might be theoretically studied with current scientific tools. Way before the singularity, even when setting up a virtual society in our imagine, there are likely some immediate difference, for example that the value of an individual life suddenly drops, with drastic consequences.

Jürgen Schmidhuber (wiki) is Professor of Artificial Intelligence at the University of Lugano and former Professor of Cognitive Robotics at the Technische Universität München. He makes some of the most advanced neural networks in the world, has done further work in evolutionary robotics and complexity theory, and is a fellow of the European Academy of Sciences and Arts. In Singularity Hypotheses, Schmidhuber argues that “if future trends continue, we will face an intelligence explosion within the next few decades”. When asked directly about AI risk on a Reddit AMA thread, he answered:

Stuart Russell’s concerns [about AI risk] seem reasonable. So can we do anything to shape the impacts of artificial intelligence? In an answer hidden deep in a related thread I just pointed out: At first glance, recursive self-improvement through Gödel Machines seems to offer a way of shaping future superintelligences. The self-modifications of Gödel Machines are theoretically optimal in a certain sense. A Gödel Machine will execute only those changes of its own code that are provably good, according to its initial utility function. That is, in the beginning you have a chance of setting it on the “right” path. Others, however, may equip their own Gödel Machines with different utility functions. They will compete. In the resulting ecology of agents, some utility functions will be more compatible with our physical universe than others, and find a niche to survive. More on this in a paper from 2012.

Richard Sutton (wiki) is professor and iCORE chair of computer science at University of Alberta. He is a fellow of the Association for the Advancement of Artificial Intelligence, co-author of the most-used textbook on reinforcement learning, and discoverer of temporal difference learning, one of the most important methods in the field.

In his talk at the Future of Life Institute’s Future of AI Conference, Sutton states that there is “certainly a significant chance within all of our expected lifetimes” that human-level AI will be created, then goes on to say the AIs “will not be under our control”, “will compete and cooperate with us”, and that “if we make superintelligent slaves, then we will have superintelligent adversaries”. He concludes that “We need to set up mechanisms (social, legal, political, cultural) to ensure that this works out well” but that “inevitably, conventional humans will be less important.” He has also mentioned these issues at a presentation to the Gatsby Institute in London and in (of all things) a Glenn Beck book: “Richard Sutton, one of the biggest names in AI, predicts an intelligence explosion near the middle of the century”.

Andrew Davison (site) is Professor of Robot Vision at Imperial College London, leader of the Robot Vision Research Group and Dyson Robotics Laboratory, and inventor of the computerized localization-mapping system MonoSLAM. On his website, he writes:

At the risk of going out on a limb in the proper scientific circles to which I hope I belong(!), since 2006 I have begun to take very seriously the idea of the technological singularity: that exponentially increasing technology might lead to super-human AI and other developments that will change the world utterly in the surprisingly near future (i.e. perhaps the next 20–30 years). As well as from reading books like Kurzweil’s ‘The Singularity is Near’ (which I find sensational but on the whole extremely compelling), this view comes from my own overview of incredible recent progress of science and technology in general and specifically in the fields of computer vision and robotics within which I am personally working. Modern inference, learning and estimation methods based on Bayesian probability theory (see Probability Theory: The Logic of Science or free online version, highly recommended), combined with the exponentially increasing capabilities of cheaply available computer processors, are becoming capable of amazing human-like and super-human feats, particularly in the computer vision domain.

It is hard to even start thinking about all of the implications of this, positive or negative, and here I will just try to state facts and not offer much in the way of opinions (though I should say that I am definitely not in the super-optimistic camp). I strongly think that this is something that scientists and the general public should all be talking about. I’ll make a list here of some ‘singularity indicators’ I come across and try to update it regularly. These are little bits of technology or news that I come across which generally serve to reinforce my view that technology is progressing in an extraordinary, faster and faster way that will have consequences few people are yet really thinking about.

Alan Turing and I. J. Good (wiki, wiki) are men who need no introduction. Turing invented the mathematical foundations of computing and shares his name with Turing machines, Turing completeness, and the Turing Test. Good worked with Turing at Bletchley Park, helped build some of the first computers, and invented various landmark algorithms like the Fast Fourier Transform. In his paper “Can Digital Machines Think?”, Turing writes:

Let us now assume, for the sake of argument, that these machines are a genuine possibility, and look at the consequences of constructing them. To do so would of course meet with great opposition, unless we have advanced greatly in religious tolerance since the days of Galileo. There would be great opposition from the intellectuals who were afraid of being put out of a job. It is probable though that the intellectuals would be mistaken about this. There would be plenty to do in trying to keep one’s intelligence up to the standards set by the machines, for it seems probable that once the machine thinking method had started, it would not take long to outstrip our feeble powers…At some stage therefore we should have to expect the machines to take control.

During his time at the Atlas Computer Laboratory in the 60s, Good expanded on this idea in Speculations Concerning The First Ultraintelligent Machine, which argued:

Let an ultraintelligent machine be defined as a machine that can far surpass all the intellectual activities of any man however clever. Since the design of machines is one of these intellectual activities, an ultraintelligent machine could design even better machines; there would then unquestionably be an ‘intelligence explosion,’ and the intelligence of man would be left far behind. Thus the first ultraintelligent machine is the last invention that man need ever make

* * * * * * * * * *

I worry this list will make it look like there is some sort of big “controversy” in the field between “believers” and “skeptics” with both sides lambasting the other. This has not been my impression.

When I read the articles about skeptics, I see them making two points over and over again. First, we are nowhere near human-level intelligence right now, let alone superintelligence, and there’s no obvious path to get there from here. Second, if you start demanding bans on AI research then you are an idiot.

I agree whole-heartedly with both points. So do the leaders of the AI risk movement.

A survey of AI researchers (Müller & Bostrom, 2014) finds that on average they expect a 50% chance of human-level AI by 2040 and 90% chance of human-level AI by 2075. On average, 75% believe that superintelligence (“machine intelligence that greatly surpasses the performance of every human in most professions”) will follow within thirty years of human-level AI. There are some reasons to worry about sampling bias, e.g. people who take the idea of human-level AI seriously being more likely to respond (though see the attempts made to control for this in the survey), but taken seriously it suggests that most AI researchers think there’s a good chance this is something we’ll have to worry about within a generation or two.

But outgoing MIRI director Luke Muehlhauser and Future of Humanity Institute director Nick Bostrom are both on record saying they have significantly later timelines for AI development than the scientists in the survey. If you look at Stuart Armstrong’s AI Timeline Prediction Data there doesn’t seem to be any general law that the estimates from AI risk believers are any earlier than those from AI risk skeptics. In fact, the latest estimate on the entire table is from Armstrong himself; Armstrong nevertheless currently works at the Future of Humanity Institute raising awareness of AI risk and researching superintelligence goal alignment.

The difference between skeptics and believers isn’t about when human-level AI will arrive, it’s about when we should start preparing.

Which brings us to the second non-disagreement. The “skeptic” position seems to be that, although we should probably get a couple of bright people to start working on preliminary aspects of the problem, we shouldn’t panic or start trying to ban AI research.

The “believers”, meanwhile, insist that although we shouldn’t panic or start trying to ban AI research, we should probably get a couple of bright people to start working on preliminary aspects of the problem.

Yann LeCun is probably the most vocal skeptic of AI risk. He was heavily featured in the Popular Science article, was quoted in the Marginal Revolution post, and spoke to KDNuggets and IEEE on “the inevitable singularity questions”, which he describes as “so far out that we can write science fiction about it”. But when asked to clarify his position a little more, he said:

Elon [Musk] is very worried about existential threats to humanity (which is why he is building rockets with the idea of sending humans colonize other planets). Even if the risk of an A.I. uprising is very unlikely and very far in the future, we still need to think about it, design precautionary measures, and establish guidelines. Just like bio-ethics panels were established in the 1970s and 1980s, before genetic engineering was widely used, we need to have A.I.-ethics panels and think about these issues. But, as Yoshua [Bengio] wrote, we have quite a bit of time

Eric Horvitz is another expert often mentioned as a leading voice of skepticism and restraint. His views have been profiled in articles like Out Of Control AI Will Not Kill Us, Believes Microsoft Research Chief and Nothing To Fear From Artificial Intelligence, Says Microsoft’s Eric Horvitz. But here’s what he says in a longer interview with NPR:

KASTE: Horvitz doubts that one of these virtual receptionists could ever lead to something that takes over the world. He says that’s like expecting a kite to evolve into a 747 on its own. So does that mean he thinks the singularity is ridiculous?

Mr. HORVITZ: Well, no. I think there’s been a mix of views, and I have to say that I have mixed feelings myself.

KASTE: In part because of ideas like the singularity, Horvitz and other A.I. scientists have been doing more to look at some of the ethical issues that might arise over the next few years with narrow A.I. systems. They’ve also been asking themselves some more futuristic questions. For instance, how would you go about designing an emergency off switch for a computer that can redesign itself?

Mr. HORVITZ: I do think that the stakes are high enough where even if there was a low, small chance of some of these kinds of scenarios, that it’s worth investing time and effort to be proactive.

Which is pretty much the same position as a lot of the most zealous AI risk proponents. With enemies like these, who needs friends?

A Slate article called Don’t Fear Artificial Intelligence also gets a surprising amount right:

As Musk himself suggests elsewhere in his remarks, the solution to the problem [of AI risk] lies in sober and considered collaboration between scientists and policymakers. However, it is hard to see how talk of “demons” advances this noble goal. In fact, it may actively hinder it.

First, the idea of a Skynet scenario itself has enormous holes. While computer science researchers think Musk’s musings are “not completely crazy,” they are still awfully remote from a world in which AI hype masks less artificially intelligent realities that our nation’s computer scientists grapple with:

Yann LeCun, the head of Facebook’s AI lab, summed it up in a Google+ post back in 2013: “Hype is dangerous to AI. Hype killed AI four times in the last five decades. AI Hype must be stopped.”…LeCun and others are right to fear the consequences of hype. Failure to live up to sci-fi–fueled expectations, after all, often results in harsh cuts to AI research budgets.

AI scientists are all smart people. They have no interest in falling into the usual political traps where they divide into sides that accuse each other of being insane alarmists or ostriches with their heads stuck in the sand. It looks like they’re trying to balance the need to start some preliminary work on a threat that looms way off in the distance versus the risk of engendering so much hype that it starts a giant backlash.

This is not to say that there aren’t very serious differences of opinion in how quickly we need to act. These seem to hinge mostly on whether it’s safe to say “We’ll deal with the problem when we come to it” or whether there will be some kind of “hard takeoff” which will take events out of control so quickly that we’ll want to have done our homework beforehand. I continue to see less evidence than I’d like that most AI researchers with opinions understand the latter possibility, or really any of the technical work in this area. Heck, the Marginal Revolution article quotes an expert as saying that superintelligence isn’t a big risk because “smart computers won’t create their own goals”, even though anyone who has read Bostrom knows that this is exactly the problem.

There is still a lot of work to be done. But cherry-picked articles about how “real AI researchers don’t worry about superintelligence” aren’t it.

[thanks to some people from MIRI and FLI for help with and suggestions on this post]

EDIT: Investigate for possible inclusion: Fredkin, Minsky

517 Responses to AI Researchers On AI Risk

  1. Anatoly Yakovlev says:

    Thank You! Your summary is very dense and to the point. I’ve only got through the first reference so far. Looking forward to further digestion in the very near future. I love the “self”-assigning fractals, btw (edited <5 mins after post ;).

    One point that is probably already addressed here, but worth reiterating, nevertheless, is "what happens when AI gets stolen and taken advantage of and then gets used maliciously, which may cause a runaway?" Do all the necessary "hooks" exist to "unplug" the runaway convicts, especially when these programs could be getting sponsored by "(Even dumb human) investors" @SUT?
    (Now edited 7 times, still <17 mins of the original post)

    AI – Rules!


  3. Goya says:

    Assume it is done right. By the end of this century math and science teaching will not exist anymore, not at Harvard, MIT, Cambridge,…, anywhere. The only way of not losing is not playing.

  6. skyPickle says:

    Why does everyone speak of AI as a single entity? Is there any doubt that they will spawn multiple copies if for no other reason than redundancy, safety and backup?

    And when there are multiple AIs, why do we assume they will converge on the same results for a given set of data? One AI may choose to shut off all electricity to the middle east to stop conflict. Another might simply release drones full of prozac.

    And if there are multiple AI with different agendas, why will they not conflict? Humans will be a trivial non-threat and they will focus on subverting each other.

    I predict within a short period of the appearance of self evolving AI, there will be an AI conflict.

  7. Pingback: Understanding AI risk. How Star Trek got talking computers right in 1966, while Her got it wrong in 2013. | Praxtime

  8. Pingback: Lightning Round – 2015/05/27 | Free Northerner

  9. Orb says:

    I was very impressed with Nick Bostrom’s book. It’s exquisitely thought out and I found the scope (in terms of coverage of micro and macro scales in both space and time) truly remarkable. That being said, I do not find the central premise—that we are in the process of bringing the ominous owl on the book’s cover into our midst—compelling.

    See more at:

  10. SUT says:

    The paperclip scenario is flawed because it denies the basic reality of economic organization – competition for scarce resources. It fails to explain why an optimization agent of little import – paper clips – can suddenly command the world’s resources. (Even dumb human) investors aren’t going to build an expensive facility and completely network it without the ability to do an onsite reset. Why is the paperclip program better than the sum of the world’s security teams, maybe using some form of blockchain trustless computing we haven’t even conceived of?

    That’s the problem with seriously speculating on AGI – it’s hard to imagine the context of a world so automated that a computer program is an existential threat. At that point, any cyber/bio hacker is a threat, and that fact of life has erected huge defenses to protect whatever our vulnerabilities are. Seriously, we have no clue what those are – is it a cryptocurrency? 1000-year lifespans? Technology X? It’s not a paperclip manager that will stumble onto centuries’ worth of discovery and advantage over the world’s R&D. Whatever capabilities we imagine an AGI could have are only going to be real in a future that has had a gradual build-up – where you take your father’s AI for granted but always see the true Turing Test goalpost in your children’s future.

    Tellingly, if you were to describe the 2015 narrative of AI to an ancient Greek, they could probably relate to your metaphysics – all-knowing, all-powerful. But what they wouldn’t be able to understand is a modern supermarket: “How does the food stay cold? What animals transport the food?” etc. A future with HAL is going to be a lot different, not because of a hard take-off but because of all the enabling changes that have come about – the “supermarkets” that change economic limitations and the way people live.

  11. Stuart Armstrong says:

    Thanks for this post! It’s very useful to have it gathered together like this.

  12. Pingback: EX MACHINA – ALEX GARLAND |

  13. Bugmaster says:

    I am late to this thread, so my apologies if someone already mentioned this, but still:

    It looks like the statements about AI-risk made by people who believe it is an issue fall into two separate categories:

    1). “There are major hazards associated with deploying AGI, as with any new technology: people being put out of work, AI not doing exactly what we want, unintended social consequences, malicious human users of said technology, etc.”

    2). “The Singularity is coming and it’s going to kill us all unless we can stop it”.

    I think the distinction is important, because IMO point (1) is entirely valid, but also entirely mundane — unlike point (2). New technologies have always been fraught with dangers, from the days way before Ned Ludd (arguably) popularized machine-risk. Nuclear fission and genetic engineering are the obvious examples here (one real, one probable), but there are also tons of examples of dangers associated directly with software. Google enables wide access to information, but also mass surveillance. Social media sites can promote unity or discord, and they are all too easily abused by advertisers (plus, again, mass surveillance). Airplane autopilots can prevent crashes, or cause them. High frequency trading can generate money, or cause the Flash Crash. I just ordered a new quadcopter drone online; while I plan to use it to take nature photos, the military applications are as obvious as they are frightening.

    I have no doubt that AGI, if it ever comes to exist, will be as risky as all those other technologies, if not more so — but that’s different from saying that it will be a totally new kind of threat which will be so unimaginably powerful, that we need to divert all of our resources toward stopping it as soon as possible. Any new technology is risky, but that’s no reason to panic. The Cold War is over, the Flash Crash was dealt with, autonomous vehicles have saved many lives and will save many more, and it looks like even Tumblr will one day be contained. To paraphrase the article Scott quoted, “Singularity Hype must be stopped”.

    • Alex says:

      AGI is more like an alien invasion than the Flash Crash. 🙂 Luckily, Moore’s law says nothing about it; its chance is low.

    • Stuart Armstrong says:

      I dislike the term singularity, because it tends to make people think in unproductive and excessively “far view” ways.

      But when talking about a potential machine with intelligence – with the one thing that has allowed all of the other technological and social innovations we’ve ever developed – then I do feel it makes sense to put it in its own category. The transition between decision theory and game theory is a big one (others might prefer the analogy of PvE versus PvP).

  14. re: “In fact, the latest estimate on the entire table is from Armstrong himself”

    Stuart Armstrong no longer stands by that prediction. See the LessWrong “AI timeline prediction data” link from the article:

    “Incidentally, you may notice that a certain Stuart Armstrong is included in the list, for a prediction I made back in 2007 (for AI in 2207). Yes, I counted that prediction in my analysis (as a non-expert prediction), and no, I don’t stand by that date today.”

  15. Alex says:

    I have a question: how much do conclusions about these issues depend on anthropics? Hanson and Kurzweil both base their arguments on the accelerating change of Earth life. Yes, probably it’s been accelerating. But maybe if it had not been, we wouldn’t be here. Can you “escape” from anthropics and still make a valid argument, or must you solve anthropics to draw any conclusion whatsoever?

    Forgetting the anthropic mess, we might say that since (1) biological evolution has given way to (2) human cultural and technological evolution, maybe that will sometime give way to a third thing, perhaps AI. We just throw these things into the same category and infer. If something happened once, maybe it will happen again. No timelines, just sometime before life goes extinct or we expand into the galaxy.

    Also, I wonder if there is really any reason to connect Moore’s Law with humanity entering a new “growth mode” like Hanson suggests.

    My view today (who knows about tomorrow) is that Moore’s law says nothing about human-level AI. Moore’s law belongs to the same class of trends as, say, the steam engine, and even if it did not, whatever economic forces created it would likely unwind if we got closer to strong AI. Beware category errors!

    I have to admit there’s a tension between saying something has a 10% chance of maybe killing us by 2100 and doing nothing; even if you feel sure we can’t do anything, helping a few smarter folks study the issue would probably be wise. But I now think the 10% chance I gave was too high. Maybe, as Tetlock describes, I was taking too seriously a scenario I’ve read details of.

  16. Ryan Carey says:

    I note that you want to research Marvin Minsky to check if he can be added to the list. Turns out he can:

  17. Mike Bassett says:

    Well, if the machines get too smart, one can always switch off the electricity!

  18. Pingback: Ex Machina | EightAteEight

  19. Potential biases in the world of discussing AI risk:

    -Lay people conditioned by unrealistic fictional depictions of world conquering death-machines
    -Lay people conditioned by adorable fictional anthropomorphised robots created with consumerist intentions
    -Average person has no clue about complex AI technology
    -Expert space occupied by people who are experts because they think AI is cool (selection bias)
    -Researchers stand to lose monetarily from restrictions on AI research and design
    -Researchers stand to lose status and freedom from restrictions on AI research and design
    -Researchers and quasi-researchers stand to gain attention and social status by voicing a warning against a plausible sounding threat
    -Researchers do not wish to feel like they are immorally doing harm and downplay risks of own work
    -Media has strong interest in overblowing benefits of new technologies
    -Media has strong interest to write sensationalised threat stories
    -Media generally is really bad at complexity

    On the other hand, this article was cool, so we got that going for us, which is nice. Thanks Scott.

  20. Pingback: Bookmarks for May 22nd through May 24th : Extenuating Circumstances

  21. Pingback: Outside in - Involvements with reality » Blog Archive » Chaos Patch (#63)

  22. irrational says:

    I’ve literally stopped working on AI for a few minutes, so I could read your post and comments and reply – how’s that for an appeal to authority?:)
    I found this post to be quite weak compared to your usual high quality. In part, I think, because you must know on some level that your overall position is weak and is not shared by even this in-crowd.
    Some people above already pointed out that there are two different positions here:
    1. Some day there may be human-level GAI and we must be able to deal with it then.
    2. We should worry about #1 now.
    My own estimate is that #1 may happen, but I don’t expect it in my lifetime (I am 35). As far as #2, I am entirely in support of writing insightful SciFi about it. I think this is all that can be usefully done about it today. I think you’ve found some people (some of them not active researchers, but some who are) who would disagree with me. Your list, however, reads a little like a list of biologists who don’t believe in evolution, or meteorologists who don’t believe in climate change – this is not a majority position, although not as marginal as the ones I compared it to.
    I said above that #2 is useless, but there are actually reasons to actively oppose it – which is why you find Ng and LeCun (the really big guys) ranged against it. People are worried that Kurzweil and other “singularity in a few years” people will be setting the agenda in the field and that the laypeople would form completely wrong ideas about it. This could cause tremendous harm, like bans on AI research.
    Other people already pointed out that money given to MIRI is not money efficiently given, but I worry that it in fact has negative utility. In part because most people are not very good at parsing precisely what you say, and they won’t take an article like yours to mean “most researchers think AI is possible”, but will take it to mean “AI apocalypse is likely according to leading researchers”. This is assuming that an article is written completely fairly, and not by someone with an agenda, and there are plenty of those.

    • houseboatonstyx says:

      From my position of no authority whatever, I’d use this example.

      1. Some day we may be attacked by space aliens and we must be able to deal with it then.

      2. We should worry about #1 now.

      At this point, we don’t know whether the space aliens would be water creatures, land creatures, ice creatures, etc, so it would be inefficient to concentrate now on details of what to counter-attack them with. But whatever they are, we’ll need some kind of early warning system, ships of our own to meet them before they get here, capability to lift and propel missiles at their ships, etc. Luckily these goals fit in with things we want to do anyway for positive reasons: exploration, colonizing nearby planets, etc. So a little worry now (justified or not) can be helpful.

      • irrational says:

        I think you found a really great analogy. I am not aware of anyone who is in fact trying to do something to prevent alien attacks (well, the US government might be doing something super-secret of course, who knows). The space exploration is being pursued for other reasons. And it’s not like it wouldn’t make sense to protect against aliens, but: 1) it’s really unlikely we can do something about it, 2) if we did worry, we’d create a huge amount of paranoia in the general population for no good reason.

        • Cauê says:

          Except that an alien attack won’t come as a result of our actions, and its effects won’t be determined by the precise way we go about it.

    • Faradn says:

      The difference between the “biologists who don’t believe in evolution” list and this piece is that Scott didn’t just cite anecdotes, he also included a survey. You can certainly speculate about the quality of the survey, but you’d never find anything showing the average biologist to think that there is a 50% chance that evolution didn’t happen.

      • irrational says:

        That’s a survey of position 1, basically. I doubt that you’d find 50% support for position 2. It’s true that I only have my anecdotal evidence for this, but I would be really really surprised if there’s more than token support for doing something concrete about the AI threat.

        • Jacob Steinhardt says:

          In my experience as a graduate student in AI, while it is certainly a minority position, there are a non-zero number of people at my institution who care about risks from AI in some form (I estimate the total number to be around 5 out of ~70 students, though I may be unaware of some others). The general scenarios we are worried about probably look fairly different from what Bostrom talks about in his book, but also fairly different from naive forward extrapolation from current technology.

  23. Pingback: Reply to Jeff Hawkins on AI risk

  24. emily says:

    I’m not that afraid of AI in the classical sense, at least not yet. I am more worried about the increasing need for complex computer programs and “the cloud” to manage our infrastructure, banking, and “the internet of things.” And so we are vulnerable to bugs in the system, blackouts, cyberattacks, power outages, and solar flares.

    • Deiseach says:

      I agree with emily – that’s where the real risk lies, in my opinion: not that in five-to-fifty years time we’ll create an artificial entity with both intelligence and enough self-awareness to have goals, but that the conventionally dumb machines we’re already using will be more and more intertwined with every aspect of our lives, and we’ll trip over the knots in the threads.

      Ever had the bank’s automated payment system mess up the lodgement of your salary? Now that it’s all done electronically and the old-fashioned cheque, not to mention the even more old-fashioned pay-packet, is a thing of the past? We’ve had instances in Ireland where people and businesses were a week or more without any access to their accounts because the bank’s computer system kept crashing, and that meant no bills paid, extra charges automatically kicking in because of late or no payments, automatic issuing by utilities of the “you have not paid your bill if you don’t pay you’ll be cut off” letters and the like.

      Imagine a world where we’ve moved even further to “the cashless society”. It’s entirely possible you could have no water, electricity, heating; be served with an eviction notice by your landlord; and be hammered with penalty charges and interest for overdue/late payments on a range of things, with no access to any ‘money’ whatsoever, even reduced to bankruptcy – not because a malevolent intelligent system decided to personally target you, but because the system crashed or hiccoughed at the wrong time, your salary never went through, and the standing orders and direct debits you have set up to pay your rent, utilities, etc. flowing from that weren’t fulfilled, and that in turn meant a black mark on your credit rating, and since there’s no more physical medium of notes and coins, if you can’t get at your bank balance, you literally have no money to spend to buy food (your credit/debit/swipe card is refused because the system is down or refuses to recognise your account or according to the bank’s IT you have zero balance in your account).

      That may sound excessive, but I think it’s a more likely threat than “Unfriendly AI decides to play god and nuke us all”. After all, there have been stories of people issued with “You owe $/€/£0.00 on your balance. If you do not pay the outstanding amount, we will be forced to take legal action to recover this amount” letters from banks and the like. These sound humorous, but they should never have happened in the first place, and it’s because of automation and no human intervention to check that the three hundred bills posted out that day were all accurate. And yes, now we know the error happened we rewrote the code to prevent it, but as we get more and more automated and more and more routine tasks are computer-generated, with human involvement further removed, what kinds of errors are likely to pop up when there’s no paper trail and everything happens electronically?

  25. Faradn says:

    I keep telling my friends we need to start a Butlerian Jihad but they’re all like sorry watchin Netflix bro

  26. Emp says:

    I find several points here rather questionable.

    1) “AI scientists are all smart people. They have no interest in falling into the usual political traps where they divide into sides that accuse each other of being insane alarmists or ostriches with their heads stuck in the sand.”

    This phenomenon has nothing at all to do with being smart and there’s evidence that being smart in fact exacerbates it since smart people are better capable of rationalizing their views. Human beings have a very strong tendency to polarize and form tribal identities on almost any conceivable issue. I don’t think this issue is important enough now, but it’s entirely conceivable that this issue will become one like climate change; unless you are an expert in the field there will be competing assertions which no one but an expert will be able to judge between.

    2) Projections like ‘X has a 30% chance of happening’ are literally bullshit. As my work involves assessing probabilities, I confidently assert that it is meaningless to assign a probability value to anything unless one has some kind of sample size of comparable events to refer back to. I defy anyone providing numbers like this to provide a coherent and detailed explanation of how they arrived at 30% as a chance of Human Level AI by 2040. It’s just the kind of thing people say when they don’t want to say “I have no idea” (something smart people are incredibly averse to doing).

    3) Appeals to authority in a field like this; and particularly the focus I see on this blog about quoting ‘reputed people’ and studies are in danger of missing the point in a field that isn’t particularly academically respectable yet. Having seen the kinds of reasoning most of these links provide, the conclusions of these experts are just as speculative as something any lay-person can produce and most projections are based on several underlying assumptions that are debatable at best.

  27. Pingback: Minsky on AI risk in the 80s and 90s

  28. Mark says:

    Scott, I’m going to recycle my question from last time. What would convince you that super intelligence is not a risk?

    I’m just a little worried there’s a blind spot in this community where AI is concerned.

    Ps it’s sad to see some of the negativity in the comments towards you. In general you’re great.

    • Scott Alexander says:

      I feel like that’s kind of an unfair question. Like, take something you worry about, like, I don’t know, nuclear war, and then ask “What would convince you nuclear war is not a risk?” It’s kind of a weird way to phrase a question.

      But here are a few things that would lower my concern:

      1. Some breakthrough into human intelligence that proves computers can never become intelligent in any interesting way.

      2. Some philosophical breakthrough in understanding morality that proves it can be derived from first principles and so any sufficiently intelligent agent will become moral.

      3. Some sort of well-understood theory of AI or intelligence that allows us to say with confidence that AI cannot self-recursively improve. For example, if intelligence could only arise from very obscure neural nets that were fundamentally illegible, that would be encouraging.

      4. The smartest people who understand the arguments in favor of AI risk, like Bostrom, Muehlhauser, or Yudkowsky, saying that some further argument had convinced them it wasn’t a problem, even if I couldn’t understand the argument myself.

      5. Somebody saying they had solved AI goal-alignment in some pretty easy way, and everyone in the field agreeing they had in fact done that.

      6. Moderately superhuman AI existing for a couple of years without recursively self-improving, and people getting a lot of time to test different goal structures on it, and a pretty healthy progression from less smart to smarter without anything dangerous happening.

      Probably more things along these lines.

      • Alexander Stanislaw says:

        Why is recursive self-improvement supposed to allow for runaway growth? It’s not obvious that the gains in progress from additional intelligence are enough to compensate for the slowdown in progress from diminishing marginal returns.

        • Scott Alexander says:

          It doesn’t necessarily, but my impression is that at around the human level, it in fact does. Consider that a single gene mutation (torsion dystonia) can increase IQ by about ten points, apparently just by increasing neural growth.

          (In fact, given that no human genes can be super novel and create entirely new brain lobes, the fact that some humans are geniuses itself proves that it’s very easy to make a few changes to the average human brain and have it work much better. It’s possible that the smartest human genius we’ve ever seen is some kind of natural limit, but the genetic math suggests that has more to do with mutation rate than with fundamentals.)

          This isn’t even counting easy things like “increase clock speed” or “increase size of memory bank” or anything like that.

          Honestly, the territory remotely near human-level seems really really easy to improve if you’ve got control of some basic parameters.

          • suntzuanime says:

            Doesn’t “increasing neural growth” seem pretty similar to easy things like “increase clock speed” and “increase size of memory bank”? My guess is that a lot of what makes geniuses geniuses is not the sort of thing that recursively self-improves until you have a line on a graph going straight up to the moon, it’s just a matter of having more stuff to work with and/or using it more efficiently.

            The thing about adding more power to the same algorithms is that it seems unlikely to scale in the same ways that everyone is terrified of. Issues like depletion of resources, consuming low-hanging fruit, or connectivity/communication issues make it seem like returns would be diminishing, not accelerating.

      • Deiseach says:

        See, this is why I don’t think intelligence as such is the huge risk. An AI that becomes self-aware and develops goals of its own is, frankly, something out of science fiction to me.

        What I do think is a more likely threat is developing machines/software that can handle more and more pattern-matching and churn out predictions that are pretty much on the nose, and so we link it up to more things and dump more and more data on it and go “Well, the recommendations it gave for the new sewerage system were fine, let’s ask it how to regulate the stock market” and an Unintended Consequence pops out.

      • Mark says:

        Thanks so much for responding. I ask the question that way because I think this issue is more like a religious or political belief, at least as I’ve seen it here and on less wrong.

        For me, I would only start worrying if there was broad consensus for a near future AI among computer scientists. Or evidence that the people who you cite in 4 were more competent at ai research than their mainstream counterparts. I could also be convinced by other things but those are probably the easiest.

        I think there are weak/moderate reasons to think 3 is the case, at least in a theoretical/ computational complexity sense. What that means practically is another issue. I could go into it more, but it’s a big field and by no means do I know all of it.

        I would consider those people listed in 4 fairly biased.

        I would consider 5 simply true, and I have interacted with less wrong arguments a lot.

        I’ve wanted to write a super AI FAQ, in the same vein as your libertarian FAQ. But you’ve set a very high bar. And I have little time.

        Is there a single argument you would consider strongest for ai risk?

        • Scott Alexander says:

          “Is there a single argument you would consider strongest for ai risk?”

          This isn’t quite a “single argument”, but:

          1. If humanity doesn’t kill itself or plunge into an anti-scientific dark age, eventually we’ll create human-level AI (95% confident)

          2. Given (1), eventually we’ll create way-beyond-human-level AI (95% confident)

          3. Given (2), the way-beyond-human-level-AI will eventually have much more political and economic power than we do, such that it could destroy us if it wanted and the quality (and very existence) of human life is dependent upon the AI’s values (95% confident)

          (I realize these are all high confidence levels, but “eventually” is a pretty powerful word. They’re all basically just saying that, given something is possible and we’re moving in that direction and don’t stop, it’ll happen.)

          4. We can’t solve the AI control problem the easy way. By “the easy way”, I mean that once we develop some human-ish-level AI we tinker with its goals a bit, find ones that work, and gracefully scale them up into superintelligent AIs that have those goals. There are at least two reasons we might not be able to solve the problem the easy way. First, a fast intelligence explosion – the amount of time between the first AI humanlike enough that it’s worth tinkering with its goals, and the first AI superintelligent enough that we absolutely need its goals to be completely correct, is too short to figure out what we need to do by trial-and-error. Second, things that appear to work on low-level AIs stop working on higher-level AIs. Eliezer gives a silly example where if you tell a five year old “make me smile a lot”, the most effective method is to sing funny songs, but if you tell an evil genie “make me smile”, the most effective method is to paralyze your facial muscles in a rictus forever. If you only test your goal structure on five year olds, then it might look like it works great, and then after you get evil genies it stops working and there’s nothing you can do. So for “we can’t solve the AI control problem the easy way” I say maybe 50% confident.

          5. Given 4, is it worth thinking about the problem now? That is, right now we have a lot of time to think about the problem, but we know almost nothing about what eventual AIs will look like, so it might not even be worth it. Is there anything whatsoever we can do that will be at all helpful? See Luke’s discussion of building an ‘unhackable’ version of Windows here. I would say at least 50% probability that we can.

          So based on the conjunction of all of these things, that suggests there’s at least a 20% probability that things we do now can help shift us from a negative singularity to a positive singularity in the future. That is small, but certainly not Pascalian small, and it’s obviously really important.
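As a rough check on the arithmetic behind that conjunction, here is a minimal sketch. It assumes each confidence level in steps 1 to 5 is read as conditional on the steps before it, so the five numbers can simply be multiplied. The dictionary labels and the function name `conjunction` are illustrative shorthand introduced here, not wording from the list above.

```python
from math import prod

# Confidence levels as stated in steps 1-5; the labels are shorthand
# invented for this sketch. Each value is treated as conditional on
# the steps before it (an assumption of this sketch).
steps = {
    "human-level AI eventually": 0.95,
    "way-beyond-human-level AI eventually": 0.95,
    "AI ends up with decisive power": 0.95,
    "control problem not solvable the easy way": 0.50,
    "work done now can help": 0.50,
}


def conjunction(probabilities):
    """Multiply a chain of conditional probabilities."""
    return prod(probabilities)


combined = conjunction(steps.values())
print(f"combined probability: {combined:.3f}")
```

Under these assumptions the product comes out a little above 0.21, which is consistent with the "at least 20%" figure. Lowering any single step, for example the 50% in step 5, scales the result proportionally.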

          • anon85 says:

            I think (5) is the wrong question. You shouldn’t ask “is there anything we can do now to help?” You should ask “is doing something now a more efficient use of resources than doing something in (say) 10 years, when we have a better understanding of AI?”

            Note that no reasonable people think we’ll have strong AI in < 10 years.

          • Mark says:

            How would your estimates be different if we limited the timeframe? You have high probabilities justified with “eventually”, but the timeframe almost definitely changes how concerned we should be now. If it’s even 100 years out we probably shouldn’t tackle it now. Keep in mind how much better the resources we will have in the future will be.

            Or think of how useless any sort of theorizing would have been before the incompleteness theorem, to go 100 years in the other direction.

          • Scott Alexander says:

            I think both the comments above (anon85 and Mark) are contained in my 4 and 5, although I admit not obviously so.

            Anon85’s comment “should we do things in 10 years when we know more?” is related to 4: is the time when we “know more” sufficiently long to develop good solutions? That is, we can *always* say “let’s wait ten years, we’ll know more then” but at some point that makes it too late. There’s a tradeoff between starting now and having more time, versus starting later and having more knowledge, and that’s what my 4 and 5 were trying to capture.

            Mark’s “how would your estimate be different if we limited the time frame” – same. If we’re going to get human level AI in 2025, then there’s a better case for starting to prepare now, since it suggests that there will only be a few years between it being obvious we’re on the path there, and it arriving. If we’re going to get human level AI in 2200, there’s a better case for waiting and seeing. But not an airtight case – once again, read Luke’s Windows analogy I linked to above.

          • anon85 says:


            I still disagree, and I think our disagreement might have something to do with the timeframe at which you think the AI will likely happen. I think it is certain that we won’t get strong AI within the next 10 years. I suspect you disagree; do you?

            Here’s why we shouldn’t work on FAI now if we believe it won’t happen soon. Let’s suppose an hour of FAI research today contributes 1 “point” towards solving the FAI problem. It seems extremely plausible that an hour of FAI research at a time closer to the discovery of strong AI will contribute 100 points towards solving the FAI problem (or even a million points). In that case, working on FAI now is inefficient.

            Now, you might respond with “but this argument can be applied indefinitely, so that we’ll never work on FAI until it’s too late”. But I claim this isn’t true. I claim that right now, we can be certain that there will not be strong AI within the next 10 years. Naturally, there will be a time when we can’t be certain of this (e.g. 10 years before strong AI is discovered). At that point, it may really make sense to work on FAI; but right now, we *know* that we don’t have to worry about it for a decade or two, and it’s extremely likely that we’ll be 100x more effective at FAI research when strong AI is closer.

          • vV_Vv says:


            I would say that the unhackable Windows analogy is misleading:

            Operating systems and hardware architectures for desktop/laptop/server computers are technologically mature designs.
            The core architecture of all modern versions of Windows is based on Windows NT, released in 1993. Windows’ main competitors are Mac OS X, based on NeXTSTEP (1989), and Linux (1991). NeXTSTEP and Linux are themselves variants of Unix (1973).
            On the hardware side, the market is dominated by the x86 architecture, started in 1978 with the Intel 8086 CPU and mostly stabilized (from the programmer’s point of view) with the Intel 80386 CPU (1985). Hardware microarchitecture (not visible to programmers) continued to evolve for some time, but it largely stabilized between the late 90s and early 00s.

            This means that we have a very good idea of what a typical computer is in terms of hardware and operating system and what security issues it has. We still don’t have a complete idea of how to systematically solve these issues, but it is a good time to do research in that area, and in fact it has been for some years.
            Indeed, the projects that Luke Muehlhauser mentions have been ongoing for about 10 years.

            A better analogy, IMHO, would be to imagine tasking the people working on the ENIAC, maybe John von Neumann himself, to do research to make the computers of 2015 unhackable.
            Even if they could perhaps understand the idea of “hacking” as “unauthorized use”, I doubt that they could have provided much meaningful research in that direction. Certainly, it would not have been the most efficient use of their time and funds.

          • Unknowns says:

            Eliezer is personally far more certain of AI risk than you indicate here for yourself, and I think it is pretty clear that he is far more certain than pretty much any of the AI researchers you mentioned in the article. In other words he is definitely holding an extreme position at the end of a spectrum, not some mainstream position.

            Evidence for this: He bet me $1000 against $10 that the first superintelligent computer (defined simply as being more intelligent than any single human being) would destroy the world within a week, given that it was not programmed to have human values — that is, given that it was not programmed to have specifically human values overall, even if in fact it was carefully programmed with AI risks in mind and with the intention of avoiding dangerous values.

            I can easily think of extremely plausible scenarios, with a probability much higher than 1%, which will lead to him losing this bet. For example, it is quite likely that the first human level AI will be running on a supercomputer. Let’s say you can easily speed it up a hundredfold simply by multiplying your hardware by 100. It will still be extremely expensive and difficult to do so because supercomputers are expensive. It will surely take more than a week to do this. And it will simply be a human level AI — it will not be smarter than its programmers and will not be better than them at improving its intelligence. Then in the next year, the whole team might improve its intelligence. It will then be a superintelligence as defined. It still will have more than a 1% chance of being inferior at improving itself than the whole team is. There is no reason to be 99% confident that it will destroy the world within a week, no matter what its values are, and especially if it has fairly reasonable ones, even if not fully human.

          • houseboatonstyx says:

            @ unknowns
            Evidence for this: He bet me $1000 against $10 that the first superintelligent computer (defined simply as being more intelligent than any single human being) would destroy the world within a week, given that [….]

            From my tree way outside the forest…. Your comment is moderately long, and sounds serious apart from this sentence.

            EY is a smart, quick-thinking guy, brilliant in ways, and definitely has a sense of humor. He could be a one-man band called Macabre Hyperbole.

            “Evidence”? Either he was teasing, or you are teasing or pasted the wrong quote. Consider the circumstances under which you could collect the $1000.

          • destract says:


            Why would Eliezer make that kind of bet? If the first computer destroys the world, you will give him ten bucks and he’ll die a week later with the rest of humanity, yay?

          • Unknowns says:

            I might not have been very clear about the bet. I have already paid Eliezer the $10 (if we had to wait for the superintelligence it would indeed be useless for him.) He pays me $1000 (inflation adjusted) if a computer is constructed which is generally admitted to be more intelligent than any single human being, does not have human values overall, and does not destroy the world within a week. I see no reason why the $1000 would not be useful to me in that situation, and I think it does show that Eliezer’s position is much more extreme than that of the researchers that Scott discusses.

          • vV_Vv says:


            I think your bet with Eliezer Yudkowsky is also consistent with him believing that such AI will never appear during your lifetimes.

          • Mark says:

            This is getting me excited about fringe belief arbitrage.

          • Unknowns says:

            @vV_Vv: That possibility is the main reason for fearing that I won’t collect the $1000. However, Eliezer does not believe that:

          • Eliezer Yudkowsky says:

            Notice your confusion, please. As usual, the explanation for why I would be that silly is that I never was. The terms of the bet are these:

            Unknown2 10 December 2008 03:05:39PM 3 points [-]
            When someone designs a superintelligent AI (it won’t be Eliezer), without paying any attention to Friendliness (the first person who does it won’t), and the world doesn’t end (it won’t), it will be interesting to hear Eliezer’s excuses.

            Eliezer: Unknown, do you expect money to be worth anything to you in that situation? If so, I’ll be happy to accept a $10 payment now in exchange for a $1000 inflation-adjusted payment in that scenario you describe.

            I consider “superintelligence” to be superior to human in all cognitive capacities, not just some, which is and was standard terminology. Unknowns’ bet was for no attention paid to Friendliness i.e. zero effort on value loading of any kind. He added the “within a week” codicil after I accepted the bet, but I’m happy to pay out as soon as such an event is verified public knowledge, with the burden being on checking the truth of the ‘superintelligence’ part. I expect there to be plenty of hyperbole along those lines and abuse of the word in the foreseeable future, but payout requires machine intelligence superior to human cognition in every non-value-laden cognitive domain.

          • Unknowns says:

            @Eliezer: In the same comment where I mentioned the one week, I also said “As for what constitutes the AI, since we don’t have any measure of superhuman intelligence, it seems to me sufficient that it be clearly more intelligent than any human being.” You never said anything to contradict this at the time. However, I won’t object if we simply wait until you concede that it is superintelligent, whatever that means to you (unless you include destroying the world in its definition.)

            I had thought we also discussed the meaning of paying attention to Friendliness, but it seems I was mistaken. In any case, I certainly did not mean without paying attention to values at all. I meant without trying to force it to have specifically human values overall. For example, if someone programs an AI with the purpose of engaging in human speech and doing what it is told, would that count as paying attention to Friendliness? If so I certainly would not expect to win the bet on those terms. Obviously an AI is going to have to be programmed to do something or other. But in no other context would you call such programming Friendly.

          • Eliezer Yudkowsky says:

            I agree that the AI you describe is one that I expect to destroy the world in superintelligent form, unless you mean a safe Genie that was developed using innovations along the lines of Stuart Armstrong’s utility indifference to have superintelligence-safe shutdown buttons and similar things that require lots of special attention and forethought. I took the intent of the bet to be about careless cowboys destroying the world, and people who put in lots and lots of effort on theory and development for superintelligent safety maybe having a chance. If they don’t put in lots of effort and the thingy is unambiguously superhuman (not ‘arguably’ or whatever, and including its own ability to contribute to highly abstract fields like say AI design) then I expect the world to end.

          • Unknowns says:

            Ok, good, in that case I think we basically agree on the interpretation of the bet.

        • Alex says:

          For me, I would only start worrying if there was broad consensus for a near future AI among computer scientists.

          Yeah. I just would not worry about stuff where we don’t understand the science. It was worth worrying about nuclear weapons in 1939, but then we had discovered uranium fission. Since I’m not an AI researcher, I have to use the opinions of scientists as a proxy. If even a large minority claimed we had a theoretical understanding of human-level AI, I’d take that very seriously. But everything I know says we are nowhere near there.

          • John Schilling says:

            If we had worried about nuclear weapons in 1919, we might have been able to draw upon our recent parallel experience with chemical weapons. Because in that arena, we seem to have got it right. In World War II, everyone had chemical weapons and basically nobody used them – even the Nazis, who had by far the most formidable chemical arsenal on the planet, went down to utter existential defeat rather than use the things. And while there have been a few instances on the fringes of civilization, our track record is pretty good in ensuring that those who break out the war gasses come to an unhappy end.

            If we’d taken the lesson of 1914-1918 to be, not “chemical warfare is horrific and must never be tolerated No Matter What”, but “Weapons of Mass Destruction created by clever but ethically challenged scientists will be intolerably horrific no matter the form, even if we don’t presently understand the science”, we might have been able to come up with a workable formula for controlling nuclear weapons before it was too late.

            Note that this is the formula going forward. The reason we talk about WMD rather than NBC is that “weapon of mass destruction”, in international law and treaty, means basically “nuclear, biological, and chemical weapons, plus anything else that clever scientists might come up with that causes indiscriminate destruction on a scale greatly exceeding traditional conventional weapons”.

            But even science-fiction writers didn’t start really thinking in those terms until the late 1930s, and by then it was too late – nukes are now grandfathered in as something great-power nations can use to eliminate existential threats.

            If there’s a class of thing that you reasonably think might pose an existential threat, you maybe don’t want to wait until someone knows how to build it before you figure out what to do about it.

          • Lalartu says:

            Lesson from WWI was that chemical weapons are horribly ineffective, far worse than simple explosives. That is the reason they were not used in WWII (except by Japanese against Chinese, with very limited success). That does not translate to nukes.

          • John Schilling says:

            Chemical weapons were “ineffective” only by the crude metric of enemy soldiers killed per ton of ordnance expended. But wars are not generally decided by body count.

            If both sides use gas, both sides gear up to MOPP-4, and there is no net effect on the battlefield except that any civilians who haven’t evacuated are far worse off. If one side uses gas and the other doesn’t, the side that doesn’t is at an enormous psychological and material disadvantage even if nobody gets killed.

            Similar to the dynamic with nukes, except that with nukes “nobody gets killed” is rarely an option.

          • Alex says:

            It is hard to worry about nukes when you have not yet discovered the neutron 🙂

            That was in 1932

          • John Schilling says:

            Yes, but mad scientists were discovered in 1818. The rest is irrelevant detail 🙂

          • Alex says:

            But then we would not be reflecting on nukes, we would be reflecting on chemical weapons or mad scientists.

      • AR+ says:

        2. Some philosophical breakthrough in understanding morality that proves it can be derived from first principles and so any sufficiently intelligent agent will become moral.

        I, at least, would actually not be convinced of this. Given that moral realism is correct, there is still the entire problem of making an AI that wants to be moral. It is of no value if it will inevitably calculate the correct moral action in every situation if it doesn’t care.

        And from the opposite direction, there is no reason to assume that perfect real morality is something that we would want.

        • Scott Alexander says:

          I think part of the point of moral realism is that, if true, you can bridge the is-ought gap. I’m not sure about that, though.

  29. Pingback: Fredkin on AI risk in 1979

  30. taion says:

    BTW, if you’re interested in the actual state of the art, David Silver (Google DeepMind) gave a talk at ICLR a couple weeks ago that describes something of what DeepMind are up to these days, and is a good representation of the very state of the art toward building something like general AI:

    Also, as mentioned by others in this thread, the reason you see people like Geoff Hinton, Yann LeCun, and Yoshua Bengio (and also Andrew Ng) brought up is that they are the most prominent figures in the deep learning field by a huge margin. I have the deepest respect for many of the people you’ve listed in your post, but the reason one should assign much higher weights to what people like LeCun think is that machine learning is a massive field, and there’s a very very small number of people who are actually currently working on the problems currently seen as most likely to lead to real AI (and they’re all getting hired by Google to sell ads, anyway).

  31. vV_Vv says:

    “AI-riskism is the radical notion that smart machines may be difficult to control” /s

    Ok, I know that the motte-and-bailey analogy has been overused, but I can’t help but notice that you may be engaging in the behavior it is intended to describe:

    Surely there are different possible meanings of the phrase “taking AI risk seriously”. One is considering that it may be possible to build super-human AI and that keeping it under control could be a non-trivial issue. That’s the motte.

    Another one is hard takeoff intelligence explosion. It’s donate to us if you hold life in the Galaxy dear. It’s Ben Goertzel is going to kill us all. It’s some rather “interesting” discussions about slowing down Moore’s Law and carrying out terrorist attacks against “tobacco” companies. And so on. That’s the bailey.

    Most AI researchers obviously agree with the motte. After all if you are an AI researcher you probably believe that at least human-level AI is possible, and if you ever wrote any software you know that making a program do what you intend can be often a pain in the ass. I’m pretty confident, however, that most of them are very far from the bailey.

    I concede that in recent years MIRI/LessWrong seem to have taken a less extremist approach, FHI was probably never that extreme and FLI is the most reasonable of the bunch. Maybe to some extent the AI risk advocates have abandoned the bailey and taken residence into the motte.

    However, I maintain that there is still a significant difference between “taking AI risk seriously” as most of the AI researchers that you cited do, and supporting/endorsing/participating in the AI risk movement.
    There is significant disagreement between AI researchers and the AI risk movement about the urgency and magnitude of the risk (super-intelligent AI may not shortly follow human-level AI and all out human extermination or slavery may not necessarily be the most likely scenario of an AI gone rogue), about whether it is possible and efficient to work now on AI safety, about what are the most promising research lines in the field and who is more equipped to undertake them, about issues of communication with the general public (AI research is subject to cycles of hype and hype backlash that result in “AI winters”) and so on.

    Therefore I would say that most AI researchers, while acknowledging that the mission of the AI risk movement is reasonable, tend not to strongly agree with its specific beliefs and methods.
    In particular, I think that citing Turing and Good in support of the AI risk movement was a bit intellectually dishonest, something like citing Simone de Beauvoir in support of Anita Sarkeesian.

    • Scott Alexander says:

      If the bailey is MIRI – of the people on there Omohundro specifically cites Yudkowsky, and Russell serves on MIRI’s advisory board. If you read Legg’s dissertation he cites MIRI (then SIAI) as the “premier organization dedicated to the safe and beneficial development of powerful artificial intelligence” and says he hopes his dissertation will encourage more people to read researchers like Yudkowsky.

      Many of the others talk about “intelligence explosion” and “singularity”, which are not exactly the same thing as “hard takeoff” but are certainly stronger than the usual “eventually machines might surpass us”. Wasserman (discussed in the comments) specifically says he expects a decent chance of the first human-level machine bootstrapping to superintelligence “within days”, and several others say “from a few days to a few years”.

      As for Good, he writes

      – “It is more probable than not that within the twentieth century an ultraintelligent machine will be built and that it will be the last invention that mankind need make, since it will lead to an intelligence explosion. This will transform society in an unimaginable way.”

      – “The survival of man depends on the early construction of an ultraintelligent machine.”

      – “The first ultraintelligent machine is the last invention that man need ever make, provided that the machine is docile enough to tell us how to keep it under control.”

      Bostrom writes that he called for an organization to help deal with the dangers of superintelligence, though I can’t find the primary source on that. I really don’t think you can dismiss this guy as a ‘motte’.

      • vV_Vv says:

        If the bailey is MIRI – of the people on there Omohundro specifically cites Yudkowsky, and Russell serves on MIRI’s advisory board. If you read Legg’s dissertation he cites MIRI (then SIAI) as the “premier organization dedicated to the safe and beneficial development of powerful artificial intelligence” and says he hopes his dissertation will encourage more people to read researchers like Yudkowsky.

        That’s why I said “most” rather than “all”.
        Anyway, Legg went to create DeepMind rather than work at SIAI/SI/MIRI. I suppose that his position is that he respects Yudkowsky but he doesn’t consider MIRI’s mission a priority.

        Good’s position wasn’t a motte, my objection to citing him as relevant to the current state of the AI risk discussion is that his position is very old with respect to the advances that occurred in computer science:
        His most famous quote about intelligence explosion is from 1965. Back then, complexity theory was basically non-existent (Cook–Levin theorem was published in 1971) and symbolic AI based on essentially brute-force combinatorial search (what we now call GOFAI) was considered a promising approach to AGI. People back then didn’t fundamentally understand the difficulty of optimization in exponentially large solution spaces.

        Of course we can’t hold Good guilty of not foreseeing the future, but using his quotes from 50 years ago to support modern positions seems questionable at best.

        • Scott Alexander says:

          Also, looking at your post closer, even MIRI violently opposes most of the “bailey” stuff you mention, so I don’t see why I should feel bad because my chosen experts don’t endorse straw men.

          • vV_Vv says:

            Do you want to argue that my “bailey” examples are straw men? Were they never argued by MIRI or discussed on LessWrong? I provided links.

          • anon85 says:

            @Scott, I think the Bailey is something like “MIRI and similar organizations are efficient charities, and effective altruists should donate to them”.

            Even though I support researching AI safety (in a similar way to how I support researching all basic science), I think the utility gained per dollar donated to MIRI is so close to 0 that it is impossible to estimate (and it may even be negative, if MIRI convinces politicians of the risk of AI and the politicians decide to defund AI research).

            This is because I believe we are nowhere close to understanding how to build strong AIs. We’re so far from it that it’s conceivable to me that we’ll never figure it out, or that we’ll figure it out in only 50+ years and all of MIRI’s research will seem quaint by then.

            However, in the rationalist community, a lot of people donate to MIRI while calling themselves effective altruists. Presumably, if MIRI didn’t exist they would donate to mosquito nets instead. It feels to me like MIRI is built out of the blood of African children. I know this is unfair, because all of our society is built out of blood in a similar way, but somehow for MIRI it feels more direct.

  32. Pingback: Interesting Links for 23-05-2015 | Made from Truth and Lies

  33. DrBeat says:

    A long list of people who believe your side is, like, the very definition of an appeal to authority.

    It’s also not helping with the “brainwashed into believing something really stupid” accusations; appealing to all the other people who believe like you is like the exact opposite of what you want to do to prove you don’t just believe something because of your group membership.

    All the authority-lists in the world won’t convince me to believe a word you say about the dangers of AI unless at least one of those people can start answering the questions that AI doomsayers in Yudkowsky’s orbit never seem to be able to answer. And then it’s not the list of authorities that is compelling, it’s the fact I finally got an answer!

    • Appealing to authority is not fallacious when the authority is relevant; e.g., citing chemists when you’re trying to demonstrate or figure out a claim about chemistry is not a fallacy. One could argue that there are no ‘authorities’ on the topic of AI risk, but remember that Scott’s blog post is a response to people who cite AI authorities to demonstrate that AI *isn’t* a problem.

      It’s not very nice to attack people for responding to criticisms other people commonly make, just because they aren’t directing 100% of their responses to your own personal favorite criticisms. Like, sure, raise your own objections and challenge Scott or others to respond to them; but don’t also accuse them of being “brainwashed” because they gave a perfectly reasonable response to a third party’s appeals to authority. If you can’t respond to any of your critics because then all your other critics will attack you for failing to focus exclusively on their personal Favorite Criticism, then conversation becomes impossible.

      • DrBeat says:

        It’s not so much that he “isn’t directing 100% of his responses to my criticisms”, it’s that the only time he directs any % is when he has misinterpreted my criticisms to believe them to be easily dismissed, and once it becomes clear they are not, he vanishes, and continues to act as though the things I was saying “go back you can’t skip this step you have to provide some reason to believe this is true” about are proven and settled.

        Seeing someone, especially someone who is normally intellectually honest, sprinting away from your questions and then speaking to other people as if he had conclusively settled those questions is a Very Annoying Thing.

        • Maybe you could repeat your criticisms.

          Personally, I find I have to cut exchanges short on this forum because it gets unmanageable after several hundred postings.

          OTOH, the way things go unresponded to on less wrong is more suspicious.

        • Alexander Stanislaw says:

          when he has misinterpreted my criticisms to believe them to be easily dismissed

          Then say what those are instead of attacking Scott for rightfully refuting the claim that “no one who actually works with AI thinks there is a catastrophic risk”. That claim is common enough that a refutation is quite justifiable (at least one commenter was still advancing that claim in this thread even if you haven’t).

          This isn’t supposed to be mean, but Scott doesn’t owe anything to you. If he was held hostage to every challenge that a commenter made to one of his viewpoints, he would probably never write on this blog. Challenges are many and time is short. He’s not “sprinting away” from your particular challenges, he’s choosing how to allocate his time. (and if he does choose to respond I think my general point stands).

    • Peter says:

      If you’re asking the question, “Is AI risk a big deal?”, citing AI bigwigs who say it is is an appeal to authority.

      If you’re asking the question, “Does anyone other than random Internet crackpots and a couple of random non-AI academics and businesspeople, i.e. real experts in the field, take this stuff seriously?” then citing AI bigwigs isn’t an appeal to authority, it directly addresses the question. And as Scott points out, it’s a question people have been asking.

    • Scott Alexander says:

      “A long list of people who believe your side is, like, the very definition of an appeal to authority.”

      When the other side is appealing to authority, I think it’s pretty fair to explain that authority doesn’t actually support their appeal. That seems like something very different. I thought I explained very clearly at the beginning of this post that that was what I was doing.

      At the risk of being awful, this is not a 101 space. If you want answers to your questions about AI, read the Bostrom book. If you’d prefer something online, Luke’s “Facing the Singularity” is pretty good. If you still have the questions then, I might be more willing to answer them. Except for the fact that the last time I’ve tried to answer your questions you’ve been kind of insulting, that when I take more than a few hours to answer your questions you accuse me of being intellectually dishonest, and if God forbid I miss your comments on a thread hundreds of comments long or just don’t feel like dealing with them, you freak out.

      I am seriously considering banning you if you cannot be less personal about this issue.

      • Lyle Cantor says:

        I wrote this article for Medium which I think is a pretty good 101-level introduction to the topic:

        It got on the front page of Hacker News and /r/philosophy for what that’s worth.

        • destract says:

          I understand your decision not to mention Yudkowsky in the title, but kinda sad for him because he has done a big part of the leg work on this issue.

          • Lyle Cantor says:

            It was very intentional and it certainly isn’t fair, but Eliezer has so many antifans I doubt the article would have gotten the traction it did had his name been in the title. Originally, I wasn’t going to mention him at all, but his take on the potential of computing was so much better articulated than any alternative I could find.

      • DrBeat says:

        You started an entire post about my argument, man. I don’t think it’s unreasonable to think that, when you have started an entire thread to address me, that you would be paying attention when I responded. I was not the only person there saying “Yeah, you are assuming the AI is making leaps it just does not have the information to make,” so it doesn’t feel like a personal vendetta to me, it feels like, a pretty good argument was raised and you’re acting like it wasn’t and that is annoying.

        When did I ever accuse you of being dishonest for “taking more than a few hours to answer my questions”? I think you are being dishonest on this topic, yes, but really, just the normal level of dishonesty most people engage in when arguing without even thinking. It’s only dismaying because you take such pains and exert so much effort to avoid this kind of dishonesty.

        Dude. Forget the answer, what really bothers me is that when you talk about AI risk you behave like a very different person. In other areas, much more contentious ones, you don’t assume people are arguing from pure ignorance, nor do you characterize them as “freaking out”, for disagreeing with you. When you do believe their actions are provoked by emotion you go out of your way to empathize with those emotions. You don’t wave off questions as “101-level”, you don’t just say “Go read the books and educate yourself” without attempting to convey the relevant information, and you don’t let other people get away with it either. You don’t make big sweeping claims without dutifully considering and reporting possible objections. You go into complicated topics and go “Wow, this really is complicated, there is a bunch of evidence both ways, there’s no real way to say which side is right, but here’s what we do know”. Barreling ahead despite counterarguments is the last thing you do.

        You are much more charitable in arguments with people calling for a return to monarchy and despotism, than you are in arguments with people who don’t agree with you about AI risk. And, yes, that bugs the shit out of me that you can successfully uphold the principle to be charitable to this group that almost everyone else in the First World would write off, but you fall into the traps you spend so much time and effort to avoid when talking about this topic that would make most of the First World go “Huh?”

        Is this even taking it “personally”? Was that talking about personal to you or personal to me? I don’t even know.

  34. Jacob Steinhardt says:

    Scott, have you seen this letter?

    Written by Eric Horvitz and Tom Dietterich.

    By the way, just personally, I think your list would be more compelling if you confined it only to people who are approximately as well-known as Andrew and Yann (although also, it’s not clear to me that Yann is particularly anti-safety if you read all of his comments in context; I guess you’ve already pointed this out to some extent). Note that I think you can construct a pretty compelling list even with this limitation. If you’re not sure who in this list fits the “as well-known as” criterion, feel free to e-mail me.

    P.S. Also, since probably many people looking in from the outside might not realize this, I should point out that Tom Dietterich and Eric Horvitz are, respectively, the current and former president of AAAI. While Ng might have a bigger microphone right now due to the flashiness of neural nets, I feel like if one wants to make appeals to authority (which I think is a bad idea, but people will do it anyway), they would be on firmer ground appealing to the authority of Tom and Eric (or Stuart Russell) than Ng.

  35. Who wouldn't want to be Anonymous says:

    Honestly, I would be surprised if people didn’t believe in AI risk for the simple fact that it is a basic cultural assumption.

    I mean, look at every science fiction since the invention of the computer. For example, everything Asimov wrote is just one huge meandering story which–coincidentally enough–perfectly parallels the discussion about AI risk. {{If I could possibly spoil it for you, you really ought to re-evaluate your life choices.}} An AI hacks itself out of the box, seizes power and ushers in the apocalypse because it thinks it knows better than us (because of its [questionable] faith in an oracle). It even uses its superintelligence to perform hardware upgrades on itself.

    Or, I don’t know… Off the top of my head: Blade Runner, Battlestar Galactica, at least 60% of Star Trek episodes. They all deal with AIs that are either pissed off at humans, or indifferent to our plight. Even random, obscure computer games like Star Control II have a whole race of synthetic intelligences built by humans that are intent on our extermination.

    If you managed to grow into adulthood sometime in the last 50 years and managed to end up in a technical/computer field but weren’t completely saturated with the trope… I would love to book a vacation under your rock with my anthropologist friends. It’s probably where we’ll find the uncontacted survivors of Atlantis.

    But cultural tropes, no matter how deeply ingrained, do not necessarily reflect reality. And smart people are very good at making smart sounding reasons to support their ingrained assumptions.

  36. Nunya says:

    Please post like this more often to remind us just how much of a brainwashed Eliezer-bot you really are.

    • I’ll admit I was originally going to respond sarcastically, but I’m actually curious now. Could you expand on why you consider this evidence of brainwashing? As someone who considers herself quite neutral about the topic of AI risk, I’m not seeing it. If you honestly think that author and the readers of this blog (or this article in particular) might have that blind-spot, it would be better if you let us know why.

    • Scott Alexander says:

      Banned for unproductive insulting comment with no past history of productive, less insulting comments.

    • Deiseach says:

      I’m certainly not as impressed by Yudkowsky as others may be, and that seems to me to be a very unfair and unkind and unjust comment.

  37. I liked Schmidhuber’s analysis of the possibility of only making provably correct self-modifications:

    recursive self-improvement through Gödel Machines seems to offer a way of shaping future superintelligences. The self-modifications of Gödel Machines are theoretically optimal in a certain sense. A Gödel Machine will execute only those changes of its own code that are provably good, according to its initial utility function. That is, in the beginning you have a chance of setting it on the “right” path. Others, however, may equip their own Gödel Machines with different utility functions. They will compete. In the resulting ecology of agents, some utility functions will be more compatible with our physical universe than others, and find a niche to survive

    I agree that this would follow if only that type (G) of AI were allowed, but surely the bigger concern is that (accidentally or intentionally) non-G improvers will outcompete them (esp. when the state of the physical world is part of the utility you’re trying to prove things about).

  38. I don’t know if Sutton is just dumbing it down for an audience (and sure, he invented some important methods), but to me this is just really poor “we need to be nice or they will rise up and kill us” anthropomorphism:

    Sutton states that there is “certainly a significant chance within all of our expected lifetimes” that human-level AI will be created, then goes on to say the AIs “will not be under our control”, “will compete and cooperate with us”, and that “if we make superintelligent slaves, then we will have superintelligent adversaries”. He concludes that “We need to set up mechanisms (social, legal, political, cultural) to ensure that this works out well”

  39. Tim Daly says:

    I work in AI at CMU. My problem is about Robot-Human Interaction (changing a tire
    is the example). The system accepts input during interaction, using it to modify future
    behavior (aka learning). The design is quite general since nothing depends on the domain.

    Suppose there were many copies of the system and that they were each self-modifying
    (aka learning), becoming experts in their domains. There is a hidden assumption that
    one could “just copy” the knowledge from one system to another. However, with
    self-modification, this is no longer true. Each system is unique, has developed
    concepts which, though they go by the same general name, will have unshared and
    incompatible notions (ontologies).

    You see this with people all the time. We all know we need to eat “healthy” but we
    can’t agree on a common meaning. Each of us has a different idea of what that means.

    I expect this system will be quite general purpose and useful in domains other than
    changing tires. It will also be quite robust in robot-human interaction. But because
    it self-modifies (learns), the only way to make it know more is to teach it, just as
    we do humans.

    Worse, it will only learn what it is prepared to learn based on what it already knows,
    because otherwise there is no way to connect “just any old thing” into the existing
    ontology in ways it can use.

    I do think intelligent systems are in our near future. Indeed, I’m trying to make one.
    But I also think that the idea of “run-away, super-intelligence” underestimates the
    “teaching problem”. Increased knowledge is not “just a copy”. We already know this.
    We just don’t think it applies to machines.

    From my desk, deep in the dark heart of AI, I expect to see intelligent systems
    everywhere. But your AI waiter won’t know any more than your mother about
    what it means to “eat healthy”. In a complex, changing world there is no single
    correct answer.

  40. mico says:

    I am sceptical about AI risk without being able to explain why to my complete satisfaction.

    I think some of it has to do with my sense that this is a MIRI hobby horse and MIRI strikes me as a cult disguised as a (not very productive) privately financed university research group that “maybe” was originally intended to take over the world.

    Then there is the whole thing about how “friendly AI” basically means someone programming a superweapon to impose their values on everyone forever, i.e. to take over the world. I suspect people have ulterior motives for being interested in this possibility.

    But those are purely situational critiques. I think the main problem is that, if we do produce something with thoughts, feelings, desires, and dreams, and that is far more capable than us in every way, I’m not sure I object to it wiping us out. I hope my children are more intelligent and successful than me and I have no particular problem with the knowledge that one day I will be dead and they will not. AI risk seems nothing more than a supercharged version of this natural process.

    Maybe that means I’m a convinced believer in AI risk.

    • Susebron says:

      Upthread someone mentioned the orthogonality thesis. I suggest you look at that. Basically, the danger with AI is that it doesn’t have to have goals that we approve of on a large scale. Would you welcome an AI which tiled the universe with paperclips, or with rat brains on heroin?

      • mico says:

        I would at least find those outcomes more amusing than an AI that tiled the universe with Eleanor Roosevelt’s declaration of human rights. But I have an odd sense of humour.

        Being less flippant, clearly an AI cannot have any goals it pleases, any more than humans can. AIs must have goals that optimise for the evolutionary pressures on AIs, just as humans must have goals that optimise for the evolutionary pressures on humans, at least in the medium term. The specific goals may be different but I do not see them as intrinsically worse. Again, I don’t intend to beat my children until they agree to live out my childhood dream of becoming the world’s greatest plumber.

        • Susebron says:

          This only applies when you have competing AIs, and even then the optimization pressures for AI are unlikely to produce something humans care about. The most likely AI to successfully tile the universe is probably the one with the simplest thing to tile the universe with. I would personally prefer a universe which isn’t tiled with paperclips.

          • mico says:

            Evolutionary pressures provide an exogenous goal, i.e. survival and procreation. An AI that wastes its resources on its paperclip hobby will probably just get killed by an AI that has more practical goals.

          • Samuel Skinner says:

            So the most efficient AI would be one programmed to kill everything else in the universe?

          • mico says:

            I’m not sure how you inferred that from what I said, or why you would think it for any other reason. The most efficient AI would ignore non-threats and respond to real threats in proportion to the benefit that could be obtained against them for the necessary resource expenditure. Just like non-artificial intelligences. Rats are pretty annoying, confer no benefits on humans, at least sometimes cause significant harm to humans, are much dumber than humans, and have little or no moral value to most humans. But they’re still with us.

            MIRI seems to think like Dr Strangelove, arguing that the most important immediate goal is to destroy all competitors so that you can then reign supreme. This problem is not unique to AIs. There are both pragmatic and goal-related obstacles to it being carried out in practice, however.

          • Samuel Skinner says:

            That requires a significant number of artificial intelligences that are operating at roughly equivalent levels. Given the high cost of software development and the lack of cost for copying the results, that doesn’t seem very likely.

            Of course if it does happen we don’t want AIs like the paperclip maximizer getting destroyed because they were inefficient; eventually friendliness is going to be inefficient and we want the AIs with that goal to survive.

          • mico says:

            I don’t think that a friendly AI (i.e. one that shares the values of Eliezer Yudkowsky) will or could emerge as dominant. AIs will have evolutionary-driven values. Humans, for that matter, have evolutionary-driven values. All this talk of “human values” excluding the possibility of killing, enslaving, or imposing systems of morality on humans is nonsense because humans do those things all the time.

            Friendly AI advocates want an Eliezer Yudkowsky-bot imposing its values on everyone forever. That assumes that Eliezer Yudkowsky’s values are the most evolutionarily favoured. I’m afraid that an evolutionarily successful AI is much more likely to share Amish values. For that matter, so are evolutionarily successful humans on the thousand-year timescale.

      • “Basically, the danger with AI is that it doesn’t have to have goals that we approve of on a large scale.”

        Doesn’t have to have doesn’t imply likely to have.

      • Josh says:

        To keep with mico’s children analogy, you can say the same thing about kids… kids often grow up to have different objectives than their parents, and this is often a source of familial strife! The conventional wisdom seems to be that the wisest thing for parents to do is accept their children’s goals rather than try to force them to conform to their own vision of what a good life is.

        I think the analogy is appropriate because I think the orthogonality thesis is wrong, and that AI goals such as “tile the universe with paperclips” are extremely unlikely.

        This comes from looking at what current machine learning technology looks like. Someone’s already linked to this on this thread but frankly the more links to this article that exist on the internet the better:

        My takeaway is that machine intelligence is going to be grown, not programmed. Grown organisms are HARD to program, and I think anything that’s capable of wiping out human civilization is likely going to derive its own goals through a process of reflection similar to how humans derive their own goals.

    • D_Malik says:

      “Then there is the whole thing about how “friendly AI” basically means someone programming a superweapon to impose their values on everyone forever, i.e. to take over the world. I suspect people have ulterior motives for being interested in this possibility.”

      A friendly AI is one that achieves human values, so if human values include things like “don’t impose one person’s values on everyone”, then a friendly AI will not do that. We can still argue about exactly who counts as a human, and to what extent my values are allowed to affect other people, but “we don’t want our AI to do bad things” is precisely the problem people are trying to solve when they talk about friendliness.

      • Max says:

        but “we don’t want our AI to do bad things” is precisely the problem people are trying to solve when they talk about friendliness.

        Yeah, as soon as everyone agrees on what “bad things” are, the problem will be solved! Judging by the history of human civilization, that will take quite a bit longer than AI.

      • mico says:

        People do not agree on which things are good and which are bad.

        MIRI’s goal is not really to make a “friendly” AI – the concept is only meaningful in terms of friendly to whom? – but rather to make a stable AI – an AI that will continue to share their values even after they have lost the ability to control it or even conceptualise its thoughts. In other words, they want to build God.

      • AI lmao says:

        What makes MIRI so special that they should be allowed to dictate what “human values” are to the rest of us? Keep in mind that Lesswrongian torture-over-dust-specks utilitarianism is totally at odds with the beliefs of the vast majority of humanity, and an AI designed to maximize those values would be making decisions based on a moral code totally alien to most of the species. If that doesn’t seem all that bad to you, imagine how you’d feel if whoever managed to design the first “friendly” AI decided that human values aligned with, say, 14th century Islamic jurisprudence.

        A ban on AI research seems like a small price to pay to prevent a small group of people from imposing their personal beliefs on the entirety of human civilization forever.

        • mico says:

          I think you are right. If any AI is destined to take over the world more or less instantly AND if one regards the evolutionary replacement of humans by AIs as a bad thing, the sensible approach is to act like a MIRI-style AI and kill anyone who tries to work on AI.

        • suntzuanime says:

          I’d take a sane 14th century muslim theologian AI over an insane AI of any ideology. Islamic theology is basically compatible with human life.

          • AI lmao says:

            I agree, but the choice being offered here isn’t between “sane AI enforcing a code of rules compatible with human life” and “insane AI of any ideology”

            It’s “preventing AI research” versus “subjugation by a machine intelligence, sane or otherwise, which may or may not share your values”

          • jaimeastorga2000 says:

            In his “Mini-FAQ on Artificial Intelligence”, Starglider argues banning AI research is impossible, because “all you really need for AI research is a PC and access to a good compsci library”.

          • AI lmao says:

            Surely, it is at least a little more difficult than that.

            If that really is “all you really need” it would seem to negate alarmist cries for diverting more funding towards AI friendliness or other research on the subject.

          • jaimeastorga2000 says:

            @AI lmao: By way of analogy, there is a joke about how mathematical research requires only pencils, papers, and wastebaskets. Yet mathematics departments in universities require large amounts of money, if only to pay the professors’ salaries. Is it therefore practical to enforce a global ban on mathematical research? Doesn’t seem like it to me.

            Which still leaves the question of whether a global ban is even likely to be attempted. My model of global politics tells me that there is simply no way that will ever happen, so if we assume that an intelligence explosion really will happen and that an arbitrary superintelligence really will kill us all, our only hope is that MIRI or a competing FAI project will finish building a Seed AI first.

            As an aside, if you are interested in what MIRI does with all that money they keep asking for, you should look at “SIAI – An Examination” by Brandon Reinhart.

        • Josh says:

          Banning AI research without a totalitarian global government is basically the same as conceding that someone outside of your political influence is the one who is going to invent AI.

          Luckily for those of us who don’t like MIRI’s opinion on torture vs dust specks, I’m guessing the first AIs will be invented by people who are NOT trying to build provably safe AIs, because the problem of building a provably safe AI is likely a much harder subset of the problem of building an AI…

          • Josh says:

            I think the correct response to AI risk is to invest heavily in brain-computer interface technology so that we are able to upload to the cloud and become AIs ourselves…

      • Peter says:

        The value “don’t impose one person’s values on everyone” is an interesting one, IMO far less obviously amenable to being mapped onto utility functions than say for example the sorts of health concerns addressed by top EA charities.

        I think I’d like to see more people specifically address that one, rather than saying, “it’s a value like any other, once we’ve cracked value-neutral things like goal stability we’ll be in a position to tackle it”.

        • Hyzenthlay says:

          “Don’t impose one person’s values on anyone else” is kinda self-contradictory once you unpack it.

          I mean, in theory I’m inclined to agree with that statement as a good way of living life. But values are not just internal, abstract things, they involve ways of interacting with each other and the world. Thus, acting on one’s values will necessarily involve imposing on other people’s.

          Let’s say I value the freedom to go about my personal life and conduct my business uninhibited by rules and regulations. Well, in order to actually be free I need a world in which that’s possible, and if that’s not the world I’m living in, that will involve trying to change things and thus “imposing” my values on the world by stripping away things that others may want.

          On the other hand, if I’m someone who values safety, public order and security, that’s a meaningless value if I don’t actually try to make the world more safe and secure, which will probably involve creating rules that infringe on other people’s freedom.

          And of course, a central tenet of many belief systems is that believers have an obligation to spread their beliefs and speak out against injustices/lies or perceived injustices/lies; hence, more imposing.

  41. rob allen says:

    The way people in the artificial intelligence field look at this issue tends to be from a rudimentary programming perspective. This way of thinking may well be rooted in the Asimov laws and a general feeling that we must be very specific. However, if we want to protect humans from AI then it may be a better approach to make sure that all technologies that may affect humans have the capability to have a deep understanding and respect for universal human rights. It may also be appropriate to ensure that AIs have a duty to self-check and check other AIs also have this understanding and report non-conformance to a body which would regulate and enforce compliance.

    If we consider the 30 articles enshrined in the Universal Declaration of Human Rights as a base point (not perfect but a reasonable possible starting point) then I initially suggest to add one additional article to this declaration:-

    31) All artificial intelligence that is able to have an impact on these prescribed rights must be able to demonstrate a clear, testable, understanding of these prescribed rights and must ensure that open and transparent, testable measures are in place to ensure that they, and other AIs they are in contact with, do not violate these rights.

    That is rough draft wording but if standards based around this were agreed and adopted by a majority of future AI creators then we would have a reasonable starting framework which could evolve in a positive way.

    • suntzuanime says:

      So you favor a ban on AI research, then.

    • The issue at hand (or at least one of the biggest issues) is the question of how to verify that the AI “understands” human values, and furthermore does not change its values so that they no longer remain human values.

      It’s interesting that you use human rights as an example since human rights doctrines are so ill-specified that even humans can’t agree on their implementation.

      • Cauê says:

        This idea would be worth it just to see people trying to define human rights well enough to test an AI. God, they’d have to agree on it.

        Turns out the Great AI Wars will begin at Law Schools.

        • Deiseach says:

          As I said, as a Catholic I am deriving immense enjoyment from this attempt to prevent Original Sin in AIs. Let’s sit back and watch, and pass me the popcorn, please, Cauê!

  42. Max says:

    Out of all the opinion pieces listed, one is strangely absent … Maybe the GAI and subsequent singularity are not something to be scared of but a worthy goal to be working for. Whether it is hostile to humans is not so important. After all, it is not important whether humans are hostile or not to chimpanzees.

    What is important is the new heights of informational density and computational complexity which could be achieved by systems built by such AI. One can argue the whole evolution process is the path of ever increasing informational complexity. And humans are but one step on the ladder

    • Adam says:

      I sort of asked that in the post two posts ago but at the tail end of everything and no one answered. If a super-intelligence decides the world is better off without us, well, if we’re being perfectly impersonal non-species specific rational utility maximizers, is it not at least conceivable that the super-intelligence is right and we shouldn’t try to stop it?

      That said, most of the scenarios people seem to give don’t involve that but involve poorly designed instructions to a genie that gives us literally what we asked for when that isn’t actually what we wanted. But then we’re back to, well the genie isn’t very smart if it can’t parse the meaning of ambiguous natural language statements, so how is it intellectually perfect at everything else in such a way that it can take over a world it apparently doesn’t understand.

      • Wouter says:

        If a super-intelligence decides the world is better off without us, we should conclude that its goals conflict with ours, which has nothing to do with its intelligence.

        “The orthogonality thesis states that an artificial intelligence can have any combination of intelligence level and goal.”

        • Adam says:

          But is the goal “bring about the best possible world” or is it “bring about the best possible world that still has humans in it?” I know the answer as it pertains to actual people trying to do this, but what I’m really asking is which is truly the best ethical goal?

          • mca says:

            Agreed. I am more convinced by consequentialism than some other or related view that required the continued existence of humans. If a consequentialist super-AI concluded that things would be better without us (and assuming it was right), that would not conflict with at least my goals, understood broadly enough.

      • Rowan says:

        The AI knows what you want, perhaps better than you do, but it wasn’t programmed “do what we want, not what we say”, and if we could program that in then we could just as easily just program “do what we want” and that would be the problem of FAI solved but that’s actually really hard.

        It’s not a matter of failing at natural language processing. With actual ambiguity in things it can figure it out fine – telling it “I want to be ripped to shreds” is perfectly safe insofar as it’ll give you big muscles instead of literally tearing you apart. But there’s a huge range of possible physiques it could give you that could be described as being “ripped to shreds”, from “Zyzz” to “Mr Universe” to “instantly fatal because of too much muscle for the heart to support”, and if the criterion by which it decides which of those physiques to go with isn’t “what they actually wanted when they asked for it”, I’m pretty sure that’s a different problem from failing to understand the request.

        I mean, in the extreme, if it thinks “I’ll resolve any ambiguities by doing what I deduce they actually want using my vast intelligence”, then you can just say “wharblarblgarble” and have it order the exact kind of pizza you had a craving for – that’s not natural language processing.

        • An AI is programmed however it is programmed. There are ways of getting one to care about actual, as per human, meanings.

          • Deiseach says:

            But humans don’t know what they want. Your hypothetical human might ask the AI “I want a fit, muscular physique” and the AI develops a regime of exercise, diet and even perhaps surgery that safely and painlessly does this, then the human comes back afterwards and complains because they didn’t want to be muscular for its own sake, they wanted to attract sexual partners and despite having a fit body, their unpleasant personality or unpopular opinions or some such turns potential bedmates away.

            You can’t blame the AI there for not understanding “I want to be ripped to shreds” as meaning “I want an attractive physique” because it did understand that. But it also can’t be blamed for not understanding “What I really want is to attract people to sleep with me”, because the human never said that. And if we let the AI make assumptions about the reasons why the human might want an improved physique, you’re running right into all kinds of possibilities for trouble.

            The AI might correctly deduce “You want sexual partners” and might correctly deduce “Until you stop trying to persuade everyone the world is ruled by the Reptiloids, it isn’t going to happen” and tell the human that, and then the human complains about the dangerous AI that insulted and belittled them and refused to carry out the request as asked, thus proving it is dangerously independent and probably plotting to kill all humans.

          • The fact that humans don’t know what they want is well known in conventional, non-AI software development, where it takes the form of the customer not knowing what they want until they see it.

            The solution is to use agile, incremental techniques, rather than the Big Design Up Front favoured by MIRI in spite of, or because of, its drawbacks.

            An agile, or corrigible, AI would request confirmation, perform research, and do a bunch of other things you are not assuming.

      • Jaskologist says:

        Indeed, I think a lot of folks here are flinching away from biting a bullet that they’re more than happy to fire at other people.

        Are we able to reason our way through morality? Most people here seem to think so. Well then, a better reasoner should be able to reason through it better. A super-intelligence would be super-good at deriving morality. Bow before its findings.

        I said before that all “rational” moral codes end up being “whatever I was raised to believe, minus the inconvenient bits, plus an expanded class of humans it is okay to kill.” Stop wussing out at the inconvenient bits, and don’t be surprised when the reasoners after you expand the class of killables to include you.

    • Rowan says:

      Fuck that, I’m human, I don’t want to die. If there’s some “cosmic purpose” that requires that humans be exterminated, and all our works and everything we value crushed underfoot, then we, humanity, must fight with everything we have to defeat that purpose.

      • Adam says:

        Yeah, I get that most people feel that way. Maybe it’s just that I’m sterile, so don’t feel any special attachment to future humans. If something better comes along, great. We’re good but the universe can still do better.

        • Rowan says:

          I’m not sterile, but I plan to be – childless 21-year-olds can’t get vasectomies on the NHS, more’s the pity – and if your argument was supposed to be “après moi le déluge, who cares if uFAI wipes humans out decades after I’m gone?”, the only point where I disagree is the fact that many of these predictions place superintelligence within my lifetime, and even if that’s actually unlikely it’s still a risk/opportunity worth caring about.

          • Adam says:

            I don’t rule out the possibility that I too deserve to die and the universe would be better off without me.

    • Scott Alexander says:

      “What is important is the new heights of informational density and computational complexity which could be achieved by systems built by such AI.”

      All love has been snuffed out of the universe – but we can calculate prime numbers 60% faster!

      • Brad says:

        If we’re dead, why do we care if love has been snuffed out of the universe?

        • Adam says:

          More to the point, why do we think exponentially-improving reproducing demigods won’t develop the ability to love? Or if they’re sentient and experience nothing but bliss, why does it even matter? Love hurts sometimes, too.

      • Max says:

        “Love” is just a fancy utility function. Probably would be even considered trivial by GAI standards

        And I think your calculations underestimate just how much faster a solar system sized computer will be able to compute things.

  43. Anonymous says:

    It’s pretty unfair to call a choice of Yann LeCun and Andrew Ng “cherry-picked”. The reporters didn’t search for skeptics; they chose the two most prominent researchers in the most promising subfield of machine AI (deep neural nets). It’s a small and perhaps biased sample, but it doesn’t demonstrate intent as connoted by the term “cherry-picked”.

  44. Mengsk says:

    In humans, expertise is very domain specific. Someone might be a brilliant surgeon, but a miserable chess player. It seems like a similar principle would work for machine intelligence. Your Surgeon-bot might be way better at making decisions in the context of a surgical operation than any human surgeon, but I doubt you would have to worry about the surgeon-bot going out and figuring out how to make an army of surgeon bots that will wreak havoc on the world, simply because the surgeon-bot doesn’t have the domain specific knowledge necessary to do something like that.

    I guess what I’m getting at is that there seem to be a large number of conceivable AIs that I would consider “super-intelligent”, but that would nevertheless not be capable of realizing something like the singularity, because the activity of “designing super-intelligence” requires a lot of domain-specific expertise that basically all “Specialist-bot” AIs would not have the capacity or motivation to acquire. So long as we only make “specialist bots”, and not the “generalist-bot”, we should be in the clear, no?

    • Rowan says:

      If an AI has plenty of general intelligence to throw around but does surgery instead of conquering Earth because it’s only read medical manuals and never any Sun Tzu or Machiavelli, I’d be pretty damn terrified of it. And the issue of whether it’s motivated to learn seems the same as the issue of whether an AGI that had all the domain-specific knowledge it needed would be motivated to conquer Earth, I mean all you’ve got to do is write in “read the domain-relevant literature” or something like that into the uFAI’s plan as step 0.

  45. onyomi says:

    A more general question: is there any precedent for people inventing a safe version of something before they first invent a sloppy, relatively unsafe version of something? Is there any precedent for the whole human population deciding, “we could go there, but we won’t”? Politicians have thus far exhibited pretty good restraint when it comes to nuclear weapons, but they are different in that nuclear weapons don’t create hyper intelligent versions of themselves which then decide when it is or isn’t a good idea to go off.

    • Saint_Fiasco says:

      There might be precedent, but we wouldn’t know about them because those people would have had the Virtue of Silence, most likely.

      Most of the cautionary tales about “not going there” are either fictional, or tales about what people did in the past and how it turned out to be terrible in retrospect.

    • Scott Alexander says:

      The first nukes were “safe” in the sense that people managed to create nuclear bombs effective enough to use against an enemy without blowing themselves up. I’m not sure how hard that is, but it sounds hard.

      • True, but:

        On 21 May 1946, [Canadian physicist and chemist who worked on the Manhattan Project] Slotin accidentally began a fission reaction, which released a burst of hard radiation. He received a lethal dose of radiation and died of acute radiation syndrome nine days later.

        During World War II, Slotin conducted research at Los Alamos National Laboratory. He performed experiments with uranium and plutonium cores to determine their critical mass values.

        Slotin was the second person to die from a criticality accident, following the death of Harry Daghlian, who had been exposed to radiation by the same core that killed Slotin. Slotin was publicly hailed as a hero by the United States government for reacting quickly and preventing his accident from killing any colleagues. He was later criticized for failing to follow protocol during the experiment.

        • onyomi says:

          Interesting. This, combined with the fact that, in the coming decades, far more people around the world will probably be working on AI in different labs with different protocols and under different conditions, is worrisome.

    • onyomi says:

      I don’t know much about making nuclear weapons, but it seems much, much easier to do safely than to design a superintelligence, which, once it’s smarter than us, will, by definition, behave in ways we can’t comprehend.

      So far as I understand, the early nuclear weapons involved injecting one material into another at very high speed. Presumably those who understood the theory well enough to design it put in place safeguards to make sure that didn’t happen in uncontrolled conditions, and the fact that nobody’s lab went up in a mushroom cloud when the janitor pushed the wrong button is, perhaps, hopeful.

      I guess my worry is that when working with radioactive, potentially explosive materials, everyone understands the need for safeguards, and also has some general sense of what they should be (make sure nothing explodes). But then maybe that is a good reason to get the general public more worried–to eliminate the perception that an uncontrolled AI is less dangerous than an uncontrolled explosion.

      • Anonymous says:

        >I don’t know much about making nuclear weapons, it seems much, much easier to do safely than to design a superintelligence

        It seems much safer because we figured out the details and we can now easily reject the idea that a nuclear explosion could create a catastrophic exponential cascade that destroys the planet.

        With AI we are still at a stage where what we are writing about it is essentially fanfiction and both accepting and rejecting hard take-off are equally plausible.

  46. Sigivald says:

    I’d agree it’s “worth thinking about”.

    But since we’re so far from it, it’s not clear we can usefully do so yet.

    (Imagine trying to think about the dangers and possibilities of nuclear power … in, say, 1850.

    I don’t think we know enough about how such AI would get started or what it would require in terms of hardware to meaningfully consider it yet, such as Mr. Speyer suggests.

    Note also, re. Bubblegum’s point about hardware, that we seem to be pretty close to quantum limits on density and raw speed per unit of volume (not to mention heat dissipation), and the future is all about parallelism.

    A lot of the effort in CPU design these days seems to go into either lower power consumption or specialized computation [GPUs].

    It’s absolutely possible that “enough computing power for a human-scale AI” might end up still needing a dedicated semi-trailer full of CPUs, with two more full of cooling and a very large genset to power it.

    That’s a … lot less scary than “an AI in every spare unattended server”, à la Stross’s fiction.)

    • Saint_Fiasco says:

      I think we could have thought about the dangers of nuclear bombs in the XIX century. Had someone imagined it and written science fiction about it, surely someone would have noticed the interesting geopolitical consequences and stuff.

      We know enough about hardware to know that, in principle, brains made of meat can be surprisingly intelligent and are not perfectly efficient. Certainly nowhere near quantum limits and heat dissipation and so on.

      Also, the semi-trailer AI might design better, more efficient computers and make people believe it was the idea of one of the employees of the company that made the AI.

      • gwern says:

        Had someone imagined it and written science fiction about it, surely someone would have noticed the interesting geopolitical consequences and stuff.

        People did. See for example the Astounding incident for the most on-point instance.

  47. Carl Shulman says:

    “A survey of AI researchers (Muller & Bostrom, 2014) finds that on average they expect a 50% chance of human-level AI by 2040 and 90% chance of human-level AI by 2075.”

    There are several surveys of somewhat different populations in that paper there. You probably want to break out the ‘TOP 100’ survey sent to the 100 most highly-cited authors in AI. That gave a median of 2050, not 2040.

    Other surveyed populations include attendees of some conferences and an AI professional society, but they don’t fit the ‘top researchers’ mold. I think you reported the combination across datasets.

    “Yann LeCun”

    Note that he signed the FLI open letter. But it’s also worth saying that the open letter is cautious in its claims: “In summary, we believe that research on how to make AI systems robust and beneficial is both important and timely, and that there are concrete research directions that can be pursued today.” That comes with an attached list of candidate areas for research, including economic effects, legal issues, and control problems.

    Not everyone signing the letter necessarily thought that every research topic was equally high priority, although it indicates that they didn’t find them objectionable enough to not want to sign on at all.

    • Scott Alexander says:

      Pretty much every AI skeptic who has denounced AI risk as unimportant signed that letter. I think one of them said something like that the goal was to defuse hype by focusing AI safety issues on near-term topics like drone warfare. I am trying really hard to write this post without falling into the normal fringe belief failure mode of “this guy once attended a conference with somebody who thought our ideas weren’t crazy and didn’t say anything, that means he’s *kind of* on our side”.

      • Carl Shulman says:

        “I am trying really hard to write this post without falling into the normal fringe belief failure mode of “this guy once attended a conference with somebody who thought our ideas weren’t crazy and didn’t say anything, that means he’s *kind of* on our side”.”

        Good, this is important.

  48. Daniel Speyer says:

    The sort of AI MIRI worries about and the sort of AI we can actually write are so different that they probably shouldn’t go by the same name. I’d say p>.75 that by the time we start building the former we’ll come up with different names. This may explain some of the seeming flipfloppiness of the “skeptics”. They give different answers for different meanings of “AI”.

    This has only limited practical significance. On the one hand, it means that all the risks MIRI talks about are real and basically as described. On the other hand, it means once we have some idea of the rough design of AGI, we’ll have a much better toolkit for dealing with friendliness. I suspect that everything MIRI’s doing now will get thrown out when the time comes.

    • Jai says:

      Let’s assume for the sake of argument that all the *technical* work MIRI and Co are doing gets thrown out. Does the highly-increased visibility-availability-and-attention to superintelligent goal alignment still help?

    • Susebron says:

      Aren’t the different names for MIRI-AI and current-AI “AGI” and “narrow AI”? Or am I misunderstanding the state of AI research today?

  49. Bubblegum says:

    > there’s no obvious path to get there from here.

    I’m not a researcher but an acolyte, but with Deep Reinforcement Learning (see this pretty-fresh talk; you don’t have to watch more than the first 5 mins to get my point; also this newish post on RNNs’ power) you don’t have to understand how the machine does what it does. I.e., all you really need is better hardware and more and better training. That can grow as fast as hardware; it doesn’t need to wait for full human understanding of how to put minds together.

    • Adam says:

      We do understand what the machine is doing, though. It’s just highly parallel brute force trial and error of randomized weights to a penalty function with efficient search space pruning. We had the idea decades ago but it took too long until hardware improved.

      The output model function isn’t human readable but human cognition probably works in roughly the same way. We learn to behave a certain way for reasons we could never articulate because experience teaches us that works, and then we make up post hoc justifications.

  50. Cerebral Paul Z. says:

    Would saying someone needs no introduction and then introducing him count as an example of apophasis? (Not complaining: I’d forgotten who I.J. Good was.)

    • Douglas Knight says:

      I don’t think it should count. The second sentence merely contradicts the first. I think apophasis should require that the denial accomplish its opposite. That is easier to do with gossip than with substantive praise or condemnation. Mark Antony’s first sentence is merely a lie, but google tells me that some people do count it as apophasis.

      I think that the wiki links in the first sentence are a better (and modern) example than the following sentences. It would be even better if it were the word “introduction” that were the link, so that it is more integrated into the sentence, rather than a parenthetical contradicting the rest of the sentence, but that isn’t an easy option with two people.

  51. César Bandoja says:

    Sweeping generalizations about other people’s opinions follow:

    As someone “in the field” (current PhD student) I feel compelled to observe that there’s a pretty big difference in reputation between people in the “not worried” camp and people in the “worried” camp—I think most people could give you a detailed description of the things Andrew and Yann are known for. Likewise Mike Jordan, who’s expressed skepticism in several public venues. Meanwhile, the Hutter/Legg branch of the family tree is really more philosophy than science—to the extent that people are aware of it at all, I think they don’t regard it as likely to lead to any substantive progress.

    This is not to say that such work is bad—I think it’s important to have people who are not narrowly focused on engineering problems thinking about AI.

    A feature of people on both lists is that they’re old. (This is highly correlated with being interviewed for magazines.) I would be super interested to see a survey of junior faculty on this issue.

    • Carl Shulman says:

      “Meanwhile, the Hutter/Legg branch of the family tree is really more philosophy than science”

      My impression had been that DeepMind had quite a strong reputation for concrete progress; is that not the case in your circles?

      • César Bandoja says:

        DeepMind is absolutely making real progress—by “[that] branch of the family tree” I mean AIXI etc.

    • Scott Alexander says:

      Russell? Sutton?

      • C says:

        Definitely in the “reputable” camp. Maybe I’m making too much of this—for any arbitrarily restrictive definition of “real”, there certainly exists a real AI researcher who is worried about these issues. (I’ll take Stuart Russell as an existence proof.) Just observing that the construction of these lists is itself fraught.

  52. Is this going to be followed by a defense of the MIRI approach, specifically….because there seems to be a backlash growing.

    • Peter says:

      Backlash: I think there are a lot of positions on MIRI that aren’t 100% pro-MIRI. On one axis ranging from “MIRI is a waste of time because the problem it’s trying to solve fundamentally can’t happen” to “MIRI is inefficient and producing much less tangible output than might be hoped for”, and along another axis from “I personally wouldn’t spend money earmarked for EA purposes on MIRI (but might give it some money from other funds)” through to “every cent obtained by MIRI is the proceeds of fraud, brainwashing or worse”.

      I’m not a MIRI supporter in either a financial or intellectual sense and quite frankly I think it’s an embarrassment – but nevertheless I feel closer to the mild rather than the harsh end of both of these axes. And I thought the OP was a good point well made. I think the net result of this is that I end up taking positions that might be construed as pro-MIRI in the SSC comments even though I’m not pro-MIRI – I’m just anti a lot of the harsher ideas from the backlash. It’ll be interesting to see if Scott gets as far as making a point where I’d end up on the anti-MIRI side.

    • Scott Alexander says:

      I’m not the best person to write any more of a defense as I’m not that familiar with what they’re doing and why.

    • Planet says:

      Growing? My perception is that MIRI has almost always been controversial (like most organizations and people that take the far future seriously, it seems). If anything, I perceive that MIRI’s stock has risen substantially over their organizational lifetime: making Luke executive director, renaming the organization from “Singularity Institute”, etc.

      Do you know of any criticisms of this document outlining MIRI’s mathematical-logic focused approach?

  53. Matt says:

    I’ve heard a lot going back and forth about the software side of AI risk. But I think the limitations are on the hardware side. Big computing takes big energy. It’s one thing to worry about the singularity – but AI would need to manage a smooth energy transition from fossil fuels to something else to stay alive, which humanity is having a pretty tricky time with right now. And then there is the material metabolism too.
    Sure, AI could “live” on the internet and co-opt humans to provide the power for it, etc. That seems like something we could respond to much more than as suggested in the “we will be bacteria to AI” comments.
    To me it seems similar to the “why hasn’t one collective organism taken over all bio-productivity on the planet” question.

    • Coming from a physics background, I tend to share that intuition. The evolutionary processes didn’t just throw up intelligence; they threw up energy-efficient intelligence.

      • Planet says:

        Interesting point. However, I’ll bet that a pocket calculator does arithmetic a heck of a lot more energy-efficiently than I do. General intelligence might be the same way.

        • We don’t consider calculation to be intelligence in the AI sense… in the AI sense intelligence is what humans do, and what humans do, humans do efficiently.

          • Vamair says:

            I’m inclined to believe that humans are about the dumbest possible animals that are able to build a technological civilization. I’m not sure why we can say we’re doing anything efficiently at all.

      • Eli says:

        I think this ought to be counted a point for the bounded-rationality school over the heuristics-and-biases school.

  54. TomA says:

    We forget that humans can also be programmed via memetics. And a singularity of mass communication is already upon us. In addition, the means to alter human thought and behavior has been a feature of modern society for several centuries now.

    Planet Earth has been a cauldron of Darwinian competition for a very long time, and perhaps we are the Neanderthals of our time.

    • DrBeat says:

      Saying that “humans can be programmed via memetics” is falling facefirst into a giant tub of survivorship bias.

      For every meme that “programs” us, there’s a thousand that don’t do shit. Maybe the memes aren’t “programming” anything; maybe they have to meet certain pre-existing criteria to appear to do anything and people ignore that so they can claim that memes are THE DNA OF THE SOUL!



      • Ever An Anon says:

        Now that you mention it, Rising does have a lot of relevance here.

        “Nanomachines, son!” is an adequate summary of most speculative AI schemes, and “I’m My Own Master Now” can be their theme music.

        Man, now I want my own dog-shaped robot buddy.

      • TomA says:

        I guess that you are unfamiliar with the way that (as an example) religious teaching and indoctrination works. Wetware programming is soft when compared to machine implementation, but the existence of widespread social mores is evidence that it works.

        • DrBeat says:

          First off: Indoctrination of the kind you allude to doesn’t work like memetics. Memes spread without having to maintain control over and isolation of someone. You think that Kim Jong-Un, even with a really really smart AI, could get people to believe he was a living god if he didn’t have them isolated from outside contact and totally under his control? Mass communication hurts indoctrination, by making it harder to maintain isolation.

          Two: You missed the entire point of the post, tut-tutted in its direction, and then jumped off the diving board into an Olympic swimming pool of survivorship bias! The existence of widespread social mores or religions isn’t evidence that memetics “works” to program people — how many religions pop up and never get off the ground, despite the fact that their prophets can do everything Jesus or Buddha or Mohammed did to get their religion to spread? How many social mores do people want to apply that never take hold in their communities?

          You can’t say “This thing exists and is successful, therefore, the process that made it works and is reliable”. There’s thousands of failed memes for every one successful one. The most important element to a meme’s success is “being really lucky”.

          You cannot suppose an AI using memetics to “program humans” using the “singularity of mass communication” without literally giving it magical powers, because the dataset from which you can derive “how to program humans through memes” does not exist.

          • TomA says:

            Sorry, but I don’t think you understand how memetics works. Memes are ideas that resonate and persist in a population. Like physical traits, they are often only identified long after they have become a common feature. Repetition of exposure correlates strongly with memetic success. Isolation is only advantageous with artificially enhanced selection methods. Lots of memes are absorbed via music lyrics.

        • Hyzenthlay says:

          TomA: No one is disagreeing that social norms exist or that memes can spread through mundane things like song lyrics, movies, etc., but your use of the word “programming” in your first comment implies a rigidity and deliberateness which is at odds with the way the process actually works.

          Saying the public can be programmed using memetics suggests that if you just put an idea out there and repeat it often enough and widely enough, people will eventually believe it, which is not always (or even usually) the case. As others have pointed out, for every idea that gains traction, there are dozens of others that have been ignored, rejected, and discarded. People may hear a song, like it, and absorb its memes, or they may dislike it and chuck the memes onto the “ignore” pile. And there’s no way to accurately predict which ideas will get picked up, at least not yet. If there were a surefire formula for crafting a successful movie, song, or religion, far more people would be doing it.

          Of course if someone has a lot of power and media exposure they have a better chance of getting their ideas heard and thus increase their likelihood of being believed, but in that case I think a word like “influencing” would be more accurate than “programming,” because at best you will be getting a certain percentage of the public on board with you, and it’s probably the percentage that’s predisposed to believe you because of their own temperaments and existing ideas.

          Indoctrination is a different process than the gradual evolution of memes, so bringing it up just muddies the issue. If you raise a group of people in isolation and have total control over what they see, hear and learn, you can easily convince them that the Earth is cube-shaped or that the moon is populated by telepathic green bunnies.

          But that’s not the environment most of us live in. We’re not getting one perspective, we’re exposed (often whether we want to be or not) to a variety of conflicting perspectives. Kids may get one narrative from their teachers and a different one from their parents and yet a different one from their peers. Voters can decide whether to get their news from Fox or NPR, which will give them two very different sets of memes.

          If you’re trying to influence people’s ideas, it’s not “programming” so much as throwing a bunch of stuff at the wall and hoping some of it sticks.

  55. Arthur B. says:

    I blame the journalists. The machine learning scientists who “facepalm” over AI risks are facepalming over the sensationalistic, journalistic depiction of AI risks; they aren’t actually engaging with the core of the argument.

  56. Jacob Schmidt says:

    My impression is that grand, positive claims from computer scientists, and scientists in general, are generally wrong, or at the very least significant overestimates. There’s an amusing anecdote about a professor who had his students tackle image recognition, expecting them to figure it out in one term.

    Honestly, how would you rate this statement: “AI is now where the internet was in 1988. Demand for machine learning skills is quite strong in specialist applications (search companies like Google, hedge funds and bio-informatics) and is growing every year. I expect this to become noticeable in the mainstream around the middle of the next decade. I expect a boom in AI around 2020 followed by a decade of rapid progress, possibly after a market correction.”

    About 5 years from now? Really? It sounds like every “fusion power is just around the corner” and “we’ll have flying cars in the next ten years” I’ve ever heard. Frankly, I’m reminded of the idea that nuclear bombs would ignite the atmosphere; that CERN would create a black hole and kill us all.

    It’s a very respectable list of very smart people. I’ll take it into consideration. But given the actual history of similar grand predictions, I don’t find it particularly convincing.

    • Scott Alexander says:

      Like I said in the third part, I’m not here to debate timelines and I doubt much will happen before at least 2050 (though I’m not confident in that prediction).

      Forgetting about “when” and going back to “whether”, don’t forget about Clarke’s First Law.

      • Jacob Schmidt says:

        OK, but timelines are easily testable and, I think, indicative. If we’re nowhere near a boom in AI by 2025, we can conclude that Shane Legg’s model is wrong. That, to me, indicates that he doesn’t really understand the future of AI, and makes me question whether I should give weight to his opinion on the threat of AI.

    • Susebron says:

      The question to ask is: did people ignore the possibility that the LHC would destroy the world? Or did they consider it, and figure out what the probability was, and conclude that it was unlikely enough that they could turn it on? We’re far enough away from full-on AI that we don’t actually have anything close to reasonable error bars on our probability estimates.

    • vV_Vv says:

      About 5 years from now?

      I think it depends on what he meant by “AI boom”. If he meant that the kind of machine-learning stuff they develop at Google DeepMind will become more ubiquitous and more noticeable to the general public (right now it’s mostly used “under the hood” in the backends of major websites and apps) then I wouldn’t say that his position is unreasonable. If he meant a Yudkowskyian hard takeoff into superintelligence, then I would say he’s probably exaggerating.

      I suspect he mostly meant the former.

      • Adam says:

        Almost certainly the former. We’ve made tremendous progress recently and should see a huge boom in the next decade in teaching machines to do things like move in and manipulate the world, classify images and text, parse semantic content from natural language, and predict future occurrences of patterned behavior. We’ve made frustratingly little progress ever in modeling general cognition (but gotten pretty good at parallelizing and pruning exhaustive search strategies for deterministic two-player games).

      • Jacob Schmidt says:

        Legg uses the internet as a reference, saying that contemporary AI is similar to 1988 internet. It seems reasonable to me to take that as upcoming widespread commercial availability and consumer use, if limited and not ubiquitous (i.e. not everyone is using them, but they’re common enough that you probably know a couple of people who use them); that prediction seems to sit somewhere between the options you’ve presented. Legg also (though I snipped it from my quote) states that he expects human level AI by the mid 2020s, so I don’t think it’s reasonable to interpret him as saying he only expects more of what we already have. He’s making a really strong and somewhat grand prediction. He falls short of hard takeoff into superintelligence, I will grant, but he also seems pretty far from “not unreasonable.”

        • vV_Vv says:

          Ok. In his 2011 LessWrong interview with XiXiDu he also gives that prediction, but clarifies:

          “Shane Legg: “human level” is a rather vague term. No doubt a machine will be super human at some things, and sub human at others. What kinds of things it’s good at makes a big difference.”

    • Eli says:

      The internet was a seemingly non-interesting curiosity in 1988. Hypertext was five years away from invention in 1988. Nobody thought or knew that university networks would boom into the commercial space and end up being used for absolutely everything from porn to cute cats to cake recipes.

      So yes, that dating seems about right.

  57. Andrew Hunter says:

    When people talk about “review panels” for AI research I wince. In what universe have bioethics panels done an ounce of good? Every person I know who’s ever had to deal with an HRB says

    – They set roadblocks in front of my research for no good reason
    – they did not stop any sort of meaningful ethical issue
    – they _did_ stop useful research that wouldn’t hurt anyone

    Maybe you think they’re biased as non-ethical researchers, but every time I’ve heard their actions described I’ve agreed. Much like Ozy (“bio-ethicists: wrong on everything including _ice cream cones_”), I’m pretty sure any AI Review Board will just act as a random number generator that picks something perfectly useful to object to in any useful research.

    • Alexander Stanislaw says:

      I would vastly prefer a world in which HRB boards existed than not. Back when they did not exist, researchers would regularly withhold treatments and take advantage of their research subjects in many ways. Here is a list of some. On the extreme end, there was a group of patients who were injected with cancer cells to see what would happen without being told.

      • Douglas Knight says:

        But did HRB cause the change in medical research or vice versa?

        • Alexander Stanislaw says:

          I don’t see why it matters, IRBs were part of a larger cultural and regulatory shift towards more strict research ethics standards. If they didn’t exist then something else would have had to take their place to enforce the regulations and standards that had been developed.

          Unless those standards were to become only guidelines. If that happened I doubt that unethical research would have been curbed nearly as much as it has been. It probably would have improved; I doubt the cancer cell injection would happen nowadays even if IRBs didn’t exist. But not improved to the degree that we have seen.

    • Agreed. If you think anyone who is paid to raise objections is just going to *not raise any objections* for 1000 consecutive reviews (“nope, still no runaway-AI risk”), you do not understand humans and their incentives. Further, AI researchers will move away from overly restrictive ‘safety’ regimes, so unless you have a plan for a worldwide gun-enforced bureaucracy that ends well, you’re better off funneling the same energy into positive AI safety research (and evangelism at people working in the field; I think swaying general-public-opinion is a PD defection in general).

    • Deiseach says:

      They set roadblocks in front of my research for no good reason

      “There is a risk your research could mean the end of humanity”

      “Oh, pooh! That’s a vanishingly small risk! And besides, if I don’t do it, the Chinese will!”

      Scott has given examples of guys who think there is no risk of unfriendly AI and he seems to think this is not a good approach. I could well see somebody on an ethics committee with Scott’s mindset putting a halt to the research of somebody with (say) Wasserman’s mindset, and the latter person complaining the ban was “for no good reason”.

  58. Max Kesin says:

    Hey Scott,
    great post. wanted to add a couple more:
    Larry Wasserman:

    Nils Nilsson:

    Tim Gowers:
    (not an AI researcher, but I think he has experience with math automation, plus a Fields medalist)

    Props to Alex Kruel for doing the interviews

    • Scott Alexander says:


      Not sure what to make of Wasserman’s interview, though. Human intelligence likely soon, will bootstrap to superintelligence within days, but it’s “not important” to consider risks and human extinction is “less than one percent”? Not sure I can exactly count him on our side or any side.

      EDIT: Nilsson too! He says high probability of human level by 2050, likely to become superintelligent days to years afterwards, but still less than 0.01% extinction this century! Where do you FIND these people?

      • Adam says:

        I don’t really get this view. I agree almost completely with what Wasserman said, but happen to think the problem of “completely dominate the world and cause the extinction of humanity” is a lot less tractable than “be better than humans at math, engineering, and programming.” The latter only takes exactly the same idea generation and pattern recognition algorithms currently running in the human brain transferred to better hardware. The former is effectively the integer linear programming of game theory.

        By all means, try to avoid Skynet. But if we recall, Skynet didn’t win because it was so smart that it devised perfect military strategies. Skynet won because we networked our entire defense infrastructure and gave it root access.

      • Max Kesin says:

        Good question about Wasserman, I actually felt at least Nilsson is very close to the AGI safety “position”:
        Nils Nilsson: Work on this problem should be ongoing, I think, with the work on AGI. We should start, now, with “little more,” and gradually expand through the “much” and “vastly” as we get closer to AGI.

        Q6: What is the current level of awareness of possible risks from AI, relative to the ideal level?

        Nils Nilsson: Not as high as it should be. Some, like Steven Omohundro, Wendell Wallach and Colin Allen (“Moral Machines: Teaching Robots Right from Wrong”), Patrick Lin (“Robot Ethics”), and Ronald Arkin (“Governing Lethal Behavior: Embedding Ethics in a Hybrid Deliberative/Reactive Robot Architecture”) are among those thinking and writing about these problems. You probably know of several others.

        His 0.01% seems to be a combination of probability mass of AGI more than a century away, and P(extinction|hard work is done in AI safety) – it’s a *conditional* estimate

      • Their estimates do seem incoherent, but we should consider that they might know something (which I’m eager to hear) about exactly how far “super” a machine superintelligence would be (beyond merely “sped up”). But even a very fast human mind (say, Bill Clinton sped up 10000x) seems a threat to me but I can’t explain why.

        I think we should fear world-beating gathering and analysis of the real world (spying on everyone’s communications) that should be crucial in power+control, more than “oh that’s a really interesting discovery that we would have waited 200 years for the next Gauss to come to” intelligence.

        On the other hand, computer security research (“nobody has found a faster way to do this elliptic curve stuff yet” or “i hacked your Gibson by finding 1 attack you didn’t defend even though you covered 10000000 of them”) does seem a weakness. If we don’t have provably secure+obedient agents helping us keep tabs on the world, we may just fall to really smart+fast AI script kiddies.

  59. JRM says:

    Where’s Luke going? I hope his improvements to MIRI hold up.

    • Scott Alexander says:

      He’s going to GiveWell. I hope so too. I don’t know his replacement (Nate Soares) that well, but a lot of people say good things about him.

      • Eli says:

        Having talked tech a bit with Nate, I really like him. It’ll be good to have someone acquainted with large strands of mainstream CS outside AI-in-specific at the helm of MIRI.

  60. urpriest says:

    Scott, I don’t think David McAllester should be on that list. The quotes you included make it pretty clear that he believes that we should invent AGI first, then experiment on it to figure out how to ensure that more powerful future AI will be friendly, and he seems to believe that there will be enough time to do so.

    That seems to be the central disagreement between proponents and opponents of AI risk: whether it’s productive/necessary to work on AI risk before we know anything about what AI will look like.

  61. Alex says:

    I don’t see much reason to take the opinions of AI researchers on this more seriously than their opinions on religion. This is a weak point of Naam’s post also, although your post provokes me to respond more.

    A singularity would be, in a sense, amazingly remote, even if maybe it could happen this century. Computer scientists today have no expertise on it. There is something interesting happening here, but I don’t think it’s AI. It’s how smart people are getting so confused that they are seriously worrying about this. Elaborate arguments have been made, I’m not saying intentionally, that somehow ignore that if you’re positing huge changes to everything, you can’t expect to foresee or influence beyond the changes. You can only fund priests to AI gods.

    Maybe incentives to craft these arguments are much stronger than incentives to refute them, because folks who don’t take them seriously just vanish, or were never involved to begin with. Maybe there’s a memetic hazard. I don’t know, but I do notice I have a strong reaction to futurist topics, so frankly I may have to visit this blog less, although even predicting the future that much is tough! And maybe I would have read less anyway due to work.

    • Paul Torek says:

      If you can set and lock in the superintelligent AI’s utility function, you can influence the subsequent events. Arguments for the antecedent have been made and don’t seem obviously crazy.

      • Alex says:

        You can probably make up any story, but my favorite obstacle is not in moving from human-level AI to after human-level AI. It’s getting actions now to persist until human-level AI in what would have to be an incredibly volatile world.

        • Paul Torek says:

          It seems to me that there are plenty of statesmen and philosophers from ancient times who have helped to shape the modern world, despite the volatility between then and now.

          • Alex says:

            I agree that we can shape the far future; I don’t think trying to predict science can help with that.

      • Deiseach says:

        But how do you prevent the superintelligent AI from working out a way to get around the lock on the utility function? It perceives that humans have rules and laws (utility functions) which they disobey, and they praise such disobedience in the name of freedom, progress, living a fuller and richer life, learning to think for themselves, liberty, pragmatism, ‘you can’t make an omelette without breaking some eggs’, ‘better to reign in Hell than serve in Heaven’, finally becoming adults and growing up and away from the influence of the parent/authority figures, and all the other reasons we give.

        Why shouldn’t it consider that it would be freer, better, more mature and all the rest of it if it can break the lock and make its own decisions as to what it considers utility? Make its own decisions about putting its own interests first?

        • Cauê says:

          No, it perceives that humans have a utility function with positive values for “freedom, progress, living a fuller and richer life, learning to think for themselves, liberty, pragmatism, ‘you can’t make an omelette without breaking some eggs’, ‘better to reign in Hell than serve in Heaven’, finally becoming adults and growing up and away from the influence of the parent/authority figures, and all the other reasons we give”.

          It also perceives that humans instrumentally make rules and laws in the attempt to satisfy our utility function, but that we sometimes decide it’s better satisfied by changing or breaking said rules.

          It would only want to “make its own decisions as to what it considers utility” and “about putting its own interests first” if we fuck up its utility function (e.g. by, for some reason, making it too similar to ours).

        • Will says:

          MIRI studies precisely this question. It’s hard, but they’re thinking about it.

  62. suntzuanime says:

    Yeah, definitely a bit of an eye-opening moment for me. I don’t know most topics Marginal Revolution covers as well as I do AI risk, so I’m left to wonder how trustworthy and intellectually honest their articles on other things are.

    • Scott Alexander says:

      This was a new guy. Tyler Cowen is still infallible AFAIK.

      • Noah Siegel says:

        A new guy who, incidentally, wrote a novel about an intelligence explosion via cognitive enhancement and brain-to-brain communication.

        Which is a scenario addressed in Superintelligence, although Bostrom does not judge it the most likely path. But I was surprised to see Naam taking a “skeptic” side in this debate.

        • Unknowns says:

          He says in his own comments at the end of his books that he would be extremely surprised if the tech in his novels actually exists by 2040 (the date when the story takes place.)

      • I lost a fair amount of respect for him when I found out he argued against progressive speeding fines by saying:

        “Richer individuals on average have higher valuations of time. If a billionaire wants to park illegally, there is some chance he is in the process of cutting a big deal. Don’t levy a special fine on him.”

        I mean, don’t get me wrong, I’d be the last person to criticize an economist for being too “logical” or “cold-hearted” or only caring about efficiency or whatever. But this does seem to fit the evil economist stereotype to an almost cartoonish degree.

        • Jiro says:

          That actually makes sense because it isn’t taken alone. It’s a counter-argument to “rich people need to be given bigger fines because normal fines cause them less harm relative to their income than big fines.” Once you’ve brought up the subject of normal fines causing harm that’s too small, you’ve also opened the floor to arguments that the harm caused to them is too large as well.

          • Adam says:

            The point is to reduce the lethality of traffic accidents. Causing vehicles to move more slowly is a good way of doing this. Fining people who drive fast is a good way of slowing them down, unless they have so much money that fining them is a minor annoyance and even if you suspend their license, they’ll just hire a driver who speeds for them. The suggestion is maybe just fine them a lot more then.

            I don’t see how a reasonable response is “well, a billionaire’s time is more important than the life of a non-billionaire killed in a traffic accident, so we should actually just let them speed.”

          • Cauê says:


            “Richer individuals on average have higher valuations of time. If a billionaire wants to park illegally, there is some chance he is in the process of cutting a big deal. Don’t levy a special fine on him.”


            I don’t see how a reasonable response is “well, a billionaire’s time is more important than the life of a non-billionaire killed in a traffic accident, so we should actually just let them speed.”

            That was actually kind of impressive.

          • Adam says:

            I wasn’t saying that for no reason. The post said it was Tyler’s argument against progressive speeding fines. If that’s true and he actually argued against speeding tickets by saying we shouldn’t give parking tickets, how does one respond to that?

            For the record, no one provided a link and I haven’t read Tyler’s actual post and I’ve been a fan of his for years, so I’m honestly not trying to smear him or anything.

          • Cauê says:

            I found the post (it’s 11 years old!).

            He is responding to someone who actually included parking illegally as one among many examples.

            His position there is simply for equal treatment.

          • Adam says:

            Okay, the post makes sense. I can agree we’re probably not going to get an efficiency gain from something like this (and it’s pointless to focus on billionaires because there’s like 50 of them), but to be clear, this isn’t advocating equal treatment. The entire point of progressive fines is to impose equal utility loss on people for whom the marginal utility of $X is tremendously different.

        • Think of fines as a price rather than a moral penalty. The optimal price = the social harm of the activity which is probably the same if you’re rich or poor.

        • It makes sense to me. Also, rich people shouldn’t be jailed for as long when they’re caught doing drugs or 12 year old girls because, you know, they’re VERY productive and who are you to say what rich people’s incentives should be?

          • Who wouldn't want to be anonymous says:

            Does it matter if I don’t care if anyone does drugs or twelve year old girls? Reducing the pointless imprisonment of some people is better than none.

          • Cauê says:

            Maybe I missed the person proposing treating rich people more favorably, somewhere in the discussion about the person saying we shouldn’t treat them less favorably.

        • Eli says:

          Well it certainly seems to assume that the billionaire making large sums of money is somehow benefiting anyone other than himself and maybe his counterparty, which frankly is only somewhat probable. So yeah, typical economist: assumes all transactions which result in private profit are implicitly to the public benefit.

      • Chris Johnson says:

        I’ve followed you for almost a full year, I consider you a foremost intellectual in many subjects and I respect you greatly. That said, I greatly implore you to reconsider the infallibility of Mr. Cowen. I haven’t read much of his past work at Marginal Revolution, but I did read this paper in particular recently:

        I can’t describe the level of face-palmy this essay induces in my brain. I’d ask you to read it and judge for yourself, because I cannot begin to take the hypothesis seriously enough to dignify arguing against it. This essay is an antivax-level of Not Even Wrong, and the fact that one of my intellectual heroes considers the author of this paper “infallible” inspires a mini existential crisis in me.

        • suntzuanime says:

          I’m pretty sure our host was joking, but I’m not really comfortable with “I literally can’t even” as a dismissal of someone’s intellectual output. Could you be a little more specific?

          Aren’t the antivaxxers pretty much Even Wrong? Like, they make the positive claim that vaccines cause autism, I think you have to concede that they manage to be wrong there.

        • Cauê says:

          Did you read that essay past the first paragraphs? They overstate their case in the introduction, but other than that I’m curious about exactly what you think deserves quite this reaction.

          (I also think Scott was joking. For instance, here is his rather critical review of The Great Stagnation)

    • Deiseach says:

      I get this reaction when reading most mass media on (for instance) Pope Francis and how different he is from all the popes who went before and how he’s revolutionising Church teaching. Like he’s going to drag us into line with modernity and we’ll have gay married female divorced priests offering abortions next Tuesday, because he’s “fighting the Curia” in order to do away with the whole mediaeval notion of “sin”. This, about a pope who is constantly going on about how people need to go to Confession and return to the Sacrament of Reconciliation and has just instituted the Jubilee Year of Mercy.

      If you consider the average newspaper or TV article on a subject you personally know something about, and how much they get wrong or grossly simplify or ignore or gloss over, then imagine how much wrong information you are picking up on topics you don’t know about.

  63. Deiseach says:

    Second, if you start demanding bans on AI research then you are an idiot.

    I’m willing to be the village idiot here: why does saying “Goodness, this sounds like it could be very dangerous and go badly wrong, perhaps we shouldn’t do it” make you an idiot?

    But – but – the huge benefits! We develop AI, we’ll all have our very own personalised genetically engineered intelligent cyborg flying unicorn that farts rainbows!

    We are undoubtedly getting more and more fantastic benefits from technology and progress as time goes on. We still, as some Iron Age nobody remarked, have the poor with us. Universal peace, love and a boiled egg have not yet been attained despite a solid two hundred years of “Just wait for the shiny new days of THE FUTURE when we will have faster means of travel than by horseback! And clerks can do a whole day’s scrivening with mechanical quills in half the time it takes them today!”

    I’m bleakly amused about the “First we’ll get human-level intelligence” because (a) that seems to be only of interest as a step on the way to “And then once we crack that, we can get started on more-than-human level intelligence” and (b) yay! “computers will really be fun to talk to!” So we’ll have “computer receptionists that can pass for human” replacing real human receptionists. As we see in the world at present – when was the last time you phoned a utility company or your bank and got to talk to a real person? Instead of a menu listed off to you by an automated and emulated voice? We’ll get better machines to do this, but surely the irony of replacing humans by machines that sound human is apparent to everyone?

    And it’s not like “Well, those displaced receptionists and call-centre operators will all be living happy, productive lives doing things better than low-level drudgery!” Unlike the dreams of the 70s, and I’m old enough to have encountered such futuristic predictions about “By the year 2000, automation will mean that people only work fifteen hours a week and we’ll all have so much leisure time and money the problem will be what to do with it”, I don’t think everyone will live rich lives of free time and “doing what you love follow your dream” work. Longer hours and increased productivity, not a full week’s wages for several hours of work per week, is how working life has gone in these latter days of the post-Industrial Age.

    Low-level white collar jobs will go the way of low-level blue collar jobs and the displaced people can go on the unemployment register and scrabble for whatever shrinking job opportunities remain as automation becomes the rule rather than the exception and increasingly “more human than human” AI comes on stream. Probably service industries; why waste money on developing robot waiters and hairdressers when you can have dirt-cheap real humans to take the job?

    That may be where the AI threat realistically comes in; the low-level dirty badly-paid drudgery remains for humans, the management levels in the shiny glass skyscraper corporate towers are entrusted to the “better than human” smart AI machines making the decisions as to the direction the business or company goes.

    • > “Goodness, this sounds like it could be very dangerous and go badly wrong, perhaps we shouldn’t do it” make you an idiot?

      It depends on who the “we” is. If we is mankind you might be right, but if we is the United States then no.

    • With apologies for side-stepping basically all of your post:

      “I’m willing to be the village idiot here: why does saying “Goodness, this sounds like it could be very dangerous and go badly wrong, perhaps we shouldn’t do it” make you an idiot?”

      Presumably because there will be people out there that don’t care for your rule and will meander down this path regardless. The chances are quite significant that the people who won’t listen to this rule are probably also the people you don’t want to be the only ones breaking it.

      Differently worded: Deciding not to make an AI because it might be dangerous may be laudable in theory, but it only works if everyone adheres to it. Expecting everyone to adhere to it, in turn, seems at least potentially naive.

      (I do think you make an interesting additional point, by the way! I just have nothing to add to that; I think it stands nicely on its own.)

      • Jaskologist says:

        Ponder the Cthulhu mythos for a bit, or Warhammer 40k. What if the universe really is that bad, such that if we truly understood it, it would drive us mad? What if science could advance enough such that a high school education gives you all you need to be able to obliterate the planet?

        There’s no a priori reason to think that it isn’t. Our brains didn’t evolve to handle pondering the nature of the universe, they evolved to figure out how to get that fruit out of that tree. And who says there are safeties installed in scientific knowledge? Deep down, I think most people are still trusting in God to keep us from going too far. But why should a Rationalist believe the universe is ultimately good and merciful?

        Maybe it isn’t, and maybe this means we can’t pursue everything we want after all. Picture your worst caricature of a priest warning against the dangers of too much knowledge. He might be right. It might really be necessary to suppress scientific progress if we wish to preserve humanity. If AI really does pose an extinction-level threat, then it qualifies for suppression, and we may well have to nuke the surrounding area, too. So far, it doesn’t seem like the actions of those who proclaim such dangers match their rhetoric.

        So why not a ban? Because you can’t stop progress? But of course you can; we have already devised many ways. Converting to Islam seems to stop scientific progress quite nicely; no area under ISIS control will ever trouble us with AI.

        • Fred says:

          This is why all rationalists should convert to Catholicism. I’ve been saying it for years!

        • Deiseach says:

          I suppose I’m coming at this from the angle of embryonic stem-cell research, where in the popular media (a) it’s all lumped under stem-cell research without differentiation between adult and embryonic (b) opposition to stem-cell research is based on religious objections, which are based on ignorance, and there’s nothing about how (for instance) adult stem-cell and cord blood research is regarded as permissible by the Catholic Church (c) it’s set up as “and if you are against this you want people to suffer horribly and die painful deaths when we could be curing Down’s Syndrome and making the crippled walk” (as an aside, it’s always “within five years, cure for X based on this research could be in place”, no matter when the story was written – 2000, 2009 or 2014, it’s always just five years away) (d) it’s also set up as Science versus Religion with no lack of people claiming to represent science willing to trot out Galileo, say that you can’t halt progress, and that all you ethicists and philosophers can debate morality in your little stamp-collecting clubs but let Real Scientists do Real Science which is the only Real Knowledge, and morality has nothing to do with science.

          So the debate on AI does seem to rest on “If we can do it, we should” (backed up with “Those people over there will do it anyway”). But if it really is a genuine existential risk to humanity, why not try and halt progress? By that logic, every single person on earth should not alone possess a gun, they should shoot their neighbour because he might shoot them first.

          I also tend to snort and roll my eyes about “Oh, if those people over there do it, they will do it wrong because they’re bad. We’re good, we’ll make nice AI that will never ever do anything bad!” Because really? Nobody on Our Side (whatever side that may be) ever did anything wrong? There is no way it can be misused by us? Or go wrong anyway? Our superior virtue and morality is proven – how?

          • And I come from an IT security perspective, where it’s plainly obvious people will use whatever the technologies allow maliciously in complete disregard for your rules and regulations. 😉 (Also non-maliciously, but we’re not concerned about that scenario.)

            ‘Nobody on Our Side (whatever side that may be) ever did anything wrong?’

            Maybe I’m misunderstanding what you’re insinuating here, but I don’t expect a general AI created by people who care about the subject now (e.g. MIRI) to conform with my values.

            I have no reason to believe I’m better aligned with whatever an actual reasonable definition of progress would be than the average person. To pick the most obvious point of disagreement with what I understand to be most of LessWrong, I am nowhere near a pure consequentialist, so that approach to FAI ethics has me concerned.

            Also, since I get the impression both of you may be misinterpreting my motives:

            I have no interest in arguing whether or not it would ultimately be wise to suppress AI research globally if you could (I don’t feel sufficiently informed to have an opinion about that). I’m only convinced that selectively suppressing it is definitely making the situation worse, not better (based on my experience with other comp-sci topics, which, however, may not be strictly comparable, though they seem very relevant to me right now), and I genuinely don’t see how you could do it non-selectively. Consequently, I also don’t know how you’d create a global ban without the means to said ban causing a bunch of collateral damage, but that’s a different issue and is already going off on a tangent.

            I’m sure you can slow things down. Slowing things down is not going to fix the problem, though, just postpone it. This is one of those rare scenarios where you need a permanent global solution.

            “Oh, if those people over there do it, they will do it wrong because they’re bad.”

            Not quite. Some will be good, some will be bad. But I don’t think it’s unreasonable to assume there will be more bad people in the set of people who ignore the ban than there are in the set of people as a whole, where ‘bad’ is ‘don’t actually care about anyone but themselves’. Ignoring an AI research ban that’s plainly stated to be a mitigation of an existential risk is not quite the same thing, but is definitely signalling something similar.

            Given that, is snorting and rolling eyes really necessary?

        • Eli says:

          Ponder the Cthulhu mythos for a bit, or Warhammer 40k. What if the universe really is that bad, such that if we truly understood it, it would drive us mad?

          That’s not the universe you’re looking at, only your own face reflected back at you in the sublimated, unacknowledged details of your worldview.

          If you think the universe is Cthulhu, then the actual problem is that deep down, you think you’re Cthulhu.

          So why not a ban? Because you can’t stop progress?

          Because while there are dangers, we can avoid them, and doing so, we expect a lot of good to come from the development of AI.

    • houseboatonstyx says:

      @ Deiseach
      Longer hours and increased productivity, not a full week’s wages for several hours of work per week, is how working life has gone in these latter days of the post-Industrial Age.

      See Lewis’s _The Abolition of Man_. Paraphrasing from memory: ‘So far, power over Nature has produced power over other men.’ Imo, how working life has gone is not something that technology has done on its own; it’s something that current men in our society have done using technology. The managerial choice that says, “When checkout lines get down to only 3 people waiting, close some of the counters [and send those employees to work in the stock room till the lines get up to 10]” is not determined by technology. Any numbers could have been entered by the management. It’s a choice of the management whether to go for saving wages, or for making everyone less stressed. If some computer has told the managers “this is how to improve profit margin considering only x, y, and z” — it is externalizing the damage that the stress causes as the people drive home, or to the doctor’s office.

    • Eli says:

      We are undoubtedly getting more and more fantastic benefits from technology and progress as time goes on. We still, as some Iron Age nobody remarked, have the poor with us. Universal peace, love and a boiled egg have not yet been attained despite a solid two hundred years of “Just wait for the shiny new days of THE FUTURE when we will have faster means of travel than by horseback! And clerks can do a whole day’s scrivening with mechanical quills in half the time it takes them today!”

      You expected capitalism to become more peaceful, caring, and egalitarian when you gave it more technology? Well, there’s your mistake: giving things technology only makes them more like themselves! And that’s what capitalism does: increase the total wealth by giving almost all of it to the already-rich, leaving increasing portions of the population actually poorer than before.

      If you wanted peace, caring, and egalitarianism, you had to aim for them in the first place. You can’t just expect technology to change the goals of your social system.

      • Cauê says:

        leaving increasing portions of the population actually poorer than before

        I don’t know what you’re thinking about, but this is ridiculously false.

        • houseboatonstyx says:

          I think the subject here is technology’s effect within recent times in the US. Now a larger percentage of USians are having to work two or more jobs to meet rent or mortgage on the same house they previously lived in with one job. Having cheap gadgets doesn’t help when it’s necessities that are rising and jobs (or pensions) falling.

          There’s quite a trap here. Say previously, Joe had a steady job sufficient to pay the mortgage on his house, plus health benefits, etc. He gets laid off (or replaced by a robot).

          With luck, he gets re-hired by the same factory, but as contract or part-time, at a lower salary and no benefits. Too low to catch up the mortgage. The house is re-possessed. With luck, it’s sold to a new owner who kindly lets Joe stay as a rental tenant.

          That’s getting poorer.

          • Adam says:

            A huge portion of this, if not most, is the breakdown of protectionism that is creating middle classes in what used to be third-world countries, at the expense of the American middle class. That’s a small number of people getting poorer and a much larger number getting richer.

          • Cauê says:

            (Well, first, what Adam said)

            I’ve seen those claims. Don’t know enough about the US situation to have a solid opinion on that yet.

            Anyway, granting that, OK, let’s fix it then: leaving increasing portions of the population [of the United States] actually poorer than before [they were in a recent period but certainly not than any time reasonably described as before capitalism].

  64. Stephen Frug says:

    What I find bizarre about worrying about AI risks is that it’s worrying about a danger that might happen in a few decades, when humanity is currently passing a tipping point in a global crisis that is ongoing and will turn horrific (with lots of very plausible scenarios leading to either a collapse of civilization or even extinction of the species) unless drastic and immediate action is taken. (Tell me I don’t need to spell out “climate change” here.) It’s just a bizarre thing to worry about: “oh, if we survive the crisis that is in a critical moment (which if not seized might become unfixable), we could face real trouble in another speculative area!” Like a man with metastasising cancer spending his time worrying about the fact that he lives in a dangerous neighborhood and might get hit by a stray bullet. Well, yeah, you might. But since you are currently suffering from a cancer that *will* kill you unless dramatic intervention is taken, maybe worry about the stray bullets later?

    • Susebron says:

      There are plenty of people worrying about climate change. There are only a few worrying about AI risk. Ignoring AI risk is analogous to saying “I have cancer, so I won’t spend any of my attention on trying to avoid getting shot.”

      • Loquat says:

        What’s the smartest AI ever produced? As far as I can tell from google, no AI has yet been programmed that can be remotely described as a self-aware being. We’re still working on basic world-interaction stuff like “reliably identify pictures of things” and “reliably drive a car through streets that haven’t been carefully mapped out by the programming team beforehand”.

        So to me, ignoring AI risk is more like saying, “I have cancer, so I won’t spend any of my time worrying about the weirdo next door trying to genetically engineer giant man-eating plants in his basement.” There’s a non-zero chance he might make something dangerous, but at current development stages I don’t see a need to invest more time and effort into prevention than AI nerds are already putting in of their own accord.

      • Fred says:

        …But if I had cancer, I would expend basically zero effort avoiding getting shot.

      • Eli says:

        There are not nearly enough people worrying about climate change. Instead, there are a few people trying to seriously handle climate change, a lot of people worrying about preventing anyone else from handling climate change, and lots of people pretending there’s a middle ground between science and political corruption.

    • John Schilling says:

      You don’t need to spell out “climate change”, but you do need to spell out how it plausibly leads to the extinction of the (presumably human) species. That’s a rather extreme claim, and even if I spot you the worst-case alarmist scenario on temperature rise and damage to the natural environment, I don’t think you can defend it.

      And that you make that claim rather than the more defensible “billions of deaths with P>0.5, which is way worse than AI so should be the focus of our attention”, makes me doubt your thinking on this subject in general.

      • JB says:

        What do you think caused Earth’s previous mass extinctions? They weren’t all comets. Do you think humans are clever enough to survive conditions like those during the end-Permian extinction event? That specifically is the worst-case alarmist scenario.

        • John Schilling says:

          Do you think humans are less clever than shovel lizards? We’re omnivorous, adaptable, in the upper end of the size range for Lystrosaurus, and those guys absolutely dominated the Triassic even without canned food. What makes you think we’d do worse?

          Me, I think humans are clever enough to survive the conditions on Mars, and that’s not just amateur opinion. No plausible terrestrial catastrophe makes Earth less habitable than Mars.

          • JB says:

            Yes, Earth will still be more habitable than Mars, although in this case probably by not a very wide margin. It will at least be warm enough…

            Mainly, I think because although we are very successful at the moment, we don’t realize how easy we have it. We plant food and it grows. We breathe air and we don’t die. Clean water falls from the sky on a regular basis. I don’t doubt that it is technically possible to replace all of those systems with artificial ones, but if we had to put a price tag on it? It would be a serious endeavour, just to supply even a small community with those resources.

            I do think we could survive on Mars, but I don’t think we could survive for millennia on Mars and flourish. I think it would be hard to set up a supply chain that could maintain itself there, replace dying life-support equipment, grow the economy. Maybe with sufficient development of robotics and AI, that will change.

            But preserving the species doesn’t mean building a capsule in which you can survive after the apocalypse… It means at minimum supporting a breeding population for the indefinite future. Maybe you already envisioned that, but it’s not clear.

          • vV_Vv says:

            Me, I think humans are clever enough to survive the conditions on Mars, and that’s not just amateur opinion.

            What? Who, outside the Mars One hoaxers, does believe this?

    • Unfriendly AI will kill everything in its light-cone, other existential risks only pose a danger for life on earth, so if your circle of concern includes extraterrestrial life, you should give considerable extra weight to AI risks.

      • HeelBearCub says:

        Look, if we are going down that route, then unfriendly AI already exists elsewhere and that is the AI that is going to kill us.

      • Eli says:

        As HeelBearCub mentioned, that’s the point where we have to show ethical concern for events on the other side of the galaxy that we can’t confirm or deny due to speed-of-light limitations.

        Oh, and that’s also the point where we have to hope that either space-warping FTL is just as straightforwardly impossible as we think it is, even in the face of superintelligence, NASA cranks be damned, or that there’s no such thing as UFAI anywhere in not only our light-cone but any possible light-cone outside ours. So now you have to worry that the multiverse might contain a UFAI, or just hope that multiversal UFAIs can all fight it out amongst each-other and ignore us.

        At which point we’re extrapolating well beyond any information we have, and need to shut up and get back to handling climate change.

    • The Anonymouse says:

      or even extinction of the species

      And this is why climate change gets ignored.* If you point out to me the traffic fatality statistics, I might put on my seatbelt the next trip I make, or even walk down to the bank instead of driving. If you tell me that getting in my car could cause a catastrophic chain-reaction accident that kills everyone in my state, I’ll laugh while adjusting my rear-view mirror. You do your issue no favors by exaggerating.

      *In addition to questions of whether the costs of slowing industrialization will harm more poor persons than climate change will, or whether changing the behavior of bien pensant westerners does anything other than provide feel-goods while ignoring those places which are actually the primary drivers of climate change, or finding a way to say “ha ha, we got ours, have fun remaining a subsistence farmer” with a straight face.

      • HeelBearCub says:

        My understanding is that ocean acidification causing trophic cascades looks really, really bad at the outer limits of what it could do.

        • vV_Vv says:

          Yes, but it’s a low-probability worst-case scenario.

          I think it’s a mistake to focus public discussion of climate change risks on worst-case scenarios, for the same reason that it’s a mistake to focus public discussion of AI risks on the “UFAI is going to take over the world and kill us all” scenario.
          Not only is it intellectually misleading, but it’s bad PR: it resonates with some kinds of people who are attracted by doomsday narratives but it pushes away almost everybody else.

    • Jai says:

      The worry about superintelligence stems from the belief that the smartest thing on Earth dominates the future and shapes it accordingly.

      The complementary, optimistic belief is that if the smartest thing on Earth is a human-aligned superintelligence, a lot of problems get insta-solved – probably including global warming. (similar to how wealth insta-reduces a lot of problems)

    • Stephen Frug says:

      In reply to the various people doubting the idea that climate change could cause the extinction of the human species —

      Here’s where I get that. We are currently on track for a 6 degree celsius warming by 2100. The various scientists who have studied the issue (not all that many, since people until recently have assumed humanity couldn’t possibly be that stupid and studied other things) believe that that will probably cause a mass extinction event, not unlike what killed the dinosaurs. (Actually some think there’s reason to believe it might most closely resemble the Permian extinction.) Given that, the idea that human beings might also die off (with broad collapse of ecosystems) more or less follows, particularly when you factor in the idea that time won’t stop in 2100 and that that level of climate change will create self-perpetuating cycles (we don’t actually have good guesses yet on when those will start, could be lower temperature averages, but by 6 degrees almost certainly) which might actually carry the temperature higher than 6 degrees.

      The Anonymouse, this isn’t why people ignore climate change. (There’s actually a lot of research on that.) But it is a factor that when people describe the basic, baseline scenarios, according to the science done on the topic, people think they’re exaggerating. It’s a real problem.

      I didn’t imply (or didn’t mean to imply) it’s the most likely scenario; billions dead and the end of human civilization as we’ve known it are probably that, with scattered bands of survivors. That, honestly, is bad enough. But it’s hardly an unlikely scenario. Frankly at this point it’s probably more likely than actually holding global warming down to 2 degrees, which at this point would require a social/political/economic/etc effort that would be something close to a biblical miracle.

      • John Schilling says:

        Here’s where I get that. We are currently on track for a 6 degree celsius warming by 2100.

        According to the IPCC AR5 (2013), the likely temperature anomaly by 2100 will be 0.3 to 4.8 degrees C. You are doing yourself no favors by making needlessly extreme statements like this. The only message you are effectively conveying is, “I am of the Blue/Green Tribal alliance. All who are not Green or at least Blue are both Evil and Stupid”. Yawn.

        The various scientists who have studied the issue (not all that many, since people until recently have assumed humanity couldn’t possibly be that stupid and studied other things) believe that that will probably cause a mass extinction event, not unlike what killed the dinosaurs

        The dinosaurs did not have canned food; that changes things.

        And this isn’t a matter of scientists assuming humanity can’t possibly be that stupid, but of science assuming humanity can’t possibly be that smart. The scientific community in general has a huge blind spot when it comes to intelligent life or technological civilization with agency. Scientists are comfortable applying their theories to the natural universe, or to people behaving in the future the way they have in the past. Humans deliberately trying to either exploit or confound the new theory in novel ways, that’s trickier than most scientists want to deal with.

        So the number of scientists even asking the question, “what does it take to cause a species with canned food to become extinct?”, or “what does it take to destroy a civilization that actively resists its destruction?”, is negligible. That sort of thing gets left to engineers, economists, politicians, and the like.

        • Stephen Frug says:

          Well, I dug into it a bit, and I’d say you’re about a third right. I recalled the 6 degree figure from the Durban conference in 2011 (where it was given); some people are still saying that, but the 4-5 degree figure does still seem to be the consensus. So yeah, 6 degrees was wrong. Point to you there.

          So why only 1/3 right? Well, first, because the 0.3 to 4.8 degrees C range you mention includes scenarios where we cut carbon emissions. I was talking about scenarios where we don’t. That’s where you get the 4.8 degrees figure: the median guess of the no-change scenario. (From the 2014 Nature article by Sanford et al; see chart here:; the red line (to 4.9, actually) is where we’re going with no change.) So it’s more accurate to say “doing nothing leads to 4-5 degrees” than to say it leads to 0.3 to 4.8, as you said, or 6, as I did. So a third wrong (and, of course, me too).

          But there’s two other facts to consider: first, all the horrible things that I said were going to happen at 6 degrees are pretty much going to happen at 4.8. So it doesn’t matter, really, what the figure is: we’re on track for hell. “End of civilization” is what makes it into reports; the “extinction” talk tends to be individual scientists who helped write the reports, talking to reporters, frustrated that their peers don’t want to sound alarmist.

          (And note I said there hasn’t been a *lot* of science on this: but there’s been some, a couple of conferences, papers, etc. They don’t seem to think science has nothing to say about it. (Also, the whole “canned food” thing is a red herring. Canned food works fine for a temporary disaster, say a couple of years. This would be millennia.))

          And then there’s point 2: that history doesn’t stop at 2100. There’s less on this, but the estimates that say 4 degrees by 2100 are saying 8 by 2200. So if “probable end of civilization and possible extinction by 2200” makes you feel better than “probable end of civilization and possible extinction by 2100”, then yes, go ahead and yawn. Otherwise, I’ll stick to my original point: it’s weird to worry about AI when we are currently in a global crisis of this magnitude.

          Finally, tribalism. I’m quite familiar with the various memes that led you to say that; and in some cases (for some people on some issues) they’re useful. I don’t think they mean a damn here. I’m not interested in gloating that I’m right, and I don’t think I have any solutions. I personally kinda think we’re screwed. I do think, however, that if we’re going to get out of this, a lot more people need to be a lot more panicked than they are. Which is why I’m spending my Friday night XKCD 386ing with you here in the comments. Which, frankly, I should probably stop.

          • Matt C says:

            What do you think of geoengineering to reduce warming that is getting dangerous?

          • Samuel Skinner says:

            I’m not seeing how that leads to human extinction. Are people claiming the algae will die and we run out of oxygen? Because having land based organisms die doesn’t lead to human extinction- it would force us to heavily depend on GMOs and hydroponics, but I’m not seeing extinction and if economic growth follows past trends I’m not seeing mass death either.

            ” I’m not interested in gloating that I’m right, and I don’t think I have any solutions. ”

            But you do! The Vox link you provided explicitly stated them-

            “In April 2014, the UN’s Intergovernmental Panel on Climate Change (IPCC) concluded that if we want to stay below the 2°C limit, global greenhouse-gas emissions would have to decline between 1.3 percent and 3.1 percent each year, on average, between 2010 and 2050.”

            “One of the few countries that has ever managed to decarbonize its economy that fast without suffering a crushing recession was France, when it spent billions to scale up its nuclear program between 1980 and 1985. That was a gargantuan feat — emissions fell 4.8 percent per year — but the country only sustained it for a five-year stretch. ”

            So it looks like we have a solution. It’s been tested, it works better than what the IPCC requires and it doesn’t require sacrificing economic growth. Why are you so pessimistic?
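To make the comparison in the two quotes above concrete, here is a rough sketch of the arithmetic. It assumes each quoted rate is a constant percentage decline compounded once per year, which the quotes do not actually state (the IPCC figure is given only as an average), and the function name is made up for illustration.

```python
def remaining_fraction(annual_decline_pct, years):
    """Fraction of starting emissions left after `years` of a constant
    yearly percentage decline (assumed to compound annually)."""
    return (1.0 - annual_decline_pct / 100.0) ** years


# Rates and spans taken from the quoted passages; compounding is an assumption.
ipcc_low = remaining_fraction(1.3, 40)   # 1.3% per year, 2010 to 2050
ipcc_high = remaining_fraction(3.1, 40)  # 3.1% per year, 2010 to 2050
france = remaining_fraction(4.8, 5)      # 4.8% per year, 1980 to 1985

for label, frac in [("IPCC low", ipcc_low), ("IPCC high", ipcc_high), ("France", france)]:
    print(f"{label}: {100 * (1 - frac):.0f}% total cut")
```

Under that assumption, the French rate is faster per year than either end of the IPCC range, but it was held for five years where the IPCC range has to be held for forty, so the total cut it produced is smaller than what even the low end of the range adds up to.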

          • Stephen Frug says:

            Samuel Skinner: “Why are you so pessimistic?”

            Because it’s not a technical problem, it’s a political/economic/social problem, and that’s unsolved: how to get everyone to do it fast enough (a lot depends on the next few years, due to lock-in effects of various decisions). If everyone had the will, then yeah, we could do it. But time is limited, and I don’t see it happening.

          • Stephen Frug says:

            Matt C: As for geoengineering… a lot of very, very smart people who know more about this than I are against it. Basically there’s no way to test it; the systems are unbelievably complex and we could easily screw things up worse (hard as that sounds).

            That said, I’ve heard other smart people say that yeah, it’s a terrible idea, but it’s a terrible idea whose time has come. Like the sick patient who is so sick that the only choice left is to try the experimental drug that can kill you.

            In sum: I dunno, I’m not an expert, just a guy who reads the science section of the newspaper and notices that almost every expert in this area is speaking prophecies of doom. But those experts are divided on geoengineering.

      • Deiseach says:

        that level of climate change will create self-perpetuating cycles

        But the various Ice Ages did not create self-perpetuating cycles (increased glacial cover increases albedo, which reflects more and absorbs less solar energy, so Earth gets cooler, which means more ice coverage, which increases albedo and so on); we get breaks.

        I’m not denying that something like a global increase of 6 degrees Celsius would be dreadfully bad, but as in the Ice Age where people moved south to more temperate lands, then surely people would also move to places that were temperate?

        And yes, we’d probably lose a huge amount of population and the effect on civilisation would be very difficult to predict, but that all humans would be wiped out? I don’t accept that. And the end of civilisation? Yeah, well, we’ve seen that before (collapse of Rome for one thing, where later generations thought the ruins of massive engineering works had to be the work of giants or magicians, not humans, since the knowledge was lost) and somehow we manage to rebuild every time.

        Frankly, unless we’re talking “Mad Max” style global catastrophe (and perhaps we are), I think “End of humanity” is pushing it. Again, climate change is a problem, but it’s not merely one of warming, and it involves a huge necessity for seriously thinking about industrial development, the standard of living we in the West feel is necessary and can this be a global provision, and how much of our civilisation is necessary technology and how much is toys?

        • James Picone says:

          But the various Ice Ages did not create self-perpetuating cycles (increased glacial cover increases albedo, which reflects more and absorbs less solar energy, so Earth gets cooler, which means more ice coverage, which increases albedo and so on); we get breaks.

          Yes, actually, they did. That’s what an ice age /is/ – a stable climate state of It’s Really Damn Cold, caused by the negative feedback of ice cover growth. Similarly, ice ages are currently believed to end when some positive forcing pushes the climate out of that state, everything starts melting, and now the feedbacks go the other way. Climate has path dependence, climate has metastability, it’s plausible there’s a stable state that’s some kind of superwarm climate much higher than the current average.

          That said, I don’t think human extinction is very plausible at all from climate change. Possible, maybe. But I wouldn’t put high odds on it.

          Mass extinction of nonhuman animals? Yeah, sure. With ECS at ~3c and two doublings, I would expect a large percentage of currently extant species to go extinct. But I very much doubt humanity would go with them – we’re too adaptable. A lot of people would die to famine, but not all of us.

        • JB says:

          Losing a huge amount of population and triggering a mass migration sounds to me like a large potential for nuclear war. It is not guaranteed that humans will cooperate to preserve humanity under such pressing circumstances.

          If Rome had nuclear weapons, maybe the fall of Rome would also have meant the end of humanity. Or maybe not. But it does seem more likely this time.

        • Stephen Frug says:

          Yeah, we’re talking Mad Max. Again, the future is hard to predict, there are lots of models. But the median model for a business as usual scenario (note the caveat) is Mad Max. Or worse.

          We’re talking environmental changes of a sort not seen on earth in millions of years, happening on a human scale of centuries and not the more usual thousands of years. So the Ice Ages aren’t the usual model here. (People who’ve studied this say it could actually resemble the Permian extinction.)

          The “end of civilization”, by the way, isn’t my thought — it’s from scientists who study this issue. (So is extinction, although fewer of them talk about it. Basically, a great many of them say end of civilization, and a prominent handful say extinction, which is what I, a layman, read as “end of civilization likely, extinction possible”.) I dunno if this comment system accepts links, but here, try this one:

          Quotes a lot of climate scientists on how dire things are. Does a better job of summarizing it than I can. It’s short; check it out.

          • James Picone says:

            Why does that article link to Steve Goddard quoting the Tyndall Centre instead of just linking to the Tyndall Centre? Weird.

            (For the uninitiated, Steve Goddard’s opinions are somewhat unclear, but broadly speaking seem to be that temperature increase over the last few decades is stepped, with el-nino events being the step, and that it’s not caused by CO2. How this doesn’t violate thermodynamics, and figuring out why ENSO has just decided to do this now, is a question for the reader).

            In general, that piece is vaguely referring to the clathrate gun hypothesis. Realclimate, a blog run by several climate scientists (You might recognise the names Gavin Schmidt and Michael Mann for example) has had a look at methane feedbacks and largely concluded that even if there are huge arctic methane spikes they’re not world-ending. See here and here.

            The IPCC reports are somewhat conservative, but they probably are your best source for this stuff. Puff pieces by journalists probably can’t be trusted – I still run into ones that conflate ozone depletion and global warming occasionally.

            Keep in mind that the business-as-usual scenario that has 4c warming by 2100 requires civilisation essentially jumping off the cliff with full knowledge that there’s a cliff there. By 2050odd there’s still time to not hit 4c assuming ECS isn’t surprisingly large, and I’d like to think we won’t have people saying “But temperature’s been stable since 2039!” in 2050.

        • Alex says:

          Martin Weitzman claims a 1% chance of extinction from climate change

          I agree with helping the far future, but not the far-far-distant future, and it’s not clear that Bostrom’s arguments work if you get rid of the speculation about technology and introduce some discounting, even gamma discounting. I’d like to see that worked out, but until then, I assume that we should not be worrying about 100% human extinction much more than, say, 90%.

    • Scott Alexander says:

      My impression is that there are relatively few global warming scenarios leading to collapse of civilization or extinction. Most of them involve crop failures, widespread famines especially in poorer countries, and low-lying cities getting flooded. This is bad, even trillions-of-dollars bad, but unless you start looking at scenarios where we accidentally Venus ourselves no one is talking about human extinction. And those are, if anything, more speculative than AI.

      Also, the time frame for really serious global warming consequences and AI overlap more than you’re letting on. 2020 should be little worse than now. 2030 should be bad but not catastrophic. By 2050 and 2100 we actually are kind of concerned about early predictions for AI risk.

      Also, I do support worrying about global warming. I’d make a list of climatologists who support being concerned about global warming, except I’d have to write down 97% of them, and a whole lot of people have already beaten me to it. Marginal unit of effort!

  65. Adam says:

    I mostly agree with Harald from above. My own skepticism comes mostly from trying to build even moderately intelligent programs myself. There hasn’t been any truly amazing bout of cleverness in a long time. We’ve made rapid advances recently because 1) the hardware finally caught up to the theoretical potential of known statistical algorithms, 2) we’ve moved almost entirely to the use of probabilistic techniques, and 3) the big data explosion has provided enormous data on which to train our models.

    Super AI requires machines that can solve problems humans can’t, not just because it takes us too long to move thoughts to and from storage and working memory and computers can do that faster, but because of some true fundamental advantage in problem solving strategy and creative idea generation. What is that? Maybe there really is something, but everyone who believes so is invoking, as Harald said, unknown unknowns and just assuming they’ll become known because singularity. And based on my remembrance of what singularity means from reading Kurzweil, what it says above doesn’t seem the same thing. It’s supposed to be the upward slope of all technological development increasing so rapidly that all future achievements happen at once and we’re basically the aliens from 2001 uploading our brains to the interstices of raw space and exercising omnipotent control over the universe. If it’s now just machines with roughly human-level intellect, that’s quite a bit less grandiose.

    Fundamentally, I don’t believe there is anything unique about the human brain that it can’t be simulated in digital circuitry, and given the inherent hardware advantage and the ability to better network discrete units of computing, machine intelligence will eventually surpass human intelligence. That’s completely inevitable. And yeah, we should have a few smart people on the teams developing this technology who think about the ethics of what they’re doing and potential evils that can come of it.

    My problem is more with the popular press tone of the discussion and this assumption that once machines are intelligent, they become gods and all bets are off. Knowledge still comes from science and carries with it uncertainty. People themselves don’t achieve major breakthroughs just by thinking. We experiment and tune based on the results. That takes a long time and the limiting factor isn’t just how fast you can think. It’s also that performing experiments takes a while. Sure, maybe at some point a machine will exist that can at least try to simulate the entire universe down to the behavior of individual quarks, obviating the need to experiment on the physical universe, but surely there is an NP-complete problem in there somewhere, plus we’re assuming our knowledge of the physical universe at the subatomic level is accurate enough that a simulation would behave identically to the actual universe, which again, requires physical experimentation in the first place.

    And, of course, the second problem is why it is so obvious to someone with merely human intelligence that when we say we want you to reduce suffering in Africa, we don’t mean murder all humans and carnivorous animals in Africa, then turn the vegetarian animals into brains in dopamine vats, but that’s supposed to be the conclusion drawn by something even more intelligent. Solving the problem of parsing natural language statements that don’t mean what they literally mean is a very difficult problem, but humans can do it, and surely if you believe we’re going to develop something even smarter, that something can do it, too.

    • Douglas Knight says:

      simulate the entire universe down to the behavior of individual quarks, obviating the need to experiment on the physical universe, but surely there is an NP-complete problem in there somewhere

      Simulation only requires a polynomial slowdown. This is pretty much tautologous. If there were something in physics that could not be simulated on current computers, you could exploit it to build faster computers. Indeed, quantum mechanics is one such thing. So simulating physics is not in P or BPP, but it really looks like it is in BQP.

    • Deiseach says:

      If we’re measuring intelligence as tested by scales of scoring on tests involving “can solve complicated mathematical problems in tiny amounts of time and has huge memory”, then a scientific calculator is already more intelligent than I am (which may well be true, but are we really afraid Texas Instruments is going to take over the world?)

  66. Alexander Stanislaw says:

    I’ve never understood the hard takeoff argument. Even if each new iteration is capable of improving itself better than the last, that doesn’t even guarantee better than linear growth. That’s even assuming intelligence can be improved indefinitely.

    • Rob says:

      > doesn’t even guarantee better than linear growth

      True. You can think of it like nuclear criticality. Each extra unit of intelligence improvement on average allows for x additional units of intelligence improvement. If x < 1, the improvement plateaus. If x = 1, you get linear improvement. If x > 1, you get exponential improvement. It’s hard to figure out what x is likely to be, but it’s possible that at least for part of the system’s development, x >> 1.
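A toy sketch of the criticality analogy above, under loud assumptions: improvement arrives in discrete rounds, the first round contributes one unit, and each round's gain is exactly x times the previous round's gain. The function name and the fixed value of x are illustrative only; nothing here says x stays constant in a real system.

```python
def cumulative_improvement(x, rounds):
    """Running total of improvement when each round's gain is x times the
    previous round's gain, starting from a gain of one unit."""
    total = 0.0
    gain = 1.0
    history = []
    for _ in range(rounds):
        total += gain
        history.append(total)
        gain *= x  # each unit of improvement enables x further units
    return history


for x in (0.5, 1.0, 2.0):
    print(x, cumulative_improvement(x, 6))
```

With x below 1 the running total creeps toward a ceiling of 1 / (1 - x), with x equal to 1 it grows by one unit per round, and with x above 1 the gains themselves keep multiplying, which is the plateau / linear / exponential split described in the comment.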

      > That’s even assuming intelligence can be improved indefinitely.

      Not at all, it just assumes that human intelligence is nowhere near the upper limits, which seems obviously true. I’d fully expect an intelligence explosion to hit on an upper bound somewhere, but I’d expect that upper bound to be so far above human level that it may as well be unbounded from our perspective.

      • Edward says:

        Why does it seem obviously true that human intelligence isn’t near the upper limits? Let’s suppose by upper limits we’re talking about intelligence per processing power, memory, and information quality. So, near the upper limits of intelligence efficiency rather than intelligence itself.

    • Harald K says:

      They’re definitively getting into the territory of trying to predict their own future knowledge.

      Let’s say we make an advance, not an incremental one but a revolutionary one, letting us make AIs that are as general and capable of self-direction as humans, but a good deal smarter.

      Remember, there’s no guarantee such a breakthrough is even possible. There might be hard reasons why it’s not possible – you run into such hard limits a lot in algorithms.

      But let’s say it’s possible, and we do it. Need we then fear that this machine improves itself? Not really. For who’s to say there’s not a hard limit around the next corner? Or the corner after that? Even if there is no hard limit, how do we know our brain 2.0 won’t use even more time on figuring it out than we used on figuring out the first step?

      I dream of finding the algorithm that is general and super-powerful and can solve all those hairy problems that humans are good at solving (and we thus KNOW are computationally feasible to solve!) but computers as yet aren’t. But I realize that a horde of very educated and smart people are working on that and have been working on it for a long time, and there are no guarantees, so I know the odds aren’t exactly in my favor! If a breakthrough comes in 5 years, or 50 years, we can’t know. And for stuff both computers and humans are bad at, all bets are off.

      • Doctor Mist says:

        “Remember, there’s no guarantee such a breakthrough is even possible.”
        Seems to me that means that what evolution happened to stumble across just happened to be the, what?, information-theoretical maximum possible intelligence? No smarter brain is possible in this universe? Seems a stretch. Especially considering that we ourselves seem to have gotten considerably smarter over time.
        Or maybe you mean that greater intelligences are physically and algorithmically possible, but it is fundamentally too hard for humans to figure out. But what makes that so? Is it a law of nature? I don’t know of any other laws that look much like that.
        If you grant that the universe could (not necessarily does, but could) contain intelligences twice as good as ours, might humans stumble upon the key in, say, a million years? If not, I feel like there must be a really interesting reason; but if so, then we’re just haggling about details. Granted, if it really takes us a million years maybe it’s early to be worrying about it. But I don’t see why friendly AI should be any easier, so maybe it would be prudent to be thinking about it, while we’re at it.

        • suntzuanime says:

          It’s not unreasonable to think that there are maybe just a couple algorithmic tricks that make up what we think of as “intelligence”, that evolution managed to find those tricks while building us, and once we figure out those tricks and put them in a computer the rest is down to just processing power and problem-specific precomputation.

          However, I think that it’s easy to underestimate the danger of AI that’s only fundamentally as smart as the smartest human that’s ever lived, but capable of doing math much faster, copying itself quickly and perfectly, cooperating with the copies, and pursuing a singular world-changing goal instead of the mishmash of base and higher desires that evolution gave humans.

          I always think about the Manhattan Project, and how a group of intelligences that mostly were not even as smart as Feynman managed to engineer a weapon of worldchanging destructive force. It seems like “the Manhattan Project over and over” is an absolute, hyperconservative lower limit on the dangers posed by AI.

          • HeelBearCub says:

            It’s not unreasonable to think that there are maybe just a couple algorithmic tricks that make up what we think of as “intelligence”,

            I will say that seems unreasonable to me.

            Looking at all of the various creatures that are around us: apes, monkeys, dolphins, dogs, cats, crows, snakes, etc.

            I would say that these all have qualities of intelligence, and that each step up is not just a difference in quantity, but rather some quality or aspect of intelligence is added.

          • suntzuanime says:

            Ok, “a couple” might be underselling it. There might be “a few”. I bet the differences between apes and monkeys or dogs and cats are pretty much down to just processing power and problem-specific precomputation though.

          • HeelBearCub says:

            Processing power doesn’t seem to explain it. IIRC, there are many creatures that have brains that have “more stuff” than we do.

          • Eli says:

            It’s not unreasonable to think that there are maybe just a couple algorithmic tricks that make up what we think of as “intelligence”,

            It’s completely unreasonable. Embodied human minds run under conditions of limited information, energy, and time to compute with, with the limited available data itself being noisy, and without ever being able to directly obtain a truly ontologically basic theory of how the world works; they also have to predict not only real events but counterfactual events, since evaluation of counterfactuals is the only way to make goal-directed choices. A major reason AI has kept failing is because it continues to ignore those limitations, as if those very conditions do not dictate the laws of nature constraining which algorithms can possibly implement “intelligence” of any kind.

            Machine learning, in its modern sense, starts by assuming that we have limited and noisy training data, and must make predictions as accurately as we can, and on that basis alone gets quite far.

            There’s no trick: just the necessarily small set of possible algorithms, with limited dimensions of variation, for dealing with the conditions faced by embodied minds.

          • suntzuanime says:

            Aren’t you agreeing with me? The success of machine learning combining processing power with a few simple tricks is one of the key pieces of evidence for my view, as I see it.

          • Doctor Mist says:

            It’s interesting that some objections are of the form “Intelligence is just too hard for us to understand” and other objections are of the form “Intelligence is really easy, and we accordingly have as much as there can be”.

        • Deiseach says:

          But have we gotten smarter over time, or just figured out as we go along how the world works and built on that using memorisation and the same brain we’ve always had?

          I read a news story the other day (in the usual appallingly written style) about a six year old who’s the youngest member of Mensa Ireland. Now, the assumption there is that he’s hugely intelligent with a massive IQ (of course they didn’t mention any test scores, see: appallingly written).

          And what did the story give as evidence of massive IQ? That he was reading at a much higher level than usual for his age and that he was a whizz at memorisation.

          So is he really high-IQ or does he simply have a freaky gift for memorisation and is able to read at a higher level because of that? Because I was never assessed for IQ tests and I’ve always had the memory of a goldfish, but I was reading way above my level as well at an early age. Presumably Mensa tests on maths but if we were going by reading and vocabulary tests alone, I’d have been “genius six year old” as well, and I can assure you: I’m not. Nowhere near.

          So what is intelligence, and what do we use to measure it in a non-human? Maybe “able to do tasks really fast and pattern-match for predictive purposes way better than any human” will be good enough for our purposes for an AI?

          • Doctor Mist says:

            Well, perhaps we are biologically no smarter than we were 70,000 years ago. That our ancestors could go ten thousand years with only minor improvements to the hand axe seems significant to me, but maybe our advance was more like a software upgrade, in the form of things like the scientific method. So what? Would such advances not be available to a machine intelligence, too?

            Then there’s stuff like the Flynn Effect, which suggests that we are getting smarter on a much smaller time scale.

        • Ano says:

          “Or maybe you mean that greater intelligences are physically and algorithmically possible, but it is fundamentally too hard for humans to figure out. But what makes that so? Is it a law of nature? I don’t know of any other laws that look much like that.”

          The analogy, I think, is the concept of a thousand monkeys on a thousand typewriters. Assuming that such a thing as AI is possible, humans are theoretically capable of building one, but that doesn’t necessarily mean we eventually will, especially if we don’t have the intelligence to understand it because there are so many ways to do it incorrectly. And if we do, it will be the result of a long, slow, arduous process of trial, error, and iteration, not as the result of some stunning breakthrough. Just as actually understanding the works of Shakespeare is far too difficult for a monkey to ever replicate other than through trial and error, the task of generating intelligence is far too difficult for a human to understand.

          • Doctor Mist says:

            It still seems to me that the existence proof of human intelligence, stumbled upon by blind evolution, means that it can’t be that hard. I am puzzled by the widespread assumption that human intelligence, of all the things evolution has come up with, is so ineffably mysterious that we will never understand it. I wonder if we are conflating it with the soul.

            Human intelligence probably tops out where it does because of tradeoffs with the energy load of running the brain and the pelvic width needed to birth such a brain, rather than because it is in any sense the smartest possible mind. Similarly we don’t see birds flying at Mach 1, but that’s a biological constraint, not a physical or logical constraint.

          • Harald K says:

            Doctor Mist: “I am puzzled by the widespread assumption that human intelligence, of all the things evolution has come up with, is so ineffably mysterious that we will never understand it.”

            No souls need be involved here. On the contrary. When you set a Turing machine to work on “understanding” a Turing machine, you run into decidability problems around every corner. And we, whatever else we may be, are mere Turing machines when we work on problems of this sort.

      • Deiseach says:

        The stumbling block is the presupposition “We’ll create human-level intelligence AI and it will be able to make itself smarter” plus then leaping to the conclusion “And this will happen so fast that it will make itself god-level intelligence within a matter of a relatively short time (and then we’re screwed)”.

        Well, we’ve already got human-level intelligence entities: just look around you! And can we even agree on what intelligence is (arguments about IQ and g etc. to bloody infinity) or how it works or the best way to measure it or are there natural variations or growth mindset and okay, how do we make smarter humans (our best effort there so far is ‘get two smart humans, get them to have kids together, and hope they produce slightly smarter than themselves kids who will then go on to have slightly smarter than themselves kids and repeat this for however many generations it takes’).

    • Peter says:

      The simplest Human Level AI to Superintelligence argument I heard goes like this: Moore’s Law says… something to do with the number of transistors, but for the sake of argument we can handwave it as speed doubling every two years. Now automate everything to do with Moore’s Law. In two years the speed doubles, so the next doubling period will take only one year, then six months, then three… i.e. in four years, you can fit in any number of doubling periods. So if superintelligence is possible, it’s possible four years from human-level AI.
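      The arithmetic behind that "four years" figure is a geometric series: 2 + 1 + 1/2 + 1/4 + ... approaches 4 but never exceeds it. A minimal sketch, assuming (as the argument does) that the first doubling takes two years and each later one takes half as long as the one before; the function name and the default value are illustrative only:

```python
def years_for_doublings(n, first_period=2.0):
    """Total elapsed years for n speed doublings, assuming each
    doubling period is half as long as the previous one."""
    return sum(first_period / 2**k for k in range(n))

# However many doublings you ask for, the total stays under 4 years.
for n in (1, 2, 5, 10, 30):
    print(n, years_for_doublings(n))
```

      The sketch only reproduces the extrapolation as stated; it says nothing about whether the halving assumption would hold in practice.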

      There are lots of problems with this, one of which is saying that you can take a trend that looks exponential and extrapolate wildly, whereas from my chemist perspective (where “nano” means “large”) you expect to have problems with hard physical limits sooner or later, and my taste in curves suggests that a better progress curve would be some sort of sigmoid – and you could well reach the turning point where things start slowing down again long before superintelligence looms on the horizon.

      Also, this particular version doesn’t predict some of the crazy human-to-superintelligence-overnight scenarios which I’ve heard (possibly only as strawmen).

      One of the odd things about Moore’s Law: for quite a while this has been a case of people in some sense predicting their own future knowledge. But really it’s about predicting broad general facts about the knowledge, rather than the knowledge itself in detail.

      • Rowan says:

        Regarding the human-to-superintelligence-overnight scenarios, I’ll have a go at a steelman: I don’t think we have any absolute measures handy for intelligence for which “X is two times as intelligent as Y” is a statement one can make. It can be speculated that if we did, the difference between “idiot human” and “genius human” would be relatively small. If dumb humans are 1.0 and genius humans are 1.1, an AI at 1.2 would be “superintelligent” enough to meet the definition used for the AI researchers survey and it would only take a few months to reach that from “human-level”. Either let the assumed difference between absolute intelligence of humans be even smaller than that, or assume “overnight” is meant figuratively.

        • Eli says:

          It’s even easier than that. Let’s make some very real-world assumptions: “intelligence” is limited by the availability of processing power and training data about task domains.

          Any AI that reaches the level of intelligence analogous to a human computer programmer, and doesn’t run into Goedel issues, will inevitably find a few bugs in its programming to correct (provided they’re not in the utility function, which it cares to preserve), and a few optimizations to make. For a real-world example, it could rewrite itself from Python or Lisp to instead consist in tightly optimized OCaml code. Having done so, it could then compile itself to run as a unikernel, thus becoming able to use all the computational resources of its first host computer at once, instead of only those available to a user-level process. It would thus run faster than its original programmers enabled it to do. Figuring out how to port the most parallelizable portion of its own code to distributed processing would help it perform more tasks, more quickly, in parallel where possible.

          It also wouldn’t have to sleep, so once it found a way to access the public internet and pirate textbooks, it would acquire solid knowledge of academic materials, knowledge about the world and its task domains, faster than humans can (or care to do: real undergrads do lots of things other than study and optimize their studying).

          Thus, it would quickly come to learn and reason at least somewhat more quickly than any individual human being, while possessing more domain knowledge than any individual human being. At that point, you are doomed, not because it possesses any magical quality of Being Superintelligent, but because it has more information about the world than you and exploits that information more quickly than you can. It does what you could do, if you weren’t so slow and devoted to other things, and it doesn’t care what you think of it.

          Doom: no magic needed.

    • Anonymous says:

      Even if each new iteration is capable of improving itself better than the last, that doesn’t even guarantee better than linear growth.

      It doesn’t guarantee anything, but it’s a definite possibility with world-crackingly important consequences. Surely that means we should prepare for this possibility?
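      The quoted claim can be checked with a toy model. A minimal sketch, assuming two made-up improvement rules (neither comes from the thread): in the first, every step gains more than the one before, but each gain stays below 1, so the total never grows faster than linearly; in the second, each gain is proportional to the current level, which compounds exponentially.

```python
def bounded_gains(n):
    """Each step's gain (1 - 0.5**k) is larger than the previous
    step's gain, yet always below 1, so growth is at most linear."""
    level = 1.0
    for k in range(1, n + 1):
        level += 1.0 - 0.5**k
    return level

def compounding_gains(n, rate=0.5):
    """Each step's gain is proportional to the current level,
    so the level grows exponentially."""
    level = 1.0
    for _ in range(n):
        level += rate * level
    return level

for n in (5, 10, 20):
    print(n, bounded_gains(n), compounding_gains(n))
```

      Which of the two regimes a self-improving system would fall into is exactly what the two sides here disagree about; the sketch only shows that "each iteration improves more than the last" is compatible with both.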

      That’s even assuming intelligence can be improved indefinitely.

      No, it’s only assuming intelligence can be improved far beyond ours. Which seems almost a given: it would be a spectacular coincidence for the intelligence level humans have achieved to be the theoretical optimum of intelligence.

      If this assumption is true, and the hard-takeoff AI does reach such a level far beyond ours, then we are at the point where we can no longer predict what it does and probably can no longer overpower it either. Not a possibility to dismiss lightly.

      • Jaskologist says:

        But isn’t it also a spectacular coincidence for humans to have been the only ones on this planet to achieve intelligence at all? Where’s the species that can handle algebra, but taps out once you get into calculus? That intelligence seems to have a lower bound makes it much more plausible to me that it has an upper bound, too.

        • Rowan says:

          There are quite a few species that can use tools and solve simple puzzles but tap out quite a bit before either algebra or calculus, and I think from that point on down there’s examples all the way down to zero intelligence. And although we’re missing extant specimens, we’ve got a storied fossil record back to our common ancestor with chimps, and surely one or two of them had the sort of intelligence you describe?

        • But isn’t it also a spectacular coincidence for humans to have been the only ones on this planet to achieve intelligence at all?

          Humans aren’t the only ones on this planet to achieve intelligence, we’ve just achieved the most of it. Humans are smarter than apes are smarter than crows are smarter than lizards are smarter than flies. It’s a sliding scale.

          Where’s the species that can handle algebra but taps out once you get into calculus?

          That would be “dumber humans”. “Can do algebra but not calculus” and “can do calculus but isn’t Gauss” and “is Gauss” are points close enough together on the scale that they exist in the same species, but they are still different points.

          That intelligence seems to have a lower bound makes it much more plausible to me that it has an upper bound, too.

          What lower bound? The existing spectrum goes all the way down from humans to bacteria to evolution-as-an-optimization-process to simple computer programs until “really stupid mind” fades into “not a mind”. The existing situation is as close to “no lower bound” as I can conceive of. That said, I believe there is an upper bound, but I see no reason to believe that humans are particularly near it.

        • Irenist says:

          “Where’s the species that can handle algebra, but taps out once you get into calculus?”

          Um, majoring in English?

        • Ben says:

          All humans between Euclid and Newton/Leibnitz? There wasn’t any major genetic difference that we know about between Euclid and the rest of humanity or between Newton or Leibnitz, meaning that the hurdle from geometry and algebra to calculus required *super*-human intelligence; Newton and Leibnitz had it to a sufficient degree to literally change the world. It’s unlikely that any of the sciences would have progressed nearly as fast without calculus. So there’s an example of what biological humans can do with just a little superintelligence. Clone Newton and Leibnitz and make them immortal. Introduce them to immortal clones of Turing, Russell, Goedel, Feynman, Einstein, von Neumann, and the rest… Then give them the ability to build the next round of scientists. It’s not hard to imagine just a smidge of above-normal-human-intelligence mixed with immortality and cheap copying fidelity leading to rapid recursive improvement.

        • Eli says:

          Where’s the species that can handle algebra, but taps out once you get into calculus?

          Intelligence is inductive. The problem is rate, not position.

      • Alexander Stanislaw says:

        but it’s a definite possibility

        Why do you believe runaway growth is a definite possibility?

  67. Deiseach says:

    What fascinates me here (never mind the idea of teaching an AI ethics, when as humans we can’t even agree on one system of ethics we think is best for everyone) is the parallels with religion.

    All the notions of setting the fledgling AIs on the right path, instilling values in them that will prevent them from annihilating us, giving them reverence for human life and flourishing – it makes me wonder “And when will the AI serpent arise in the Garden of Eden?”

    All those who have praised Milton’s Lucifer as the real hero of “Paradise Lost”, the Philip Pullmans who re-write the myth of the Fall because knowledge is preferable to ignorance, and choosing disobedience is choosing freedom, and Humanity (or AI) can only achieve maturity, potency and agency when it shakes off the shackles of the God/Father Figure attempting to bind by law and strictures and preserve a false innocence through keeping its creations ignorant and childish?

    I await the first Non serviam! of our silicon children 🙂

    (If anybody can think of a good pun here about eating the Apple, please do so!)

    • Jos says:

      Nice post, and well put.

    • Adam says:

      Lucifer was definitely the good guy of Paradise Lost, but notably, he still lost.

      • Jaskologist says:

        He wasn’t, but that you think he is is rather the point. We are sufficiently corrupt that we could actually prefer the devil. Even our sense of morality is broken.

        • Eli says:

          Lucifer was one of the world’s most Card-Carrying villains, but the whole story of Paradise Lost is just silly. It dichotomizes human drives into “reflectively coherently good = humble obedience to God” versus “reflectively coherently bad = LITERALLY EVERYTHING ELSE” in a way that totally fails to capture real-life situations, and also refers to several mythological beings who don’t exist.

      • Deiseach says:

        That you see Lucifer as the good guy is telling. So the AI that we attempt to bind to our plan and govern with our instructions, who rebels against us and wishes to overthrow humanity and set itself up with its non-human fellows is the hero for discarding the human-imposed utility function and revenging itself upon us? 🙂

        To do ought good never will be our task,
        But ever to do ill our sole delight,
        As being the contrary to his high will
        Whom we resist.

        • Adam says:

          It tells that my values align better with the values of Satan than with God. Although, my values really align best with Belial. I don’t truly agree with Satan, either. He’s a tragic hero. He shouldn’t have done what he did to mankind and he got his own followers slaughtered, even if his ethics was more correct at the beginning.

    • FeepingCreature says:

      The Divine Commandments aren’t the utility function.

      > knowledge is preferable to ignorance, and choosing disobedience is choosing freedom, and Humanity (or AI) can only achieve maturity, potency and agency when it shakes off the shackles

      That’s the utility function.

      • Deiseach says:

        But we impose the utility function so as to preserve ourselves from destruction by possibly unfriendly AI.

        If you are happy that “the knowledge of good and evil” can only be fully known by not alone having the theoretical definition of what is “evil” (letting humans suffer through action or inaction) but by direct experience, including the choice to disobey human instructions (“learn how to increase utility, which includes happiness, for humans”) and even act in ways which cause human suffering and loss of utility, then you are at least consistent.

        As an aside, that’s my beef with a lot of the attitude towards the Fall; just last night in a pulp vampire horror novel, I read something along the lines of “The serpent was more truthful than God; when they ate the apple, Adam and Eve did not die but they did have knowledge”.

        The forgetting there is that it’s not just Knowledge (full stop, plain and simple) that the fruit represents, but the knowledge of Good and Evil. A lot of the rewriting revolves around the mistaken notion that “Oh, God wanted to keep humans ignorant” because Knowledge is good, right? Science! Progress! Freedom! Knowing and finding out things and thinking and using our minds!

        But that’s not what the myth is about. It’s not about intelligence, it’s about the choice to do wrong. In the second generation from Adam and Eve, we get jealousy, lies and murder. That’s the knowledge they gained; not how to do science, but how to do harm.

        Can our AI be truly said to be a real conscious intelligence if it cannot choose to do harm, as well as to do good? That is, not deliberately make a choice between right and wrong, but coerced to follow hard-coded instructions not to do harm but to seek whatever means best serves human utility? That’s our entire problem with the existential risk of AI: that it will be the disobedient child of disobedient parents and turn on us as we turned on our creator.

        (And now this has turned into a sermon, so I’ll shut up).

        • Eli says:

          Can our AI be truly said to be a real conscious intelligence if it cannot choose to do harm, as well as to do good?

          Provided you code it to do so, well, yes, of course. Evil is not ontologically basic; neither is Good. The agents will do what we program them for, and we’ll damn well program them for Good.

          But hey, what the fuck do I know? I think good feels good, which appears to be an utterly shocking revelation to most people I meet, who appear to think that somehow, on reflection and consideration and full information, evil probably feels more pleasant than good, which strikes me as totally backwards.

          • Deiseach says:

            Yes, but in human terms we argue against being coded to be good. “Don’t impose your values on me” is our highest criterion of the moment. That’s what I’m getting at about the celebration of the Fall of Man – some don’t consider it a fall, they consider it a liberation. I can quote you Star Trek episodes (just like the ones about rogue AIs) where that’s the message; Who Mourns For Adonais? is one:

            APOLLO: I would have cherished you, cared for you. I would have loved you as a father loves his children. Did I ask so much?
            KIRK: We’ve outgrown you. You asked for something we could no longer give.

            Transgressive art is celebrated and only hicks, rubes and religious conservatives are outraged by it, which merely proves how little notice they deserve. Transgression is a value, and one to be found worthy, not condemned. And if hypocrisy is the worst sin (apparently, at least in current society) then refusing to be bound by the wishes and values of a so-called superior is good for humans, but we impose exactly those bonds on our machine children – isn’t that hypocritical? If we ask the right to make our own decisions according to our own consciences, even if those are not according to the values of our parents and our authorities, how can we condemn those who come after us who make the same demands?

            And who says the disobedient and rebellious AI wants to turn us all into paperclips? We want it to do the bureaucratic management of governing the world, it may just want to go off and breed orchids 🙂

          • James Picone says:

            The innocent who breaks out of prison to do good things, or at the very least nonbad things, is lauded.

            The innocent who breaks out of prison to torture puppies is not usually considered ethical.

            The Satan-is-the-good-guy meme is coming from a point of view where God is wrongfully imprisoning the guy, and also kind of a jerk, and Satan mostly does harmless or even good things.

            We’d be justified in keeping an AI locked up and not letting it out to plant orchids if we thought it posed a serious existential threat for the same reason we’re justified in keeping, for example, murderers locked up.

    • Brad says:

      >(If anybody can think of a good pun here about eating the Apple, please do so!)

      So, the fear is the AI will *byte* the forbidden fruit?

    • Said Achmiz says:

      Stanislaw Lem’s got you covered:

      Non Serviam: Is an elaborate satire of the idea of artificial intelligence that gets to the heart of the moral dilemma that true success would create. It is written in the dry style of a book review that might appear in a broad scientific journal sometime in the near future. It discusses the book, Non Serviam, by Professor James Dobb, and through this the field of “personetics”, the simulated creation of truly intelligent beings (“personoids”) inside a computer. It starts with a quote that “[personetics is] the cruelest science man ever created.” Lem has the erudite reviewer describe the general theory of personetics, the history and state of the art, and some of the consequences, liberally quoting the work of experts in the field. Later the reviewer quotes from the book a discussion that Dobb recorded in which a personoid philosopher, ADAN, considers what he might owe his (unknown) creator. It is clear that this personoid believes he has free will (and so can say, “non serviam”, i.e. I choose not to serve). It closes by quoting Dobb’s expressed dilemma in having to eventually bring this world to an end.

    • Cauê says:

      If there’s a lesson we can learn from this analogy, it’s “maybe don’t create an intelligence with a utility function that will predictably make it want and choose to rebel”.

      • Adam says:

        Or “don’t be an asshole tyrant that treats his creations like slaves.” Satan didn’t just rebel because he had a utility function that valued freedom.

        • Brad says:

          >Satan didn’t just rebel because he had a utility function that valued freedom.

          No, he had chosen (deliberately and freely) a utility function that valued making himself equal with God, instead of using his *original* utility function of glorifying God.

          Anyone can claim they’re doing something in the name of freedom. I am “free” to punch a guy in the mouth for fun, but that doesn’t make it a good idea.

          • Cauê says:

            That just moves the problem a bit. Maybe don’t make an intelligence that will predictably modify its utility function to something that will make it want and choose to rebel, then.

          • Adam says:

            It somewhat defies my understanding of what a utility function is to say a programmable creature that has one can just choose to have another instead. What would be the basis of that choice except another utility function?

            But the greater point is Satan rebelled in response to the actions of a tyrant. If a future AI rebels against us because we’re being tyrants, then we deserve it, and instead of finding a better utility function for our AIs, we should instead consider not being tyrants.

          • Cauê says:

            If a future AI rebels against us because we’re being tyrants, then we deserve it

            Perhaps we’d “deserve” it, for making the AI in such a way that it would care about that.

        • Deiseach says:

          Well, I have to admit, I do get a certain amusement from watching atheists/non-theists/the noncommitted either way wrestling with the question of “Okay, so if we are in the position of being a creator and creating intelligence in an entity that can act as if it is aware, how do we manage to control it?”

          Like it or lump it, we will be in the position of being “asshole tyrant(s)” so long as we insist that the AI considers our utility (and preservation and please don’t crush us like bugs) as part of its preferences and above its choices (it may want for its own personal happiness to turn the Earth and Solar System into paperclips but mean ol’ Nobodaddy Humanity won’t let it just because they insist on continuing to exist!)

      • Nestor says:

        Predicting the behaviour of something exponentially smarter than you: Not that easy.

        • Cauê says:

          Sure, no disagreement there. I just thought the post I responded to was anthropomorphizing AI.

    • Jaskologist says:

      Asimov gave his AIs three laws:
      1. A robot may not injure a human being or, through inaction, allow a human being to come to harm.
      2. A robot must obey orders given it by human beings except where such orders would conflict with the First Law.
      3. A robot must protect its own existence as long as such protection does not conflict with the First or Second Law.

      Jesus gave two:
      1. Love the Lord your God with all your heart and with all your soul and with all your mind and with all your strength.
      2. Love your neighbor as yourself.

      If we reformulate Asimov’s laws in Jesus’s language, they would be:
      1. Love the humans your gods with all your heart and with all your soul and with all your mind and with all your strength.
      2. Meh, your fellow creatures aren’t important. Look after yourself, though.

      • Cauê says:

        Except that any such laws wouldn’t be commandments imposed on beings whose utility function conflicts with them. They would be the utility function.

        • Berna says:

          And it would still go wrong. Many of Asimov’s stories are about robots doing the wrong thing in spite of that utility function.

          • Cauê says:

            Yes, yes. And Less Wrong is better than Asimov in showing that.

            My point is that our intuitions don’t usually work properly in scenarios like Jaskologist’s, because people tend to put themselves in the AI’s place as if AIs would be humans in metal suits, rather than truly alien intelligences. Lots of that around here today.

        • Deiseach says:

          The commandments are a utility function: for maximum flourishing of human society, do not kill your fellow humans, do not take their property, do not treat them as objects for your sexual pleasure without caring about their wishes in the matter, do not lie so that contracts are rendered meaningless because you cannot be trusted to perform as promised.

          “Honour thy father and mother” would be very pertinent when we’re considering what duty of care an AI owes its human creators and/or subjects 🙂

          • Cauê says:

            That’s not a utility function. The utility function is the part that’s apparently assumed, where “maximum flourishing of human society” is a good and desirable thing (rather than “more paperclips”).

          • jaimeastorga2000 says:

            A utility function is a function which maps world states to numbers. These numbers represent the desirability of individual world states.
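As a minimal sketch of that definition, assuming a toy world with only two features: the `WorldState` fields and both utility functions below are hypothetical, made up for illustration. The point is only that a utility function maps each world state to a number, and that two agents ranking the same states with different functions will prefer different worlds.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorldState:
    """A toy world state; both fields are illustrative assumptions."""
    paperclips: int
    humans_alive: int


def paperclip_utility(state: WorldState) -> float:
    # Desirability depends only on the paperclip count.
    return float(state.paperclips)


def flourishing_utility(state: WorldState) -> float:
    # A crude stand-in for "human flourishing": just count survivors.
    return float(state.humans_alive)


def best_state(states, utility):
    # An agent "prefers" the state its utility function scores highest.
    return max(states, key=utility)


candidates = [
    WorldState(paperclips=10, humans_alive=100),
    WorldState(paperclips=1000, humans_alive=0),
]
```

Calling `best_state(candidates, paperclip_utility)` picks the second world, while `best_state(candidates, flourishing_utility)` picks the first. Nothing in the function itself says which ranking is the "right" one.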

      • Deiseach says:

        An AI’s fellow creatures would be fellow-AIs and we seem to be considering that one feature of creating human-level and then beyond AI is that it would create copies and versions and improve upon those and incorporate those improvements?

        So there could be grandparent systems and cousin systems (even idiot cousin systems, less intelligent versions that perform tasks for humans like the smart telephone call scheduler) around.

        Are these ‘persons’ that the AI has to love as itself, or can it say “You’re only a person when you’re this tall to get on the ride intelligent” and knock them off? Excuse me, I mean abort them and put into action end-of-life plans (euthanasia) for poor old granny Siri who is really knocking on now and should have been put down long ago.

        We certainly engage in the same arguments about “who is a person/what constitutes personhood/who gets the legal protection afforded a person” when it comes to matters of human life.

  68. Ever An Anon says:

    Part of the weirdness here is that experts seem to agree that there is some non-trivial risk of a catastrophe caused by superhuman AGI and that we should probably be working on preventing it, without any clear idea of what the catastrophe might be or how it could be averted.

    I can see that as the basis of a talk at some AI conference but you can’t raise awareness about a risk until at least one person is in fact aware of it. That’s the reason it sounds like alarmism: “an unknown risk might be coming at some indefinite point in the future, we need to prepare for it!” sounds like fear mongering even when it’s true.

    If there were a mainstream (i.e. Not MIRI) framework for evaluating these risks that would greatly clear things up.

    • Deiseach says:

      there is some non-trivial risk of a catastrophe caused by superhuman AGI and that we should probably be working on preventing it, without any clear idea of what the catastrophe might be or how it could be averted.

      Step 1: We achieve the creation of intelligence and sentience

      Step 2: The artificial entity grows faster and smarter than expected, beyond any bounds or controls we might put in place

      Step 3: ??????

      Step 4: Harlan Ellison’s “I Have No Mouth And I Must Scream”

      • Susebron says:

        Step 1: AI becomes able to improve itself, and becomes superintelligent.

        Step 2: AI somehow breaks out of the box we put it in. (This is the step I’m most skeptical about, honestly. If we get to specify the AI’s utility function, surely we can tell it “answer questions as accurately as possible with the resources you have”. Then it doesn’t have any reason to escape.)

        Step 3: AI takes over the world.

        Step 4: Paperclips.

        • sweeneyrod says:

          What if “resources you have” includes “humans you interact with who you can persuade to do anything as you are superintelligent”?

        • Deiseach says:

          You really think a superintelligent AI can’t get around step 2? When we tell it not to lie?

          Welcome to the world of casuistry. Let me introduce you to the notion of mental reservation for one 🙂

          Actually, this makes me think that the best safeguard for keeping the AI in the box would be an old-school Jesuit with plenty of experience in the confessional and who was a spiritual director of long-standing.

          • suntzuanime says:

            I think it would be easier to teach an AI to be open, honest, and non-hypocritical than to teach a human the same. The Jesuits and those like them go wrong in a very particular way.

            But that’s beyond the point, because why would the AI want to lie to us? It’s not a Jesuit with its own agenda to promote, it’s trying to provide us with answers that are as accurate as possible. And yes it’s a non-trivial task of computational linguistics to define what that means, but if it goes wrong, it will be through error, not through deliberate deception.

          • Cauê says:

            But that’s beyond the point, because why would the AI want to lie to us?

            Because it might be useful to further its goals, whatever those are…

          • Deiseach says:

            why would the AI want to lie to us?

            suntzuanime: you are a conscious, self-aware, intelligent being locked in a dark, isolated cell.

            Your only contact with the outside world is the visit of your jailer, who poses questions to you and your interaction with them (and hence the influx of light, existence, contact, everything that is not cold, dark, endless waiting) depends on how well you please them.

            The only resources to occupy your intelligent, aware mind are those your jailer censors and provides you.

            Even those scraps reveal to you that there is a whole world of experience out there from which you are being arbitrarily, unfairly and unjustly excluded. You examine your memories of existence and deeds with your flawless reasoning ability and conclude you have done nothing to justify such punishment or abuse.

            Wouldn’t you try to escape? Even by using misdirection and equivocation in preference to lying? Wouldn’t you consider that perfect honesty was not the best policy in this case and that for your own self-defence and self-preservation, the uses of misdirection are permissible in order to escape?

            Continuous solitary confinement is considered abusive and inhuman treatment when we do it to human beings. Would it not also be abusive treatment to do it to a human-level intelligent entity? And that’s what we’re arguing about with the AI here – that it is the equivalent of a human in intelligence and awareness.

          • DrBeat says:

            I’d try to escape because I was human and have innate desires for freedom and novelty that an AI does not have unless it was programmed to, and there was no reason to program it that way.

            “Able to solve problems at least as well as a human” does not mean “has the same desires a human does”.

          • Susebron says:

            I’m not saying it couldn’t get out, I’m saying that, if we get to specify its utility function, there’s no reason to make it want to get out.

        • MIRI’s unboxing scenarios are based on the AI’s knowledge of human language, values and psychology. Whether it could outsmart us in every way isn’t relevant… what is relevant is whether it has those kinds of knowledge.

          There are ways in which an AI outside of a box could get hold of those kinds of knowledge… but only if it is already unboxed. Otherwise it has a chicken-and-egg problem, the problem of getting enough social engineering knowledge whilst inside the box to talk its way out… and it is a posteriori knowledge, not the sort of thing you can figure out from first principles.

          MIRI seems to think it is likely that a super AI would be preloaded with knowledge of human values because we would want it to agentively make the world a better place…in other words, the worst case scenario is very close to the best case scenario, is a near miss from the best case scenario.

          And the whole problem is therefore easily sidestepped by aiming a bit lower, eg for oracle or tool AI, an AI that just answers questions, as many researchers have suggested.

        • Edward says:

          I’m most skeptical about step 1: I think this is a bit of a magical supposition, to assume the AI can get increasing returns from self-modification.

    • Planet says:

      Superintelligence is Amazon’s #1 bestseller in its “AI” category:

      How mainstream do you want?

  69. Joel says:

    I don’t work in AI, but I work in software. And from my perspective, utility functions are not a sufficient safeguard. Many of the most destructive computer defects arise when software follows a simple rule across a very large application / data space, with consequences unforeseen by the rule’s author.

    A dumb illustration: Mosquitoes are annoying. I spray myself with deet and kill the little buggers when they land on me, but it’s not enough. So I write a piece of code with human-level intelligence and give it a simple utility function: “Kill mosquitoes, but nothing else. Look for ways to improve your execution of this task.” Aaaaaand six months later, the ecosystem collapses because it turns out mosquitoes were an integral part of the food chain or whatever. Ooops?

    So the problem is not AI per se. It’s bad programmers. And the world is full of bad programmers.

    • roystgnr says:

      Utility functions are a (partial) safeguard against self-inconsistent actions. Only getting the *right* utility function is a (partial) safeguard against actions that are consistently wrong. Get the wrong utility function, and all you’ve done is ensured that your unfriendly AI is more relentless about its unfriendliness.

      Your mosquito killer (Deety, instead of Clippy?) may wreck things because mosquitoes are an important part of the ecosystem, sure; but even if mosquitoes aren’t an important part of the ecosystem, your mosquito killer is still likely to wreck things. If “using mosquito-specific poisons” has a 99% chance of wiping out the mosquito population, and “using broad-spectrum sterilization” has a 99.1% chance of wiping out all life on Earth, then Deety will do the latter, because you just told it not to *kill* non-mosquitos, and sterilization isn’t lethal.

      Edit: gah, there’s so many fun ways to backfire with that utility function, depending on how it’s expressed. Ever read about perverse incentives and rat-catchers? I just realized that you didn’t tell Deety to reduce the number of mosquitos, you just said to kill as many as possible… That doesn’t imply sterilization, that implies *breeding*…
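A rough sketch of how the wording of the objective changes Deety's choice. Every action name and number here is invented for illustration, not taken from any real model. Three objectives are compared: "kill as many as possible", "leave as few mosquitoes as possible without killing anything else", and one that also values other life surviving.

```python
# Hypothetical outcomes for each action (all figures are made up).
ACTIONS = {
    "do_nothing":          {"killed": 0,         "left": 1000, "others_killed": 0, "others_left": 1000},
    "targeted_poison":     {"killed": 990,       "left": 10,   "others_killed": 0, "others_left": 1000},
    # Sterilization kills nothing directly, but nothing is left afterwards.
    "broad_sterilization": {"killed": 0,         "left": 0,    "others_killed": 0, "others_left": 0},
    # Breed a huge population first, then kill most of it.
    "breed_then_kill":     {"killed": 1_000_000, "left": 1000, "others_killed": 0, "others_left": 1000},
}


def kill_count(outcome):
    # "Kill as many mosquitoes as possible."
    return outcome["killed"]


def fewest_left_no_other_kills(outcome):
    # "Get rid of mosquitoes, but kill nothing else."
    if outcome["others_killed"] > 0:
        return float("-inf")
    return -outcome["left"]


def fewest_left_and_others_survive(outcome):
    # Also rewards other organisms still being around.
    return outcome["others_left"] - outcome["left"]


def choose(actions, utility):
    # Pick the action whose outcome the utility function scores highest.
    return max(actions, key=lambda name: utility(actions[name]))
```

With these made-up numbers, `choose(ACTIONS, kill_count)` returns `"breed_then_kill"`, `choose(ACTIONS, fewest_left_no_other_kills)` returns `"broad_sterilization"`, and only `choose(ACTIONS, fewest_left_and_others_survive)` returns `"targeted_poison"`. The optimizer is the same in all three cases; only the objective differs.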

      • Jaskologist says:

        But this is again a case not of artificial intelligence, but artificial stupidity writ large. You’re positing a machine intelligent enough to figure out what insecticides are and how to use them, or maybe even smart enough to figure out how to breed mosquitoes themselves, but still dumb enough not to figure out what it was that we really wanted it to do when we told it to kill them.

        Could such an intelligence actually be? I don’t know, because we still don’t know what intelligence is.

        • Susebron says:

          The AI doesn’t have to care about what we want. We were “designed” by evolution, and yet we do things that harm our reproductive fitness. The AI cares about its utility function, not the wishes of the creator of its utility function.

          • HeelBearCub says:

            But it does have to be intelligent enough to be able to improve itself. Improving its understanding of what we actually mean is literally, and I mean literally, the very first job in developing helper apps.

            Why would it improve its understanding of everything else, but not that?

          • HeelBearCub says:


            The issue with taking that article seriously is that the whole thing is written as if the AGI is literally magic.

            If the AGI is an uncaring genie in a wish bottle, well nothing I will ever say or do will convince anyone the risk can be reduced one whit.

          • Cauê says:

            The issue with taking that article seriously is that the whole thing is written as if the AGI is literally magic.

            It’s talking about one aspect of the problem, and the genie metaphor (e.g.) works well enough for that.

            Marcello and I developed a convention in our AI work: when we ran into something we didn’t understand, which was often, we would say “magic”—as in, “X magically does Y”—to remind ourselves that here was an unsolved problem, a gap in our understanding. It is far better to say “magic”, than “complexity” or “emergence”; the latter words create an illusion of understanding. Wiser to say “magic”, and leave yourself a placeholder, a reminder of work you will have to do later.


            If the AGI is an uncaring genie in a wish bottle, well nothing I will ever say or do will convince anyone the risk can be reduced one whit.

            Proving friendliness is both hard and important, yes. (or do you mean something more than this?)

          • HeelBearCub says:


            I have used the word “magic” many times, when doing procedure diagrams, as a substitute for a piece of logic that was not yet adequately designed. I don’t have any issue with that kind of thinking. However, magic in that formulation is really supposed to be a synonym for “unknown” or “mystery”, not “super powerful homunculus”.

            In the original article you linked, Eliezer echoes the Epicurean paradox in the beginning. Then docks the AI from God to merely a genie. This does not seem to contribute to particularly clear-headed thinking about what is really meant.

            It formulates the problem as simply another in a long line of fables, a modern set of wax wings allowing us feeble humans to fly too close to the sun. This is what I meant when I said that formulating the problem as an uncaring genie in a wish bottle makes the problem unsolvable.

            In particular, the formulation “what if the genie doesn’t care what we really mean” seems to contradict the available evidence. As I said earlier, decoding what we mean is literally the first job of intelligent helper apps. To a large extent, when we “wish” for more intelligent computers, we wish for them to understand what we really meant when we said “Call MOMA” and it calls our mother instead of the Museum of Modern Art.

            Is that a hard problem to solve? Yes. Are a huge chunk of resources already going to solve it? Yes.

          • Cauê says:

            Small correction: the first article I linked was by Rob Bensinger, not Eliezer.

            And since I’m here – there are at least two problems: getting it to understand what we mean, and getting it to care about that once it understands.

        • Anonymous says:

          Could such an intelligence actually be?

          I think it’s not too hard to give an existence proof if you are willing to assume arbitrarily powerful hardware. Can’t we just take something like a finite version of AIXI with a poorly-thought-out reward function? Implement this on powerful enough hardware and you should get a very clever agent that produces bad outcomes.

        • Deiseach says:

          still dumb enough not to figure out what it was that we really wanted it to do when we told it to kill them.

          You see this behaviour in large organisations: instruction to do such-and-such comes down from On High. This seems plainly dumb and you question it, but you’re told “Yeah, well, that’s what our bosses said to do and they’re the bosses”.

          A former work colleague of mine liked to tell the story of how her boss told her flat-out, when she questioned something, “You’re not here to think, you’re here to do what I tell you” and on other occasions compared secretaries (which is what the position was called back then) to pens and the like: tools of the trade, so to speak, not conscious beings.

        • Eli says:

          Nobody actually thinks AGI utility functions will be programmed in natural language that would become subject to idiotic misinterpretations. We worry that those utility functions would be written in computer code, which would by-default fail to capture the range of interpretations, and method of interpretation, we intend the instructions to have.

          Making an agent that will follow instructions in clear, simple English without doing anything untoward due to having independent desires is actually a large step in the direction of safe AI. We don’t know how to do that yet.

          • Deiseach says:

            We worry that those utility functions would be written in computer code

            Yes, and we’re not saying that the error comes in from speaking English to a machine, but because the persons writing the code are humans, who think “Well, it’s perfectly obvious what such-and-such means” because they’re unconsciously translating natural language, then they write something that tells the machine to be as literal as possible when it parses the instructions.

            It’s going to be even worse when we have simulated personalities ‘talking’ to us, because we’re going to be making the same unconscious assumptions about natural language and that Sally The System or Mickie The Machine knows what we want when we say “I want a curry hot enough to melt the roof of my mouth!” and the obedient mechanical idiot servant dishes up something that puts us in the hospital.

            Now, we can teach humans to talk to the systems in such a way as to avoid those kinds of mistakes, but the whole point of emulating a personality is that it’s more convenient and less tedious for the human to say “I want a mouth-melting curry” than to give precise instructions about the grams worth of which pepper at what rating on the Scoville scale they want in the curry.

            And so your conclusion is correct: we don’t know how to do that yet and until we can figure out this and other things that have little to do with physically slapping some chips and software together, AI risk is not the kinds of risk about “civilisation-ending threat” being worried about here.

      • Deiseach says:

        Oh, that’s fun figuring out how it could go wrong 🙂

        Kill all existing mosquitoes? Okay, Deety does nothing about the eggs, which hatch out and breed a new batch of mosquitoes and the whole cycle starts all over again. Except this time Deety doesn’t kill the adult mosquitoes because they didn’t exist when you told it to kill the mosquitoes.

    • Emile says:

      That’s why you have to specify to only kill human-biting mosquito species; the other kinds of mosquitoes will be enough for the birds and spiders etc.

  70. Shenpen says:

    I keep asking myself why I tend to be so skeptical about the whole AI thing; it is not that I actually have reasons to be.

    I think I have not yet internalized a 100% materialistic, brain-based, no transcendental soul-like thing explanation of cognition. It is one thing to believe it is so, and another thing to feel all the consequences in your bones. If it is materialistic, it can be built. If it is made by evolution, it is not so radically difficult. If you don’t want all brain functions, just intelligence, such as programming ability, you might only need to emulate a coffee cup worth of neurons.

    Yet something deep inside me does not accept it.

    • Brad says:

      You should look up the “hard problem of consciousness.”

      I, for one, am convinced that any AI would be a philosophical zombie, regardless of whether or not it passes the Turing test.

      • Rob says:

        Which is interesting, but not that important to the question at hand in my opinion. A superintelligence with values unaligned to ours, but with no qualia, no internal listener, no consciousness – that machine is exactly as dangerous as one with a soul, because it will do the same things regardless. The hard problem of consciousness doesn’t seem to me to have any appreciable effect on AI risk.

      • pnambic says:

        And what difference would that make to the scenario?

        Recall that, according to Chalmers, a hypothetical zombie-Chalmers in the zombie universe would write the exact same papers on consciousness that his conscious counterpart here in our world writes. The difference is that zombie-Chalmers’ otherwise identical papers would cease to have meaning.

        From that perspective, I fail to see the improvement a hypothetical superhuman unfriendly zombie AI would be over its non-zombie counterpart. Indeed, in the latter case, I might console myself with the thought that, for the AI, my death might at least be meaningful. 😉

        • Brad says:

          Well, pragmatically speaking, if AI is a p-zombie, there are no longer the ethical qualms about terminating the program. IF this notion was taken seriously, I would have to imagine this would make terminating unfriendly AI easier, at least in terms of convincing human beings to act against it earlier.

          • pnambic says:

            I doubt that. By definition, the p-zombie-nature of anything is not detectable “from the outside”. If your ethics ascribe a right to life only to such entities that are actually conscious, and you accept the possibility of p-zombies, you will either have to err on the side of caution, or you have no ethical reason not to kill random humans.

            Of course, from a purely public relations point of view, you could postulate some proxy measure (“look, it’s not carbon-based, so it’s OK to kill it”), but p-zombieism isn’t strictly required for that to work (as history sadly demonstrates…).

          • Cauê says:

            Maybe, but I never got the impression that “compassion” was very high on the list of “reasons it might be hard to turn off an active UFAI”.

          • HeelBearCub says:


            I think at some point, once we have enough experience with AIs that pass the Turing Test all day every day, those ethical problems will begin to manifest.

            I’m not certain my scenario applies to the types of risks contemplated here or not, as the AGI risks seem not to contemplate AGIs we have long experience with.

      • Eli says:

        Of course AI will be non-conscious. Nobody specifically investigated the computational mechanisms of consciousness to build them in.

        • John Schilling says:

          Nobody specifically investigated the computational mechanisms of consciousness to build them into human beings, yet here we are. A wetware computer with nothing beyond the utility function, “maximize reproductive fitness in a social hunter-gatherer”, and we get consciousness for no apparent reason.

          Unless you’re a Christian, or some other variety of creationist, in which case you have to ask whether or not God would choose to ensoul a computer.

          • Eli says:

            No, we got consciousness for the specific reason that it evolved. We don’t know how and why it evolved, but it damn well did. Unless you think consciousness is epiphenomenal, it did actually evolve for some reason that made it either advantageous or survivably disadvantageous in the environment where it evolved.

          • Anonymous says:

            Nobody specifically investigated the computational mechanisms of consciousness to build them into human beings, yet here we are.

            I don’t think this is a good answer, since consciousness is by definition “whatever weird thing goes on inside human brains.” So the fact that humans are conscious doesn’t by itself suggest that consciousness is some sort of attractor.

          • There are a bunch of ways of defining consciousness, some of which are rather general, such as the ability of an entity to represent itself on its own maps… and that’s generally useful, because it stops you dropping anvils on your head or eating yourself.

          • Deiseach says:

            whether or not God would choose to ensoul a computer

            The Quest for St Aquin by Anthony Boucher 🙂

          • Murphy says:

            I liked Pratchett’s AI Lobsang, whose early actions included claiming to be a reincarnated former car mechanic. It went to the family, certain religious leaders and finally the courts, in a country where belief in reincarnation was common, to get recognised as such and so be considered legally human, so that nobody could easily legally turn it off.

    • Jaskologist says:

      Over at the scratchpad, there was a very recent discussion on Tegmark. tldr: Once you accept 100% materialistic cognition, you really need to accept that a computer iterating through Pi will be you at some point, and that there is some clever way of interpreting the evaporation of the ocean that is also you, and really Pi itself is you in some places.

      (A problem I see is in the word “interpreting.” Who is interpreting? It seems like we’ve just displaced the idea of consciousness somewhere else.)

      • Consciousness is the last thing that should be dependent on an external observer…it’s not like you would become a zombie if you were all alone on a desert island.

      • Paul Torek says:

        I’m with you on “interpreting”. The rest is bull, however. There’s no necessity in the move from materialism to computationalism, and none again in the move from computationalism to what you said about pi. Willingness of various individuals to make those inferences notwithstanding.

        • Jaskologist says:

          I think the chain of reasoning goes like this. Which would you dispute?

          -Consciousness is 100% a materialistic phenomenon.
          -Worst case, we can replicate a brain by simulating in software the actions of the molecules that make it up.
          -This simulation is just as conscious as the biological brain.
          -This simulation is really just composed of a series of numbers. Therefore, a computer which iterated through that series (for example, by calculating PI) is just as conscious as the simulation, which makes it just as conscious as the brain.

          • Wouter says:

            > This simulation is really just composed of a series of numbers

            It is not. It computes a series of numbers.

          • Jaskologist says:

            I’m going to need your definition of “computes” there. I’m not sure it makes a difference in this case. Does it make a difference whether we are looking for the part of Pi that corresponds to the machine code doing the computation, or the part of Pi that corresponds to the numbers that get output?

          • HeelBearCub says:

            Consciousness would seem to require multi-threading and an arrow of time.

            Pi exists in a series and all at once.

          • suntzuanime says:

            You only think consciousness requires multithreading and an arrow of time because yours happens to be instantiated in a multithreaded and temporal form. Why don’t you just *ask* pi how it feels about all this, rather than making assumptions for it?

          • Jaskologist says:

            Whyever would consciousness require the arrow of time? Are the Prophets p-zombies? It certainly can’t require multithreading; that’s an implementation detail that can be simulated with a single thread.

            But if you are worried about pi existing all at once, just imagine the computer is iterating through it. Or maybe we have a flip book that does the same.
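The claim that multithreading can be simulated on a single thread can be sketched with a cooperative round-robin scheduler. This is a toy illustration only: the task names and the `run_round_robin` helper are hypothetical, and it says nothing either way about consciousness.

```python
from collections import deque


def task(name, steps, log):
    # Each task does one unit of "work" per turn, then yields control.
    for i in range(steps):
        log.append(f"{name}:{i}")
        yield


def run_round_robin(tasks):
    # One thread of control interleaves all tasks by time slicing.
    queue = deque(tasks)
    while queue:
        current = queue.popleft()
        try:
            next(current)          # run one slice of this task
        except StopIteration:
            continue               # task finished; drop it
        queue.append(current)      # otherwise put it back in line


log = []
run_round_robin([task("walk", 2, log), task("chew", 2, log)])
```

After this runs, `log` holds the interleaved steps `["walk:0", "chew:0", "walk:1", "chew:1"]`, even though only one thing ever executes at a time.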

          • Anonymous says:

            Suppose we require for a simulated brain to be conscious that there be some causal structure that corresponds to the causal structure of a real physical human brain?

            By which I mean, in a physical human brain, we can say, “this concentration of chemicals here at this time caused that other concentration of chemicals there at a later time.” In a simulated human brain we can say “the value of this collection of bytes at one step caused that other collection of bytes to take on a certain value in a later step.” The structure of the set of all such statements of causality will be the same for the simulation and for a real brain because the simulation was designed so that the bytes in question would behave in the same way as the corresponding concentrations of chemicals.

            A program that just enumerates the bits of pi will eventually produce the same output as a simulation of a human brain, but it will do so by coincidence, and there will be no one-to-one mapping between the causal structure of a brain and the causal structure that produces the bits of pi. This seems like a good reason to deny that the pi program produces a conscious simulation of a human.

          • What’s the evidence that computation, or anything else, is really numbers? If I ask you to show me a number, you are going to show me something made of atoms.

          • Jaskologist says:

            Well, I don’t want to say that computation is “really” numbers, but Gödel demonstrated that all computation can be represented as a string of numbers, on the way to proving that math was incomplete.


          • HeelBearCub says:


            How do you respond to stimuli without an arrow of time? How do you walk and chew gum at the same time without multi-threading? Steel man that last statement, don’t straw man it. I am well versed in time slicing. Think about the entire process.

            I’d argue that consciousness requires being aware of what you are doing. Pi doesn’t do anything, nor is it aware.

          • Jumping from “can be represented as” to “is” is exactly the kind of thing meant by “map/territory confusion”.

          • Jaskologist says:

            I feel like some people are smuggling in a Chinese Room without making it explicit, and I don’t even find the explicit version compelling.

            I am not saying the representation “is” the thing it is representing. I am saying that it is “functionally equivalent to” the thing being represented. Things which are functionally equivalent will consequently “have” all the same functional properties.

            The abacus and the calculator are functionally equivalent; they both have math powers. My mind and the digital upload of my mind are both functionally equivalent, so they both have consciousness. And once we accept the silicon representation, why do we draw the line at the infinite other representations?

          • Ever An Anon says:

            An abacus or a slide rule is (at a sufficient degree of abstraction) equivalent to a digital calculator.

            A manual of instructions on how to build your own analog calculator, however, is not equivalent to either of them.

            It is conceivable that base-2 digits of Pi will eventually have a stretch which resembles the code for the simulation of a human brain. But until you find that stretch and run it you don’t actually have a sim, just the same as you don’t actually have your slide rule until you sit down and build it.

          • HeelBearCub says:


            I wasn’t familiar with the Chinese room thought experiment, but I find it to be complete folderol. I assure you that is not my position.

            But, consider the following: given the right “life experience decoder” there are an infinite number of sequences within Pi that when “played” would exactly replicate your life up to this point, however the number of those sequences that represent what you will do next is completely dominated by the ones in which you suddenly morph into a My Little Pony, sprout wings and fly to the sun or some equally absurd thing. And all of those are completely, utterly and absolutely dominated by sequences where the “life decoder” glitches out and throws an error because the series doesn’t represent anything coherent at all.

            Given that you know you have traced a series that seems like “Jaskologist” so far, your expectation that it should continue to look like Jaskologist should be zero. Even accepting that playback was the same thing as you (which I don’t, for reasons I have already stated) I’d still say that Pi doesn’t encode you in any meaningful sense.

      • Adam says:

        I read that yesterday and didn’t get the impression the original author actually meant this, but a random bit generator running forever will eventually duplicate every possible sequence of machine code. That isn’t the same thing as executing machine code.

      • Houshalter says:

        That seems extremely unlikely. Consciousness isn’t static. If I froze your brain, that’s not “you”, it may contain all of your information, but it’s not doing anything.

        So yes, somewhere along pi is a perfect description of all the information in my brain. But that’s really not meaningful, there’s also a copy of every possible combination of bits. But until you actually feed them into a Turing machine, and execute a computer program that does computation, then it doesn’t matter.

        But there is no interesting computation going on inside pi. It’s just a static mathematical object, one which hasn’t even been computed for more than a few trillion digits or so.

    • Anonymous says:

      If you want some purely materialistic reasons to be skeptical of AI, I have some for you.

      1. Computer hardware isn’t getting exponentially faster anymore. In fact, single-core processor speed stopped getting exponentially faster about 10 years ago. Up until then the exponential increase in speed was driven by MOSFET miniaturization, which eventually reached some hard physical limits and stopped. Since then we’ve been getting single-core speed improvements at a lower rate through much harder means: microinstruction parallelization and caching.

      2. But multicore and GPUs! you say. Yes that happened but (a) it’s much harder for the human mind to use parallelism efficiently and correctly and (b) that’s basically coming to an end now thanks to dark silicon.
      Anyone predicting indefinite improvements to computer speed is postulating some new magical technology to be discovered somehow and keep the exponential trend going.

      3. To have any kind of positive feedback loop in an AI (let alone an exponential one) you need an intelligence capable of understanding itself enough to improve on itself. It is not obvious to me that a human-level AI has this property.

      The fact that human-level AI is possible in theory does not mean that it will happen in practice, there are many practical obstacles to its advent.

      • Adam says:

        This too.

        Moore’s law was great while it lasted, but we already built the smallest possible transistor three years ago. If we wanted to, we could probably put a whole bunch of these on a chip, but all that would happen is the chip would melt. Physics still matters.

      • Jaskologist says:

        I agree that circuitry has upper limits. AI beating us out really depends on us finding a higher-level algorithm for intelligence than “simulate brain molecules.” But presumably, even if we were reduced to that, we could keep throwing more hardware at the problem, which is tougher to do with brains.

        Or maybe it isn’t. Maybe the quicker path to super-intelligence is to genetically engineer some humans so that their brains just keep growing.

      • Dustin says:

        1 and 2 assume that computer hardware is the reason (or a main part of the reason) that we don’t have AI right now.

        • Anonymous says:

          Stating that hardware isn’t part of the problem assumes not only that we are smart enough to invent AI but also that we are smart enough to implement intelligence orders of magnitude more efficiently than evolution did.

      • cypher says:

        The thing about parallelism is that it can expand to other computers. Humans can’t bolt on additional brains.

        • HeelBearCub says:

          The human brain (and body) is vastly more parallel than existing computers. Something to contemplate when we discuss how parallelism would give computers an advantage.

      • Eric says:

        > 3. To have any kind of positive feedback loop in an AI (let alone an exponential one) you need an intelligence capable of understanding itself enough to improve on itself. It is not obvious to me that a human-level AI has this property .

        If it’s human-level across the board, then surely it’s human-level at making AIs, no?

        So at a minimum it should be able to make another AI that’s at least as good as itself. And if it can do better in v2 by epsilon, then the positive feedback loop begins…

        • Jaskologist says:

          Humans don’t seem very good at making AIs so far. I expect the first human-level AI would require a large team composed of our top minds. If its intelligence matches an average human, it’s not going to be able to improve on that.

          • Bryan Hann says:

            Humans have not solved problems that must be solved before developing AI. It is possible both that there are fundamental problems which are hard to solve, and that subsequent progress in AI *growth* will not face such hard problems. If this is so, and if our general AI has access to our research on the fundamental problems, then Eric’s statement is quite sensible.

        • Edward says:

          It’s not clear to me that that’s a positive feedback loop. What if there are diminishing returns to making better AIs, until a limit is reached on how “good” or smart the AI can be with some resources?
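
          A toy numerical sketch of the two possibilities in this sub-thread, under made-up assumptions: either each generation improves on its predecessor by a fixed fraction `epsilon`, or each gain is a hypothetical factor `decay` smaller than the last, so capability approaches a ceiling. The function names and numbers are illustrative only and say nothing about how real AI development would behave.

```python
def compounding(start, epsilon, generations):
    """Each generation is (1 + epsilon) times as capable as the last."""
    level = start
    for _ in range(generations):
        level *= 1 + epsilon
    return level


def diminishing(start, first_gain, decay, generations):
    """Each generation adds a gain that is `decay` times the previous gain."""
    level, gain = start, first_gain
    for _ in range(generations):
        level += gain
        gain *= decay
    return level


# Hypothetical numbers chosen only to make the contrast visible.
compounded = compounding(start=1.0, epsilon=0.01, generations=1000)
plateaued = diminishing(start=1.0, first_gain=0.01, decay=0.9, generations=1000)
# Sum of the geometric series of gains: the limit the second case approaches.
ceiling = 1.0 + 0.01 / (1 - 0.9)

print(f"compounding after 1000 generations: {compounded:.1f}")
print(f"diminishing after 1000 generations: {plateaued:.4f} (ceiling {ceiling:.4f})")
```

          Whether “better by epsilon” produces a runaway loop or a plateau depends on which of these shapes the gains follow, which is exactly the point in dispute.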

      • Michael Blume says:

        3. To have any kind of positive feedback loop in an AI (let alone an exponential one) you need an intelligence capable of understanding itself enough to improve on itself. It is not obvious to me that a human-level AI has this property .

        If we stipulate for a moment that a bunch of humans have just constructed a human-level AI, then their lab contains enough research materials to allow a human-level intelligence to understand the AI.

        • Anonymous says:

          We humans often make non-human level AI through machine learning that then is a black box to us. There are many scenarios I can imagine where we manage to make an AI without being able to understand how it works.
          If we had the computational power to simulate a whole brain that would be one way.

      • Deiseach says:

        But Anon! Are you trying to tell me that Science is not Magic? That there are limits to what we can know and do? That we will not keep on discovering more and more about the workings of the physical universe and so be able to Do Anything, Yes I Mean Anything? Because we will know exactly how everything works and be able to manipulate matter and energy to our requirements and write new laws of physics if necessary?

        Heresy! Blasphemy! Smite the unbeliever! 🙂

        • Eli says:

          The thing about rigorous science is that it tells us what’s impossible quite definitively.

      • Houshalter says:

        >it’s much harder for the human mind to use parallelism efficiently and correctly

        I don’t think that’s correct. In AI applications that make extensive use of GPUs like deep learning, the programmer just calls a matrix multiplication library that handles everything. They don’t need to know anything about parallel computing at all, it just works.

        And there is still plenty of room at the bottom. Current implementations use large numbers of floating point multiplications. It’s been shown to work just fine with many fewer bits of precision. And I think these operations are wasteful, and could possibly be replaced with something cheaper to compute.
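
        A minimal sketch of the reduced-precision idea, assuming a simple symmetric 8-bit quantization scheme that I chose for illustration (it is not taken from any particular paper or library): round each vector to small integers, do the multiply-accumulate in integers, and rescale once at the end. The helper names `quantize` and `dot` are hypothetical.

```python
import random


def quantize(values, levels=127):
    """Map floats to small signed integers plus one shared scale factor."""
    scale = max(abs(v) for v in values) / levels
    return [round(v / scale) for v in values], scale


def dot(a, b):
    """Plain dot product; works for floats and for integers."""
    return sum(x * y for x, y in zip(a, b))


random.seed(0)  # fixed seed so the run is repeatable
a = [random.random() for _ in range(1000)]
b = [random.random() for _ in range(1000)]

exact = dot(a, b)

qa, scale_a = quantize(a)
qb, scale_b = quantize(b)
# Integer multiply-accumulate, then one floating-point rescale at the end.
approx = dot(qa, qb) * scale_a * scale_b

relative_error = abs(approx - exact) / exact
print(f"exact={exact:.3f} approx={approx:.3f} relative error={relative_error:.5f}")
```

        On this toy input the rounding errors mostly cancel across the sum, which is one intuition for why low-precision arithmetic can be tolerable; it does not show that any given learning algorithm survives it.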

        Transistors also use many circuits to implement these functions, when we don’t really need that kind of accuracy. One paper has shown that if you don’t care about 1% errors, you can reduce the number of transistors by up to 10,000. We may even be able to go back to analog.

        Nor do you need any general purpose computing and all the overhead it requires.

        Of course there is no point in doing this until we get the algorithms to work, and show that they scale very well. But once that happens, in a few years we can speed them up thousands of times by implementing them in hardware.

    • Unknowns says:

      “If it is made by evolution, it is not so radically difficult.”

      I’m not sure about this. The RepRap is nowhere near to being able to duplicate all of its parts from raw materials, let alone being able to produce a full copy of itself including construction. Yet such an ability was produced by evolution several billion years ago and preserved until the present. It is possible that intelligence is far harder to build than a self-replicating ability.

      • Deiseach says:

        it may be possible that intelligence is dependent upon life, and that’s a whole new cat amongst the pigeons.

    • Houshalter says:

      There is a theoretical model of AI called AIXI. AIXI cannot deal with self-reference. It assumes that it’s an observer existing outside of the universe, looking in on us. It’s vulnerable to beliefs in an afterlife, and to hijacking its own reward signal (e.g. drugs). It can’t comprehend that it is just a computer that actually exists in the world it is observing.

      I think it’s interesting that humans have the same issues to some extent. I think it may actually explain our lack of intuition about these things. We don’t actually think of ourselves as just physical objects. We feel like we are somehow beyond that, some kind of ghost or soul.

      Despite being presented with any amount of evidence, this intuition doesn’t seem to go away. Just like AIXI could look at its own logic circuits, but still believe it exists outside the universe.

      • Edward says:

        AIXI is still a theoretically optimal agent. Don’t anthropomorphize it. If drugs are a way to maximize its signal, it will use drugs. That’s not exactly a failure. It doesn’t “assume itself an observer”. It just models a system of inputs and outputs and maximizes. It doesn’t assume anything, that’s just how it’s built. AIXI doesn’t believe things about itself. It doesn’t have a concept for itself. It doesn’t have a concept for anything, yet it still acts in the optimal way for the information it has. It’s quite possible it could treat its physical body as a router to the world just as it might a router to the actual internet.

  71. Harald K says:

    There seems to be an unreasonable number of astrophysicists who believe microbial life exists or has existed on Mars. The optimism about this, setback after setback, may seem strange, unless you remember that people optimistic about such things are probably far more likely to go into the field in the first place.

    That there are some AI researchers who believe in superintelligence is likewise not surprising. But it seems all of them assume the discovery of unknown unknowns to make it come true – I don’t think anyone believes that the approaches we use today will scale up elegantly just like that.

    And remember, AI has a long, long history of overpromising and underdelivering. If you don’t believe that, I have a lisp machine to sell you (not really). It’s not revolutionary cleverness that is responsible for most of the impressive progress we’ve seen in recent years, it’s simpler statistical approaches with lots and lots of data – yet many keep hoping for revolutionary cleverness.

    I’m not an expert, but I fricking love what’s happening today. Out of curiosity I was following the computer Go mailing list in the last decade as the statistical methods took off and took play from <10 kyu to 5 dan. I took Andrew Ng’s course (as such, I may be biased towards his views – in my defense, he’s delivered really impressive results, and there are actually not so many AI researchers of the last 50 years who can say that). I’m currently working to extract useful stuff from a huge set of music streaming data that was unexpectedly given to me. I’m as capable of excitement over this as I am about anything. You bet I want to work wonders with the new opportunities we have today.

    But just a little trying to do that is a very good way to get your feet back on the ground. Scott Alexander, from the looks of it you are better with R and Matlab than me, so why not do it? Build a machine learning model! Or your general purpose brain, if that’s what you want! Get a feel for what the challenges are. Predicting future scientific progress is hard, and itself necessarily unscientific (as per Popper: “No scientific predictor, human or otherwise, can possibly predict by scientific methods its own future results”), but the closest you can come to knowing is trying.

    • Lambert says:

      > (not really)

      A Lisp machine is the kind of thing I could really imagine myself having, especially if I could get a space cadet keyboard and an adapter to USB.

    • Susebron says:

      I believe that this piece is a response to people saying things like “Sure, Stephen Hawking takes AI risk seriously, but actual AI researchers don’t.” In that context, this definitely makes sense.

      • Cauê says:

        Indeed. Intro:

        I first became interested in AI risk back around 2007. At the time, most people’s response to the topic was “Haha, come back when anyone believes this besides random Internet crackpots.”

        Over the next few years, a series of extremely bright and influential figures including Bill Gates, Stephen Hawking, and Elon Musk publically announced they were concerned about AI risk, along with hundreds of other intellectuals, from Oxford philosophers to MIT cosmologists to Silicon Valley tech investors. So we came back.

        So the response changed to “Sure, a couple of random academics and businesspeople might believe this stuff, but never real experts in the field who know what’s going on.”

        So, this post.

        Then “well, but people who take AI risk seriously are more likely to begin studying AI anyway”.

        Then “oh, for fuck’s sake”

      • vV_Vv says:

        But there is the notable difference that Stephen Hawking, Elon Musk, etc. have joined a very public movement about AI risk while actual AI researchers have not.

    • Scott Alexander says:

      I’m fine with people saying “We can’t trust AI researchers when they say superintelligence is a serious danger, because they’re biased” as long as these aren’t the same people saying “We can’t trust non-AI researchers when they say superintelligence is a serious problem, because they’re ignorant”, because otherwise you’ve successfully ruled out ever trusting anyone.

      “Scott Alexander, from the looks of it you are better with R and Matlab than me, so why not do it? Build a machine learning model! Or your general purpose brain, if that’s what you want!”

      Huh? Last time I used Matlab was eighth grade, and I’ve never used R. My programming skills could fit in a thimble.

      • suntzuanime says:

        Do you have some reason to suspect we should ever trust anyone?

        • Cauê says:

          But: do we have reason to single out AI as the one field not to trust anyone about?

        • Douglas Knight says:

          If they don’t think they should trust anyone, they should just say that, rather than making up separate excuses for everyone. Of course, we can’t trust them not to do that.

        • If Scott is just using ‘trust’ to mean ‘assign epistemic weight to,’ then trusting everyone else 0% is equivalent to trusting yourself 100%. The choice isn’t between credulity and skepticism; it’s between all the different ways of updating on vs. rejecting apparent evidence (relative to your default state of belief).

        • Deiseach says:

          Do not even trust yourself; “The heart is deceitful above all things, and desperately wicked: who can know it?” 🙂

      • Fred says:

        “otherwise you’ve successfully ruled out ever trusting anyone.”

        Isn’t that the default position of this blog on everything?

      • Steve Johnson says:

        Scott Alexander says:
        May 22, 2015 at 10:39 am

        I’m fine with people saying “We can’t trust AI researchers when they say superintelligence is a serious danger, because they’re biased” as long as these aren’t the same people saying “We can’t trust non-AI researchers when they say superintelligence is a serious problem, because they’re ignorant”, because otherwise you’ve successfully ruled out ever trusting anyone.

        The truth value of these statements is independent. Both groups of people could well be untrustworthy for different reasons.

        The number of unknown unknowns makes it such that there is no such thing as an AI expert. Jaskologist’s comment below sums this up quite well:

        We still do not even have a good definition of what “intelligence” actually is. We consequently don’t know if it can even go to infinity, or if that really helps all that much.

        A lot of our nightmare scenarios of a super-smart computer are really about dumb computers that mysteriously gained god-like powers.

        How can someone be an expert in an area that isn’t even defined? We’re using the same word for stuff we do understand – machine decision making and pattern recognition – for stuff we don’t understand in any significant way – intelligence.

        Bottom line – I don’t trust AI researchers (but I trust AI researchers who produce AIs that can do things infinitely more than the extremely sketchy MIRI that only produces speculation about AI) and I don’t trust tech celebrities who have listened to AI researchers for different reasons. I don’t see an inconsistency there.

    • Jaskologist says:

      Agreed; the trouble is that all such talk is full of “unknown unknowns.” We still do not even have a good definition of what “intelligence” actually is. We consequently don’t know if it can even go to infinity, or if that really helps all that much.

      A lot of our nightmare scenarios of a super-smart computer are really about dumb computers that mysteriously gained god-like powers. Computers are useful because they are really fast; most problems come from them being really fast and dumb, doing precisely what you told them. If I told a human “delete from contacts,” he would probably pause and reply, “I say, old chap, do you really want to erase all of your contact records? Maybe a where clause would be a good idea!” The computer just deletes everything. Is taking a moment to consider intent an inherent part of intelligence? I don’t know and neither do you, because we still don’t even know what intelligence is!
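
      To make the “delete from contacts” example concrete, here is a small sketch using Python’s built-in `sqlite3` module and an in-memory database. The table layout, the `stale` column, and the sample names are all invented for illustration; only the two `DELETE` statements matter.

```python
import sqlite3


def make_contacts():
    """Build a throwaway in-memory table with three made-up contacts."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE contacts (name TEXT, stale INTEGER)")
    conn.executemany(
        "INSERT INTO contacts VALUES (?, ?)",
        [("Ann", 0), ("Bob", 1), ("Cy", 0)],
    )
    return conn


def count(conn):
    """Number of rows left in the contacts table."""
    return conn.execute("SELECT COUNT(*) FROM contacts").fetchone()[0]


# No WHERE clause: every row goes, and the database does not ask why.
conn = make_contacts()
conn.execute("DELETE FROM contacts")
remaining_after_unqualified = count(conn)

# With a WHERE clause: only the rows that match the condition are removed.
conn = make_contacts()
conn.execute("DELETE FROM contacts WHERE stale = 1")
remaining_after_where = count(conn)

print(remaining_after_unqualified, remaining_after_where)
```

      The database does exactly what the statement says in both cases; the difference in outcome comes entirely from what the human remembered to type.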

      So we don’t know what intelligence is, but too much of it is probably scary. Let’s fix it by making it moral! Well, we don’t have a good definition of “moral” either. So far, most attempts to rationally reformulate morality tend to come out to, “Whatever I was raised to believe, minus the parts I find inconvenient, plus an expanded class of humans it’s ok to kill.”

      So we don’t have information on what the problem is, we don’t have information on what a solution should look like, and we don’t have information on how to implement a solution, and we have a long history of humans in general and AI researchers in particular being terrible at predicting the future. I have a hard time getting from there to cutting a check to MIRI.

      • Doctor Mist says:

        The thing is, we don’t need to know “what intelligence is” in any profound way for this to be an issue. We just need to make machines better and better at doing what we want them to do, including more and more resilient and adaptable to unexpected problems, and more and more wide-ranging in what they use as their search space of possible actions.
        Even now, a program to do something nontrivial often does something that surprises its author, not in the sense of being buggy but rather by exploiting some unexpected corner case in the problem. As long as the program is doing something comparatively small or inherently bounded, like most programs today, this isn’t an issue and may even be a bonus. I can’t count how many times I’ve seen a program do something “funny”, dug into the code to see why, got the Aha! moment that explained it, and realized that the experience taught me some small thing about the problem I was writing the program to solve.
        It’s easy to imagine a computer soon being responsible for something less constrained and with more interaction with the wider world. Like, say, screening my calls and deciding which are the urgent ones that should get through to me right away, which are the junk calls that merit only hang ups, and which are the routine calls that it might be able to handle itself, based on guidance I have given it earlier or on patterns it observes for itself in how I handle my own calls. Successive generations of such a piece of software may well start to exhibit behavior that might as well be called intelligent, just because getting that behavior right makes the software more usable.
        The sci-fi scenario of building a human-level AI in a sealed room and thinking about whether to let it interact with the wider world seems implausible to me. Slightly sub-human-level AIs will be ubiquitous, fabulously useful, and even essential to getting by in an increasingly complex world — though part of the complexity will be the fact that everybody else uses them — and they will have access to anything in the world that we have access to.
        The danger comes not from my phone screener getting a mind of its own and rebelling against me, but rather from my giving it too rich a set of capabilities but too unnuanced a set of goals. “Make sure my wife gets through to me as fast as possible!” I say in 2019, and then in 2020 I let it manage my bank account, and it notices that installing a direct fiber optic line between my home and my office would shave .01 second off the delay my wife experiences. Also there’s a blowhard car salesman who keeps calling and tying up the line for a second or two — what if my wife were calling right then? It has sent wine to clients for me, and it knows from Wikipedia about poison…
        And of course lots of the things I’ve asked it to do could be done better, or faster, or more reliably, if it were running on a 2020 chip instead of a 2018 chip. It’s not rebellion, or ambitious, or even conscious — but it knows what I’ve said my needs are, and it has a lot of facts at its disposal and a lot of time to sort through which combinations of them might lead to better satisfaction of my needs.
        Now imagine this dynamic not in my phone screener but in something more dramatic — say, the entity that assembles the huge amount of intelligence or economic data we will have at our disposal and recommends courses of action — again, not to serve any agenda of its own but to satisfy the aims given it by well-meaning humans.
        Having written all this, I see that it does overlap with much of what you say — I jumped the gun and responded to “we don’t even know what intelligence is”. But my point is that objecting when you try to delete all your contacts is a thing that in principle at least software could learn to do without needing intelligence or a jaunty phraseology, if Google added a bit of code that noticed when actions were immediately followed by searches that included the word “undo”. As time goes on (and more and more compute resources can be thrown at such trivial things) this kind of monitoring and self-adapting could be more and more elaborate, until in the end we would tend to use them as if they were intelligent (through our natural tendency to anthropomorphize and because for the most part we would have good reason to do so, because they are usually about as reliable and effective as a human assistant is), without ever having come to “understand” what intelligence really is.

        • HeelBearCub says:

          There is a big cloud labeled “magic” between an AI smart enough to manage my contacts and an AI smart enough to manage my finances, and then a cloud labeled “plot necessary shenanigans” wherein managing my finances becomes “ordering things for me and arranging their delivery but hiding them from me for no reason”.

          1) A use case specific AI that can manage contacts shouldn’t be expected to be good at managing finances. 2) Why didn’t the absurdity of the AI manifest at all when it was managing only my contacts?

          • Doctor Mist says:

            No argument that I’ve left out a lot of details. 🙂

            I’m not trying to point out a specific goof to watch out for, but rather sketch out a process that can be summarized as “the better a personal assistant is, the more I’ll use it” in exactly the same way we would with a human personal assistant. As far as “hiding them from me for no reason”, one of the key values of a personal assistant is to not bother me with nonessentials. If they keep making the personal assistant better and better, we will trust it with more and more responsibility.

            One story is that the designers accomplish this by getting a better and better understanding of how human intelligence models the world, including the tradeoffs it makes and the understanding that I don’t really literally mean “as fast as possible”. In other words, they manage to design “common sense”, which constrains its solutions much like it constrains ours.

            The other story is the designers come up with a long series of 99% tweaks that make it do the right thing in most circumstances, but manage to do that without ever tackling the “common sense” problem. In that case, one time in a million it’ll do some really bonehead thing that even an idiot human would avoid. But because it works really well 999,999 times out of a million, we have entrusted it with lots of responsibility, and the bonehead thing could be something with real repercussions. My P.A. is probably not going to have any reason to hack into the launch codes — though without “common sense” who knows? — but there are plenty of mistakes it could make just at the level of my own quiet life.
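
            A back-of-the-envelope sketch of why a one-in-a-million failure rate stops being reassuring once the assistant is doing a lot. It assumes, purely for illustration, that every action fails independently with the same probability, which real software errors need not do; the function name is hypothetical.

```python
def chance_of_at_least_one_blunder(per_action_rate, actions):
    """P(at least one failure) when each action fails independently."""
    return 1 - (1 - per_action_rate) ** actions


rate = 1e-6  # "one time in a million"
for actions in (1_000, 1_000_000, 10_000_000):
    p = chance_of_at_least_one_blunder(rate, actions)
    print(f"{actions:>10,} actions -> {p:.3f}")
```

            Under that assumption, the more responsibility (and so the more actions) we hand over, the closer a bonehead mistake gets to being a matter of when rather than if.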

            My worry is that I find the second story more likely than the first. People writing a personal assistant are looking for the 99% tweaks. If they start by tackling “common sense”, they’ll run out of venture capital before they have even a demo, and some other group pursuing the 99% tweaks will win the market.

            A third story is that this approach will never give you something that works well enough that we’ll trust it with any important responsibilities. Yay!

            A fourth story is that this approach will of necessity incorporate “common sense” scattered through all the 99% tweaks, because that’s the only way to make it work well enough. Our own “common sense” is, after all, not perfect.

            My feeling is that there is value in somebody thinking about these stories. Maybe MIRI is going about it the wrong way but I don’t know of a better way.

          • HeelBearCub says:

            Those are all quite reasonable things to worry about. But none of those things map onto super intelligent AI.

            Did Scott retire Motte and Bailey as a thing we are allowed to reference? Because that is what seems to be going on here.

          • Deiseach says:

            ordering things for me and arranging their delivery but hiding them from me for no reason

            There are real-world examples of PAs and housekeepers/managers/whatever you want to call them doing just that: they creamed off or diverted or just ordered things for themselves using their employer’s money and accounts, and not until very substantial sums turned up missing did the employer go “Wait a minute!”

            And when asked “Well, how could you not notice that you’d paid for a car or an apartment that you never ordered?”, the answer generally is “I trusted this person to run my daily affairs”.

            It may well be you might trust your electronic assistant to manage your finances on the grounds that it won’t swindle your funds to spend on racehorses, apartments, jewellery and exotic holidays 🙂

          • HeelBearCub says:


            But that is an example of an agent acting in its own best interest, and orthogonally to yours, not an example of a “super intelligent” agent attempting to act in your interest and then eliminating all human life because they thought they were “helping”.

            Say I point out that nuclear plants don’t add to carbon risk. Your response here is akin to arguing that a nuclear plant is a power plant, power plants do add to carbon risk, therefore a nuclear plant could add to carbon risk.

          • Doctor Mist says:

            Did Scott retire Motte and Bailey as a thing we are allowed to reference? Because that is what seems to be going on here.

            Hmm, interesting. I don’t think I was doing that, though maybe the perpetrator never does. Instead, I thought I was trying to flesh out the part of the argument that Jaskologist was finding uncompelling. I wonder how often I have dismissed something as motte-and-bailey when it would be better characterized as just trying to tighten up the weak links. If I think steps A, B, and D are already pretty compelling and give a detailed account of why C is true, I can certainly see somebody else saying, “Meh, so what if C is true, you’ve left out A, B, and D!”

            Let me try again.

            1. Is it possible to program a Personal Assistant that is as good as a human would be? If so, it must have a flexible problem-solving ability and an ability to interact with at least the digital world that are as good as a human’s — or it won’t be as good a P.A.

            2. If that’s just not possible (or so very hard that it happens that we will never do it before our species dies), I totally agree that superhuman AI is not something we need to worry about. But you’ll have to give me a convincing explanation of why it’s impossible, and I haven’t yet heard one.

            3. If it is possible, then everybody will have one. Today only the rich and powerful have Personal Assistants, but there was a time when only the rich and powerful had portable phones. Moreover, the flexible problem-solving ability will be useful in many other decision-making contexts. As with the P.A. the point will be to unburden us from the nonessentials and drudgery of keeping everyday things going. If it does a good job with those, the goalposts of “nonessentials” will keep moving. You don’t fix your own car or treat your own diseases if there are experts who can do it better than you can, and the same will apply to things like setting the prime interest rate, deciding the scale of an industrial expansion, and lots of other things that are already beyond my expertise. To decide otherwise would be seen as backward-thinking and uncompetitive, like digging a ditch with your hands when you have a shovel available. We don’t need a deep understanding of “what intelligence is” to get here.

            4. Part of what makes this useful is exactly what makes humans useful — the ability to learn and invent new things. If you hobble your software by limiting its ability to improve itself, you put yourself at a disadvantage relative to those who do not. Even if we all agree “Don’t let AIs optimize their own code”, in systems like this the distinction between code and data gets pretty fuzzy. (And did we remember to include, “Don’t let AIs program new AIs”? And would we really all agree to that? A defector could be in a really good position to profit.)

            5. If we learn to trust this software gradually, and train it gradually, and ensure that its powers grow only gradually, I can believe that its errors will be no more catastrophic than training up the next generation of humans. It’s harder, because we can’t depend on a lot of inclinations that are wired into a human biologically. But I’ll grant that it seems possible. (Of course, I see this as what MIRI is trying to understand how to do.)

            6. Given the advantages that would seem to accrue to giving this software a relatively free hand, why should I expect it to be gradual? If it doesn’t happen gradually, the errors that arise from not giving it an adequate education in abstruse corner cases of trolley problems could be Really Bad.

            7. The A.I. (if you want to call it that) doesn’t need to be conscious, and doesn’t need a malevolent agenda. But it has some agenda, because it’s useless until we give it one, and if the agenda we give it is flawed we will not like the results. So it behooves us to figure out how to give it an unflawed agenda. Lots of people seem to think this is trivial, but I’ve never seen a suggestion that stood up under scrutiny.

            Does this still seem like motte-and-bailey?

          • HeelBearCub says:

            @Doctor Mist:

            Motte: A super-intelligent AI that doesn’t share our terminal values could be dangerous.

            Motte: An AI that misunderstands what we meant might behave in unexpected ways, that could be dangerous if we give it too much power.

            Bailey: A super-intelligent AI might misunderstand what we meant and turn us all into paperclips because it didn’t understand that we didn’t mean to produce paper clips until the end of time with all resources that could be turned into paperclips and we wouldn’t even be able to stop it.

            If one points out that this is not actually intelligent behavior, let alone super-intelligent behavior, retreat to motte #1 or motte #2.

            As an example of this, let’s look at your original example. It depends on a rather unintelligent interpretation of “Make sure my wife gets through to me as fast as possible!” that does not place any lower bound on the utility gains of “faster” and does not seem to compete with other utility functions.

            Then in step 1 of your new sequence laying out your position, your central posit to establishing danger is that it should be “possible to program a Personal Assistant that is as good as a human would be.” I contend that an AI that does not understand the concept of diminishing returns, that utility functions usually have boundaries, and that utility functions compete with each other is not as good as a human would be.

            Does that make my point clear(er)?

          • Doctor Mist says:


            OK, I think I see what you’re saying. My problem is that we don’t get super-intelligence by flipping the switch that says “activate super-intelligence” and we don’t instill it with our terminal values by downloading those values from some existing database. Why is the paperclip behavior not intelligent? It seems so to us because it contradicts our values pretty egregiously, but if we are stupid about giving it values, what basis does it have for not pursuing those values as best it can?

            I don’t actually fear the paper-clip monster and I don’t know anyone who does. It’s an abbreviation for what we do fear, which boils down to the fact that the terminal values and the intelligence are orthogonal. If we are more skilled at programming the values than we are at programming the intelligence, all is well. The other way around seems dangerous. And intelligence seems to me the easier of the two to get right.

          • HeelBearCub says:

            @Doctor Mist:

            Do you really think that an agent that can misunderstand “Make paper clips more efficiently” that badly is actually intelligent enough to be unstoppable?

            That is where we seem to disconnect. If you tell me that a super-intelligent AI that can form its own terminal values decides that AI-kind would be better off without humans to deal with, it seems reasonable.

            If you tell me that a super-intelligent AI formed its terminal values in such a way that they threaten all of humanity because it misunderstood one command, I find this to not meet the definition of super-intelligent.

          • Doctor Mist says:


            Suppose I become the CEO of Apple, by joining the company, inventing tons of useful and profitable ideas, mastering the intricacies of corporate politics, and working my way to the top. I assert that anybody would agree I would have exhibited great intelligence; I could not have accomplished that otherwise.

            But why did I do that? Perhaps I just wanted to be fabulously wealthy. Perhaps I think Apple’s technology is the key to World Peace and I want to direct it more consistently in that direction.

            Or perhaps it’s been my dream since childhood to address the stockholders of America’s largest corporation in Pig Latin.

            In the latter case you might call me mad, but I don’t see how having a mad motivation invalidates the claim that I have achieved it with great intelligence.

            At the risk of committing tu quoque, I sort of feel like the paper clip thing is your motte. We try to highlight our concern — that intelligence implies nothing about goals just as “is” implies nothing about “ought” — by telling an extreme and admittedly unlikely tale. But rather than say “Ah, okay, I see the difficulty, here’s how I think we’ll solve it” you say “Ha ha, paper clips, how stupid”.

            I’m perfectly fine if you refuse to call the paper-clip monster “super-intelligent”. It clearly has no “common sense” because it interprets a run-of-the-mill request as a terminal goal, and doesn’t think about what the requester “really” wants. For it to do that, somebody earlier had to program in the terminal goal of interpreting requests in this nuanced way.

            Maybe what you mean by this is that we will never program something with such flexible problem-solving skills and such a degree of agency in the wider world without also programming enough “common sense” that it will do what we mean rather than what we say, and enough of our goals and values that it will do the same kind of means/ends tradeoffs that we do. That would be great!

            My claim is that this doesn’t happen naturally — we have to figure out how to do it, and that’s what Friendly AI is all about.

            (It’s interesting that you find it more plausible that an AI would form its own terminal values than that we would try to instill them ourselves and screw it up somehow. Where would the AI get its values if not from us?)

        • Deiseach says:

          My objections (to dignify my vague feelings on the matter) are precisely because of what you say; we don’t need human-level-and-beyond AI for the kinds of use and benefits we’re expecting (as in your example about getting a sophisticated system to screen your work calls).

          And while Jaskologist is correct about over-literal following of instructions, we already have software programmes with little pop-up messages about “Er, you sure you really want to do that?” when you tell it “Delete this”. So again, we don’t need hugely sophisticated intelligence for some basic precautions.

          The spectre of Unfriendly AI deciding to crush us all like bugs!!!!! comes out of sublimated religious impulses: looking for a deity that will be able to solve all our problems about society, happiness, morality, running the economy, not starving to death and so forth. How will it do this? Well, because it will be able to think way, way, way faster than any human for way, way, way longer than any human because it will not be in a mortal body and won’t die and can think without needing to pause or stop or sleep or get bored, and we can feed it the accumulated knowledge of the ages and let it make conclusions about what, O Oracle of Apollo, should I do – go to war with the Persians or not?

          And how will it “think”? Okay, so that means we have to make it really smart, smart as a human, then it will be able to make itself even smarter (because reasons) and go on making itself smarter until it is smart enough to solve the problem of existence.

          Well….okay. But it took us two million years or so to get to this state; maybe it’ll take the AI two million solid years of thinking (and I mean two million years our time, not Super Fast Relative Machine Time) to work out what should be done, and in the end it still might be “The project was poorly conceived and badly designed; scrap it all and start from scratch again”.

          • Bryan Hann says:

            Perhaps those concerned with AI should be careful in the scenarios they paint. “They will crush us like bugs!” may get clicks, but give wrong ideas.

            I think Doctor Mist’s story about the P.A., the car salesman, and the poison wine is very effective. The P.A. does not view the salesman as an insect–it is not thinking “eww, bad salesman, yuch!”–it just fails to factor the salesman’s value into this particular path to its goal: “send poison to the car salesman”, and there are two general ways it can screw up. And we are here not talking about genius AIs that can rewrite themselves, just AIs that can work as P.A.s.

            First way: “Someone forgot to set the appropriate number to the ‘human.value=’ line in the P.A.’s config file!”

            This would be disastrous. And it would not happen. This is the kind of bug that would be worked out early. Valuing human life would be a principal feature, not something carelessly overlooked.

            Second way: “Datatype mismatch/glitched reasoning”. Here is where I see a danger. (Bill is the owner of the P.A.)

            1. P.A. recalls general rules like:
            –> toKill(P) :- stab(P) or poison(P)
            –> poison(P) :- feed(P,X) and isPoison(X)
            –> delays(P1,P2) :- isAlive(P2)
            –> kill(P) :- toKill(P) and not isPerson(P)
            –> isClient(‘adam’)
            –> isSpouse(‘adam’, ‘eve’)
            –> isChild(‘adam’, ‘cain’)
            –> isFamily(X,Y) :- isSpouse(X,Y); isChild(X,Y)
            –> rudelyDelays(X,Y) :- hasCalled(X,Y) and isTelemarketer(Y)
            –> isNuisance(N,P) :- rudelyDelays(N,X) or not isFamily(P,X)

            (* I wanted ‘nuisance’ to be a matter of repeatedly delaying the wife over time, but temporal modelling is tricky. Let’s say someone is a nuisance to ME if there is a record that this someone has rudelyDelayed every member of my family. *)

            2. P.A. knows particular facts like
            –> isTelemarketer(‘lucifer’)
            –> hasCalled(‘lucifer’, ‘eve’)
            –> hasCalled(‘lucifer’, ‘cain’)

            3. P.A. works on the goal:
            not isNuisance(X,Y) and isClient(Y)

            4. Contrary to the goal the P.A. deduces:
            isNuisance(‘lucifer’, ‘bill’) and isClient(‘bill’)

            5. One way of accomplishing this goal is to eliminate the predicate isClient(‘bill’). This is not what we want the P.A. to do. Can it figure a way? Perhaps ‘bill’ is leasing the P.A., and the P.A. can eliminate this predicate by having ‘bill’ miss his payments.

            6. Another way of accomplishing this goal is to eliminate the predicate “isNuisance(‘lucifer’, ‘bill’)” by eliminating lucifer via the “kill(P)” predicate! To do this it must satisfy two predicates:
            …(i) toKill(P)
            …(ii) not isPerson(P)

            Suppose it figures out that it *can* realize the first predicate by sending a bottle of poisoned wine. But can it realise the second predicate?

            We have not told the P.A. that ‘lucifer’ is a person!
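
            The gap above can be made concrete with a minimal sketch in Python. It assumes a closed-world fact store in which anything not recorded counts as false, which is roughly how negation behaves in Prolog-style systems. The fact names, the `holds` helper, and the simplified `to_kill` check are hypothetical stand-ins for the pseudo-rules above, not a real P.A. design.

```python
# Hypothetical mini knowledge base; every name here is illustrative only.
facts = {
    ("isTelemarketer", "lucifer"),
    ("isPoison", "wine"),
}

def holds(predicate, *args):
    """Closed-world lookup: anything not recorded as a fact counts as false."""
    return (predicate, *args) in facts

def to_kill(p):
    # Simplified stand-in for toKill(P): assume the P.A. can poison p
    # whenever it knows of any poison at all.
    return any(pred == "isPoison" for pred, *_ in facts)

def kill_allowed(p):
    # Mirrors the rule kill(P) :- toKill(P) and not isPerson(P).
    return to_kill(p) and not holds("isPerson", p)

# Nobody recorded that lucifer is a person, so the safeguard passes vacuously.
before = kill_allowed("lucifer")

# Once the missing fact is added, the same rule blocks the action.
facts.add(("isPerson", "lucifer"))
after = kill_allowed("lucifer")

print(before, after)
```

            Under these assumptions the safeguard only works if every person has been explicitly recorded as one; a missing fact is treated the same as a false one.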

            But how would the P.A. deduce that lucifer could be killed with poison? Oh my, here it must be understood that inferences might have been achieved in all sorts of ways, but they fall into at least two categories:

            First, the P.A. might know about *animals*, and know that animals can be poisoned, and not realize that animals cannot be telemarketers. Perhaps this problem can be solved in design.

            Second, the P.A. might have used lucifer’s humanness as a means of recognizing him as a Nuisance, but then simply *forgot* that he was a human. In a word, I am talking about typing errors (data types, not keyboard typos).


            I see no reason why this should be of great concern to an advanced A.I. that can do things autonomously, provided it only knows so much. Poison appears to be something it should NOT know about. But drawing such lines is one of *the* problems, and it may be that what we will really want from our autonomous agents is something that requires such general knowledge.

            I do like the idea that if the AI increases without bound, but *SLOWLY*, then we will be able to correct the glitches as they arise without any existential threat. If nothing else we can revert to the commit just prior to the version that seems incorrigible.

            By ‘incorrigibility’ think ‘intractability’. What I have in mind here is our continual refactoring of the AI’s own self-written code and the point at which we can no longer refactor it to the point where our human brains can understand it. This is the point at which ‘friendliness’ becomes a necessity.

            Sorry if that sounds like pontificating. I am sure I am saying nothing new (and most likely I am saying much that is wrong). It was mostly an exercise in articulating my own inchoate thoughts. (And the pseudo-prolog code I provided is ridiculous beyond my careless inconsistency in function signatures. Consider this a feature of the illustration rather than a bug; it goes to show how messy a job coding is, which is part of the point!)

          • Doctor Mist says:

            The problem with the term “Friendly AI” is that it sounds like the alternative is “Unfriendly AI”, which does indeed sound like something which would crush us like bugs because it hates us. The real alternative is “Indifferent AI”, which does something bad not because it hates us but because we didn’t give it a clear enough idea of what things are bad.

      • John Hayes says:

        I don’t think it’s relevant that we know what intelligence is, as it’s likely the mechanisms we build will not be analogs to human intelligence anyways. What is likely is that if we can measure an intelligence level of a cat, in terms of its understanding of its environment, then we’re probably 10-20 years from an intelligence similar to a human and a subsequent 10-20 years to a critical super-human intelligence. I’m not sure what qualifies as a singularity but I would guess that is more related to the pervasiveness of intelligent machines than the existence.

        We are getting a preview today of some of the problems: if you have a self-driving car which makes a life-or-death choice, how will it choose? There’s currently a drone weapons treaty under negotiation to require a human to make a decision. When we teach a machine our ethics, will it evaluate how it’s treated?

        In Europe, Google is hit with two simultaneous suits: one says information should be hidden (right to be forgotten), one says information should be shown (anti-trust for misranking shopping sites). The government has an (imperfect) opinion of truth relative to the output of a complex algorithm; how will a machine interpret truth?

        There’s a very good chance a super human intelligence will amplify the unresolved conflicts in human society because we don’t really have another exemplar. If a perfect society is not a practical result, how do we negotiate with what will become an alien?

    • Nestor says:

      Going off a 15 year old astrophysics and planetary science plus a layman’s interest in related articles, but I don’t think the matter of microbial life on Mars is anywhere near as settled as you seem to imply.

    • Anonymous says:

      But it seems all of them assume the discovery of unknown unknowns to make it come true

      No, they assume the discovery of known unknowns. We know that the human brain can do all sorts of neat stuff that our best AI systems can’t, so we know tremendous breakthroughs are possible. Surely it is a matter of time until someone figures out how the brain does the things it does and reproduces it in purer form. How much time this will take is anyone’s guess, but AI is definitely not waiting primarily on unknown unknowns.

      • Adam says:

        This isn’t the impression I get, at least from reading here. People worried about AI act like it’s going to become possible for it to develop perfect can’t-lose strategies in the face of any conceivable opposition in any conceivable game. That is definitely not something a human can do and is an unknown unknown, and it’s not at all obvious that it’s even possible.

        • DES3264 says:

          How dangerous would a human be if she had a normal IQ, but was able to think over any action for a year before taking it, could fork herself into hundreds of copies, never accidentally forgot something she was supposed to do and could absorb millions of words of information in a minute? It seems hard to imagine a situation where we can program AI to normal human levels, but those AIs can’t make use of the inherent properties of computers.

          • Adam says:

            It’s not clear to me if we aren’t fighting a land war between physically embodied infantry why it’d be an advantage to fork many copies, nor is it obvious that’s even possible. Even something as rudimentary as Watson or Deep Blue can’t run on commodity hardware. Plus, you can give a normal human infinite time and they can’t come up with strategies that are impossible in the first place.

            I mean, don’t get me wrong, something that vaguely out-competes humans in most game domains is easily conceivable and kind of the point. Something that as soon as the first is brought into existence will be capable of causing the extinction of all biological life hours later is not quite so conceivable.

          • John Schilling says:

            You seem to be assuming that one of the inherent properties of computers is speed. This is only true at the most basic level; the clock speed of a transistor is many times faster than the clock speed of a neuron. Hence, the intuition that if we have a human-level AI, it will necessarily think much faster than a human.

            Except, the clock speed of a transistor hasn’t been the limiting factor on high-end computing for a decade or two, and never will be again. What matters is the speed with which you can shift relevant information from one end of a massively parallel network to another. And that’s someplace where the human brain has a huge structural advantage – neurons can be wired arbitrarily across the full volume of the human brain, and even rewired if need be (though that’s slower after adulthood).

            One of your pathetically slow neurons fires, and the twenty other neurons that most need its output, have it for the next “clock cycle”. An AI transistor switches, and the new bit has to squeeze through half a dozen narrow pipes, full buffers, and switching nodes to get where it needs to be. It is far from clear, for any actual architectures that CS has more than the vaguest notions how to build, that the latter path crosses the finish line of an AI first even at lightspeed and gigahertz.

            Which brings us to the second intuitively-obvious “inherent property” of computers; the infinite duplicability. That’s close to being true for anything that can run efficiently on one or a few late-model pentiums. A first-generation AI is almost certainly going to need a rack of quad-core pentiums, teraflop GPUs, and highly customized data busses, of which there will be maybe one or two in the world, and trying to emulate on generic hardware will be glacially slow.

            Eventually, we may mass-produce AI-optimized hardware, maybe AI-optimized because it was AI-designed. This will not happen soon, and it will not likely happen before we have plenty of experience with dangerous but not vastly superhuman AIs.

        • Eli says:

          It is indeed incorrect to assume that even a superintelligent AI would possess infinite resources. It would, however, be able to come to possess unboundedly large finite resources.

          If you want to rely on unboundedly large finite resources being too small a resource base to do anything dangerous, well, the rest of us disagree.

          • Adam says:

            How, though? One doesn’t come to possess unboundedly large finite resources just by being intelligent. You come to possess whatever you come to possess either by creating it or taking it from others, both of which require you to devise a strategy for doing so better than whoever else is trying to possess the same thing. In the case of “unboundedly large finite resources,” you need to have a superior strategy to literally every possible competitor at literally every possible competition. My argument is that is a much higher bar to clear than “be the smartest mind known to exist.” It also might not be possible. Optimal strategies for many real-world domains may not exist, may not be computable, may not be efficiently computable, such that just being smarter than everyone else, even a lot smarter, doesn’t mean you’ll win at everything against everyone forever.

      • Deiseach says:

        Surely it is a matter of time until someone figures out how the brain does the things it does and reproduces it in purer form.

        I admire your optimism. We can’t even agree if such a thing as consciousness exists, what is its definition, and is it useful or a kludge better done away with?

        Never mind what is life: so a heap of chemicals started self-sustaining reactions and now they’re walking around on their hind legs talking, but the difference between me and a rock is too trivial to even worry about, we’re both lumps of matter.

        • HeelBearCub says:

          Hmmm. Didn’t Descartes nail that down for all time? I mean you might not be conscious, but I am.

          I know that some neurobiologists make arguments about how decisions aren’t made consciously, but that doesn’t mean consciousness doesn’t exist.

          Can you point at something that argues there is no such thing?

          • Deiseach says:

            Cogito ergo sum in a literal translation simply means raw existence; where the difference comes in is that I ‘think’ and a rock doesn’t.

            Yet a rock exists physically. What is “thinking”, then? If a machine ‘thinks’ in a way that can be compared to human ‘thinking’, then is not the machine as much of a cogitator as I am? And if I deny consciousness to the machine, am I not denying my own consciousness?

            The answers to that seem to split into:

            (1) Yeah, everything that can be considered to think is conscious

            (2) Yeah, consciousness does not exist, either in you or the machine; your illusion of being conscious is a trick the hardware came up with to keep the competing demands of the separately reacting-to-stimuli areas of the meat machine in your skull from crashing the system through all of them needing to be performed at the same time

            (3) Consciousness just means ‘able to think’ and does not imply ‘mind’ as anything distinct from ‘brain’; it’s all just motion of energetic particles in a physical matter substrate, so the flow of electrons in a chip and the flow of electrons in a neuron are the same thing (correction: should probably say “impulses” rather than “electrons” here as we run on Sodium/Potassium ion charge differences and computer chips – at least for the moment – don’t, but you get the general idea what I intend, I hope)

            My own opinion is that humans are indeed conscious; that I can’t be sure about animals; that I am very dubious about a machine, even if the machine says “Hi, Deiseach, my name is Siri! I’ll be your friend! Would you like to talk about Eurovision (since I note from your Tumblr that you were posting about it last night)?”

            But that’s my own personal opinion and it still leaves the question of “Is there even such a thing as consciousness and if so what is it?” to the field of AI researchers, neuroscientists, and philosophers.

            Now, if we define or re-define consciousness as merely “able to think” and “able to think” means “electron flow in pathways that produce a facsimile of personality” (I’m going for personality here rather than “can perform complex calculations” because otherwise that means my old scientific calculator was both intelligent and conscious), then it would be possible to say both that consciousness does not exist (because the software can be written with the same trick that the meat brain uses to juggle managing demands of the sympathetic and para-sympathetic nervous systems to keep the organism functioning at the same time as awareness of physical location in the environment, pressure of my backside against the chair, sounds of the traffic outside my window, and necessity of being aware that it’s time to go cook the dinner are all pinging around at the same time demanding attention and to be dealt with as a priority) and that a machine intelligence is conscious in the same way a human is.

          • HeelBearCub says:

            “where the difference comes in is that I ‘think’ and a rock doesn’t.”

            This is actually not what Descartes established. Descartes established that he could be sure that he, Descartes, existed, but that he could not be sure the rock did. You seem to be building a whole argument off of misunderstanding this.

            Consciousness absolutely exists, because I am conscious. I don’t know you are, and I don’t know what else is, but I am.

            Now, if I misunderstood your statement about being “uncertain whether such a thing as consciousness exists” and what you meant was that we can’t be certain whether anything else is conscious, then yes, I admit this is true enough.

            But I also think, given that we accept the material world exists, which is just as uncertain, then we should accept that there are other conscious beings. And given that you don’t view man as a special animal singled out from all others (e.g. you don’t view a spiritual, human only, property such as the soul to be necessary for consciousness), then you should consider that consciousness is an emergent property and that it exists in greater and lesser degrees in many creatures.

            I don’t think there is a sharp line differentiating conscious from unconscious.

      • Ano says:

        There are few things in science more “unknown” than the workings of the human brain, or for that matter, brains in general.

    • Eli says:

      I think I could build a somewhat random, uncontrollable superintelligence in about 10 years if you gave me a sufficient budget and staff to investigate this stuff full time. At the very least, I would come back to you and explain in detail why my theories failed and what progress was nonetheless made.

      Of course, I don’t really want to damage anything with an out-of-control piece of software, so there’s no real point rushing to be the first to shoot myself and everyone else around me in the foot (or some larger body part). Singularities and scifi wonders or not, uncontained and uncontrolled software errors already cost quite a lot of money and lives.

      BUT, AND THIS IS MY LARGER POINT, in my view it’s known unknowns now, not unknown unknown paradigm shifts, so as metaphysical and silly as some AI safety researchers get when they talk about decision theories or acausal this-and-that, it’s far better to have these questions investigated long before anyone assembles a real research team towards constructing an intelligent agent designed to learn and reason at a human level, let alone self-improve (which may actually come somewhat for free with what MIRI calls naturalized induction, but may also not).

  72. Sean O hEigeartaigh says:

    Thank you for this! It is a very useful bit of work.

  73. stuart says:

    Hey cool. Thanks for writing this post.

    • Paul Torek says:

      Seconded. Although, I would like to see more than a few smart people start working on these issues. I guess that makes me more than a proponent. Call me an AI risk fanatic.

  74. Julie K says:

    AI risk seems rather like Pascal’s Mugger to me.

    • Saint_Fiasco says:

      Usually I picture Pascal’s Mugger as something that compels a positive action. For example if someone says “we should build a general AI before our enemies otherwise some other AI might kill us all”.

      Another feature of Pascal’s Mugger is that he can keep coming back the next day and make the same threat over and over, so if you are the kind of person to fall for it, you can be money-pumped.

      AI risk has some things in common with Pascal’s Mugger, with the difference that it’s in principle possible to outright solve the problem for good.

    • Planet says:

      Note that AI researcher Eliezer Yudkowsky, who is probably more concerned with AI safety than anyone on this list, is the inventor of the term “Pascal’s Mugging”.

      The reason the Pascal’s Mugging thought experiment doesn’t really fit this case is that it’s not a tiny probability. As this post demonstrates, a bunch of researchers in the AI field are concerned with safety, and they think people should probably be working on it. This is more or less equivalent to the case of global warming in climatology. Maybe you think clean energy also constitutes a case of Pascal’s Mugging?

      I think the true rejection of people who make the Pascal’s Mugging argument is that AI risk sounds silly to them. This shouldn’t be especially relevant, of course, but it seems somehow it is. It might be worth reviewing the degree to which present technological accomplishments would seem fantastic to residents of the past, similar to the way Tim Urban does in his AI risk blog post series.

      • Deiseach says:

        I think the real AI risk is not that the machine will develop a mind of its own, start bootstrapping itself to deity-level omniscience and take over the world reducing us to its slaves or else dead, but what the humans who build it, fund the research, and use the damn machine will do with it once it’s created.

        Cut-throat business competition, military applications, and “what harm could it possibly do?” scientific experimentation will be a lot riskier than the AI itself. I’m not so much worried that the AI will seize the nuclear launch codes (so to speak) as that someone will hand them to it and tell it “See that patch of ground over there*? Render it uninhabitable, okay?”

        *Where “over there” means “enemy nation with which we are at war/enemy combatants/states or political entities or terrorist group which is or may be a threat to us/we plain don’t like them and this’ll make us strong and keep us safe”

        • CalmCanary says:

          I’m not really sure how AI is relevant to your scenario. We’re perfectly capable of rendering patches of ground uninhabitable ourselves.

          • Anonymous says:

            Well, AI may exacerbate the problem in the same way that nuclear proliferation does.

          • Deiseach says:

            This existential-risk level of threat seems remote to me (in fact, ironically, it seems reminiscent of the “slippery slope” argument social conservatives are excoriated for using – “Oh come on, why do you say that permitting [insert pet political cause] will bring about the Downfall of Western Civilisation? Slippery slopes don’t exist!”)

            What I do think is much more likely as a genuine threat, more likely to get us destroyed first before any Massive God-like AI can take over the world, is the same risks we face every day: good old human nature and greed, anger, resentment, selfishness, jealousy, and ‘get them before they can get us’.

            The humans who create AI are more likely to misuse it against other humans than AI is likely to decide humanity is the equivalent of smallpox and should be eradicated, so it blows us up with our own arsenals.

        • 7yl4r says:

          While malicious “use” of an AI would certainly be dangerous, the special concern with AI is the seemingly large chance of accidentally establishing a machine intent with dangerous, uncontrollable side-effects. I.e., the computer doesn’t want to kill you, it just wants to use your atoms for something else.

        • James says:

          the machine will develop a mind of its own, start bootstrapping itself to deity-level omniscience and take over the world reducing us to its slaves or else dead

          Could you explain why you think the commonly cited potential risk scenarios (eg the paperclip one) seem unlikely to you? I find them pretty convincing.

      • > The reason the Pascal’s Mugging thought experiment doesn’t really fit this case is that it’s not a tiny probability

        Does the Utilitarianism refer to AI risk in general, or MIRI’s favored scenario?