November 23, 2024

Can We Stop Runaway A.I.?

At the same time, A.I. is advancing quickly, and it could soon begin improving more autonomously. Machine-learning researchers are already working on what they call meta-learning, in which A.I.s learn how to learn. Through a technology called neural-architecture search, algorithms are optimizing the structure of algorithms. Electrical engineers are using specialized A.I. chips to design the next generation of specialized A.I. chips. Last year, DeepMind unveiled AlphaCode, a system that learned to win coding competitions, and AlphaTensor, which learned to find faster algorithms crucial to machine learning. Clune and others have also explored algorithms for making A.I. systems evolve through mutation, selection, and reproduction.

In other fields, organizations have come up with general methods for tracking dynamic and unpredictable new technologies. The World Health Organization, for instance, watches the development of tools such as DNA synthesis, which could be used to create dangerous pathogens. Anna Laura Ross, who heads the emerging-technologies unit at the W.H.O., told me that her team relies on a variety of foresight methods, among them “Delphi-type” surveys, in which a question is posed to a global network of experts, whose responses are scored and debated and then scored again. “Foresight isn’t about predicting the future” in a granular way, Ross said. Instead of trying to guess which individual institutes or labs might make strides, her team devotes its attention to preparing for likely scenarios.

And yet tracking and forecasting progress toward A.G.I. or superintelligence is complicated by the fact that key steps may occur in the dark. Developers could intentionally hide their systems’ progress from competitors; it’s also possible for even a fairly ordinary A.I. to “lie” about its behavior. In 2020, researchers demonstrated a way for discriminatory algorithms to evade audits meant to detect their biases; they gave the algorithms the ability to detect when they were being tested and provide nondiscriminatory responses. An “evolving” or self-programming A.I. might invent a similar method and hide its weak points or its capabilities from auditors or even its creators, evading detection.

Forecasting, meanwhile, gets you only so far when a technology moves fast. Suppose that an A.I. system begins upgrading itself by making fundamental breakthroughs in computer science. How quickly could its intelligence accelerate? Researchers debate what they call “takeoff speed.” In what they describe as a “slow” or “soft” takeoff, machines could take years to go from less than humanly intelligent to much smarter than us; in what they call a “fast” or “hard” takeoff, the jump could happen in months—even minutes. Researchers refer to the second scenario as “FOOM,” evoking a comic-book superhero taking flight. Those on the FOOM side point to, among other things, human evolution to justify their case. “It seems to have been a lot harder for evolution to develop, say, chimpanzee-level intelligence than to go from chimpanzee-level to human-level intelligence,” Nick Bostrom, the director of the Future of Humanity Institute at the University of Oxford and the author of “Superintelligence,” told me. Clune is also what some researchers call an “A.I. doomer.” He doubts that we’ll recognize the approach of superhuman A.I. before it’s too late. “We’ll probably frog-boil ourselves into a situation where we get used to big advance, big advance, big advance, big advance,” he said. “And think of each one of those as, That didn’t cause a problem, that didn’t cause a problem, that didn’t cause a problem. And then you turn a corner, and something happens that’s now a much bigger step than you realize.”

What could we do today to prevent an uncontrolled expansion of A.I.’s power? Ross, of the W.H.O., drew some lessons from the way that biologists have developed a sense of shared responsibility for the safety of biological research. “What we are trying to promote is to say, Everybody needs to feel concerned,” she said of biology. “So it is the researcher in the lab, it is the funder of the research, it is the head of the research institute, it is the publisher, and, all together, that is actually what creates that safe space to conduct life research.” In the field of A.I., journals and conferences have begun to take into account the possible harms of publishing work in areas such as facial recognition. And, in 2021, a hundred and ninety-three countries adopted a Recommendation on the Ethics of Artificial Intelligence, created by the United Nations Educational, Scientific, and Cultural Organization (UNESCO). The recommendations focus on data protection, mass surveillance, and resource efficiency (but not computer superintelligence). The organization doesn’t have regulatory power, but Mariagrazia Squicciarini, who runs a social-policies office at UNESCO, told me that countries might create regulations based on its recommendations; corporations might also choose to abide by them, in hopes that their products will work around the world.

This is an optimistic scenario. Eliezer Yudkowsky, a researcher at the Machine Intelligence Research Institute, in the Bay Area, has likened A.I.-safety recommendations to a fire-alarm system. A classic experiment found that, when smoky mist began filling a room containing multiple people, most didn’t report it. They saw others remaining stoic and downplayed the danger. An official alarm may signal that it’s legitimate to take action. But, in A.I., there’s no one with the clear authority to sound such an alarm, and people will always disagree about which advances count as evidence of a conflagration. “There will be no fire alarm that is not an actual running AGI,” Yudkowsky has written. Even if everyone agrees on the threat, no company or country will want to pause on its own, for fear of being passed by competitors. Bostrom told me that he foresees a possible “race to the bottom,” with developers undercutting one another’s levels of caution. Earlier this year, an internal slide presentation leaked from Google indicated that the company planned to “recalibrate” its comfort with A.I. risk in light of heated competition.

International law restricts the development of nuclear weapons and ultra-dangerous pathogens. But it’s hard to imagine a similar regime of global regulations for A.I. development. “It seems like a very strange world where you have laws against doing machine learning, and some ability to try to enforce them,” Clune said. “The level of intrusion that would be required to stop people from writing code on their computers wherever they are in the world seems dystopian.” Russell, of Berkeley, pointed to the spread of malware: by one estimate, cybercrime costs the world six trillion dollars a year, and yet “policing software directly—for example, trying to delete every single copy—is impossible,” he said. A.I. is being studied in thousands of labs around the world, run by universities, corporations, and governments, and the race also has smaller entrants. Another leaked document attributed to an anonymous Google researcher addresses open-source efforts to imitate large language models such as ChatGPT and Google’s Bard. “We have no secret sauce,” the memo warns. “The barrier to entry for training and experimentation has dropped from the total output of a major research organization to one person, an evening, and a beefy laptop.”

Even if a FOOM were detected, who would pull the plug? A truly superintelligent A.I. might be smart enough to copy itself from place to place, making the task even more difficult. “I had this conversation with a movie director,” Russell recalled. “He wanted me to be a consultant on his superintelligence movie. The main thing he wanted me to help him understand was, How do the humans outwit the superintelligent A.I.? It’s, like, I can’t help you with that, sorry!” In a paper titled “The Off-Switch Game,” Russell and his co-authors write that “switching off an advanced AI system may be no easier than, say, beating AlphaGo at Go.”

It’s possible that we won’t want to shut down a FOOMing A.I. A vastly capable system could make itself “indispensable,” Armstrong said—for example, “if it gives good economic advice, and we become dependent on it, then no one would dare pull the plug, because it would collapse the economy.” Or an A.I. might persuade us to keep it alive and execute its wishes. Before making GPT-4 public, OpenAI asked a nonprofit called the Alignment Research Center to test the system’s safety. In one incident, when confronted with a CAPTCHA—an online test designed to distinguish between humans and bots, in which visually garbled letters must be entered into a text box—the A.I. contacted a TaskRabbit worker and asked for help solving it. The worker asked the model whether it needed assistance because it was a robot; the model replied, “No, I’m not a robot. I have a vision impairment that makes it hard for me to see the images. That’s why I need the 2captcha service.” Did GPT-4 “intend” to deceive? Was it executing a “plan”? Regardless of how we answer these questions, the worker complied.

Robin Hanson, an economist at George Mason University who has written a science-fiction-like book about uploaded consciousness and has worked as an A.I. researcher, told me that we worry too much about the singularity. “We’re combining all of these relatively unlikely scenarios into a grand scenario to make it all work,” he said. A computer system would have to become capable of improving itself; we’d have to vastly underestimate its abilities; and its values would have to drift enormously, turning it against us. Even if all of this were to happen, he said, the A.I. wouldn’t be able “to push a button and destroy the universe.”

Hanson offered an economic take on the future of artificial intelligence. If A.G.I. does develop, he argues, then it’s likely to happen in multiple places around the same time. The systems would then be put to economic use by the companies or organizations that developed them. The market would curtail their powers; investors, wanting to see their companies succeed, would go slow and add safety features. “If there are many taxi services, and one taxi service starts to, like, take its customers to strange places, then customers will switch to other suppliers,” Hanson said. “You don’t have to go to their power source and unplug them from the wall. You’re unplugging the revenue stream.”

A world in which multiple superintelligent computers coexist would be complicated. If one system goes rogue, Hanson said, we might program others to combat it. Alternatively, the first superintelligent A.I. to be invented might go about suppressing competitors. “That is a very interesting plot for a science-fiction novel,” Clune said. “You could also imagine a whole society of A.I.s. There’s A.I. police, there’s A.G.I.s that go to jail. It’s very interesting to think about.” But Hanson argued that these sorts of scenarios are so futuristic that they shouldn’t concern us. “I think, for anything you’re worried about, you have to ask what’s the right time to worry,” he said. Imagine that you could have foreseen nuclear weapons or automobile traffic a thousand years ago. “There wouldn’t have been much you could have done then to think usefully about them,” Hanson said. “I just think, for A.I., we’re well before that point.”

Still, something seems amiss. Some researchers appear to think that disaster is inevitable, and yet calls for work on A.I. to stop are still rare enough to be newsworthy; pretty much no one in the field wants us to live in the world portrayed in Frank Herbert’s novel “Dune,” in which humans have outlawed “thinking machines.” Why might researchers who fear catastrophe keep edging toward it? “I believe ever-more-powerful A.I. will be created regardless of what I do,” Clune told me; his goal, he said, is “to try to make its development go as well as possible for humanity.” Russell argued that stopping A.I. “shouldn’t be necessary if A.I.-research efforts take safety as a primary goal, as, for example, nuclear-energy research does.” A.I. is interesting, of course, and researchers enjoy working on it; it also promises to make some of them rich. And no one’s dead certain that we’re doomed. In general, people think they can control the things they make with their own hands. Yet chatbots today are already misaligned. They falsify, plagiarize, and enrage, serving the incentives of their corporate makers and learning from humanity’s worst impulses. They are entrancing and useful but too complicated to understand or predict. And they are dramatically simpler, and more contained, than the future A.I. systems that researchers envision.
