A.I. has a ‘10 or 20% chance’ of conquering humanity, former OpenAI safety researcher warns



The rapid rise of new A.I. models in recent months, like OpenAI’s ChatGPT, has led some technologists and researchers to ponder whether artificial intelligence could soon surpass human capabilities. One key former researcher at OpenAI says that such a future is a distinct possibility, but also warns there is a non-zero chance that human- or superhuman-level A.I. could take control of humanity and even annihilate it.

A “full-blown A.I. takeover scenario” is top of mind for Paul Christiano, former head of language model alignment on OpenAI’s safety team, who warned in an interview last week with the tech-focused Bankless podcast that there is a substantial chance advanced A.I. could spell a potentially world-ending calamity in the near future.

“Overall, maybe you’re getting more up to a 50-50 chance of doom shortly after you have A.I. systems that are human-level,” Christiano said. “I think maybe there’s a 10 to 20% chance of A.I. takeover [with] many, most humans dead.”

Christiano left OpenAI in 2021, explaining his departure during an Ask Me Anything session on LessWrong, a community blog site created by Eliezer Yudkowsky, a fellow A.I. researcher who has warned for years that superhuman A.I. could destroy humanity. Christiano wrote at the time that he wanted to “work on more conceptual/theoretical issues in alignment,” a subfield of A.I. safety research that focuses on ensuring A.I. systems are aligned with human interests and ethical principles, adding that OpenAI “isn’t the best” for that type of research.

Christiano now runs the Alignment Research Center, a nonprofit group working on theoretical A.I. alignment strategies, a field that has gained considerable interest in recent months as companies race to roll out increasingly sophisticated A.I. models. In March, OpenAI released GPT-4, an update to the A.I. model that powers ChatGPT, which only launched to the public in November. Meanwhile, tech behemoths including Google and Microsoft have kicked off an A.I. arms race to stake a claim in the burgeoning market, launching their own versions of A.I. models with commercial applications.

But with publicly available A.I. systems still riddled with errors and misinformation, Christiano and a host of other experts have cautioned against moving too fast. Elon Musk, an OpenAI cofounder who cut ties with the company in 2018, was one of 1,100 technologists who signed an open letter in March calling for a six-month pause on the development of advanced A.I. models more powerful than GPT-4, and for research to refocus on improving existing systems’ reliability. (Musk has since announced that he is starting a competitor to ChatGPT called TruthGPT, which he says will focus on “truth-seeking” instead of profit.)

One of the letter’s concerns was that existing A.I. models could be paving the way for superintelligent models that pose a threat to civilization. While current generative A.I. systems like ChatGPT can capably handle specific tasks, they are still far from reaching human-level intelligence, the defining trait of a hypothetical future iteration of A.I. known as artificial general intelligence (AGI).

Experts have been divided on the timeline of AGI’s development, with some arguing it could take decades and others saying it may never be possible, but the rapid pace of A.I. advancement is starting to turn heads. In a Stanford University survey published in April, around 57% of A.I. and computer science researchers said A.I. research is quickly moving towards AGI, while 36% said entrusting advanced versions of A.I. with important decisions could lead to “nuclear-level catastrophe” for humanity.

Other experts have warned that even if more powerful versions of A.I. are developed to be neutral, they could quickly become dangerous if employed by ill-intentioned humans. “It is hard to see how you can prevent the bad actors from using it for bad things,” Geoffrey Hinton, a former Google researcher who is often called the godfather of A.I., told the New York Times this week. “I don’t think they should scale this up more until they have understood whether they can control it.” 

Christiano said in his interview that civilization could be at risk if A.I. develops to the point that society can no longer function without it, leaving humanity vulnerable if a powerful A.I. decides it no longer needs to act in its creators’ interest. 

“The most likely way we die involves—like, not A.I. comes out of the blue and kills everyone—but involves we have deployed a lot of A.I. everywhere,” he said. “If for some reason, God forbid, all these A.I. systems were trying to kill us, they would definitely kill us.”

Other voices have pushed back against these interpretations of A.I., however. Some experts have argued that while A.I. designed to accomplish specific tasks is inevitable, developing AGI that can match human intelligence might never become technically feasible due to computers’ limitations when it comes to interpreting life experiences. 

Responding to recent dire warnings over A.I., entrepreneur and computer scientist Perry Metzger argued in a tweet last month that while “deeply superhuman” A.I. is likely to arrive, it will probably be years or decades before AGI evolves to the point of being capable of revolting against its creators, who will likely have time to steer A.I. in the right direction. Responding to Metzger’s tweet, Yann LeCun, an NYU computer scientist who has directed A.I. research at Meta since 2013, wrote that the fatalistic scenario of AGI developing dangerous and uncontrollable abilities overnight is “utterly impossible.”