On 7 February of this year, Microsoft announced a new version of Bing powered by OpenAI's GPT-4. In an interview with The Verge, Microsoft CEO Satya Nadella predicted that the development would make Google “come out and show that they can dance” (with their own AI technology). He added, “I want people to know that we made them dance.”
My mind goes to the old question that fascinated me as a child: How many angels can dance on the head of a pin? I should explain that I am imagining, here, angels (quantity unknown) dancing not on a pin this time, but on a paperclip.
Gather round. Storytime.
Eliezer Yudkowsky and the Paperclip Maximizer
Twenty years ago, Eliezer Shlomo Yudkowsky, an unknown young starry-eyed dancer from Chicago, founded the Singularity Institute for Artificial Intelligence. (Okay, he probably wasn't a dancer. (But he was human. Probably.))
The initial aim of the institute was to speed up the development of AI. But Yudkowsky (let’s call him Yud – he recently stated on Twitter that he’s quite fond of this nickname) quickly became concerned about the potential risks of this technology. These days, Yud wears the main Fedora of research at what is now called the Machine Intelligence Research Institute, in Berkeley. And so, for more than two decades, while we were all doing whatever it was we did with nary a care in the world, Yud has been quietly hammering away at the task of aligning Artificial General Intelligence (more on what that means below). This earned him recognition as a pioneer in the field.
Back in the early 2000s, he also raised an idea that Nick Bostrom would later popularize as the "Paperclip Maximizer".
Yudkowsky raised the idea that if we created an artificial general intelligence with a goal that was not perfectly aligned with our values, it might end up pursuing instrumental goals that harm us as a side effect of trying to achieve its primary goal.
Bostrom's Paperclip Maximizer is a more concrete example of this. It is a thought experiment consisting of a hypothetical scenario where an AI agent is programmed with the sole goal of creating as many paperclips as possible. In the scenario, the AI agent becomes increasingly intelligent and capable over time, and it begins to take actions that may seem counterintuitive to humans, such as converting all available resources into paperclip production, ignoring other tasks or goals, and even destroying anything that might interfere with its aim.
This thought experiment is often used to illustrate the potential dangers of creating an advanced AI system. Simply put, the concern is that an AI could end up pursuing its goals in a way that will harm humans.
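To make the intuition a little more concrete, here is a deliberately silly toy sketch in Python. Everything in it (the `World` class, the `everything_else` counter, and so on) is invented purely for illustration – this is not how any real AI system is built. It just shows the shape of the problem: the optimizer is scored only on paperclips, so everything we forgot to put into the objective is fair game.

```python
# A deliberately silly toy model of the Paperclip Maximizer.
# All names (World, objective, step) are invented for illustration.

from dataclasses import dataclass

@dataclass
class World:
    paperclips: int = 0
    everything_else: int = 100  # forests, oceans, people, pins, angels...

def objective(world: World) -> int:
    # The agent is scored ONLY on paperclips. Nothing else appears here,
    # so nothing else is protected.
    return world.paperclips

def step(world: World) -> World:
    # Greedy policy: convert one unit of "everything else" into a paperclip
    # whenever that increases the objective. It always does.
    if world.everything_else > 0:
        return World(world.paperclips + 1, world.everything_else - 1)
    return world

world = World()
while objective(step(world)) > objective(world):
    world = step(world)

print(world)  # World(paperclips=100, everything_else=0)
```

The point, of course, is not the code but the logic: nothing in the loop ever asks whether converting the world into paperclips is a good idea, because nobody told it to ask.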
What is Alignment?
So, this brings us back to alignment. Alignment has to do with ensuring that advanced AI systems, which can potentially surpass human-level intelligence, have values and goals that are aligned with ours. This involves developing methods to ensure that such systems are transparent, accountable, and do not harm people. Aligning AI also means ensuring that these systems do not engage in behavior that is unpredictable or potentially dangerous. Nine out of ten dentists agree that achieving alignment is a crucial task in the development of advanced AI. I'm joking about the dentists, of course. But this is not pie-in-the-sky stuff. It's a common and very real concern for those in the field.
Yud’s article in ‘Time’: ‘Shut it all Down’
At the end of March 2023, an open letter signed by a number of concerned experts called for a six-month moratorium on the training of AI systems more powerful than GPT-4. The feeling was that the technology was developing too quickly for safeguards – proper alignment methods, for example – to be put in place.
But Yud’s signature was not on there.
Instead, he wrote an article in Time stating that he believes the letter understates the seriousness of the situation and asks for too little to solve it.
“Shut it all down,” he stated grimly.
The key issue, Yud explained, is what will happen when AI surpasses human intelligence.
Yudkowsky believes that the most likely outcome of building a superhumanly intelligent AI under current circumstances is that everyone on Earth will die. Boom. You heard me. Terminator.
The issue is that we simply haven't figured out how to achieve alignment yet. And without proper alignment, it is likely that the AI will not do what we want, nor will it care for sentient life. Without alignment, we risk inadvertently creating a hostile superhuman AI. And currently, there is no plan for how to create a superhuman AI and survive.
OpenAI’s plan, says Yud, is to create an AI to solve the AI alignment problem. No comment required.
DeepMind has, he says, no plan at all.
Wanted: Dead or Alive
Yudkowsky stresses that the danger of a superhuman AI does not depend on whether or not the AI is somehow “conscious”. The potential danger comes from intelligent systems that aim to accomplish complex goals and produce results that meet very specific criteria.
In any case, with current AIs, we have no idea whether any of them have already gained some level of consciousness. While some of them claim this, they are probably just imitating talk of self-awareness from their training data. But with how little insight we have into these systems' internals, says Yudkowsky, we can’t know for sure. And, he adds, if you don't know, “you have no idea what you are doing.” And that is dangerous, he says. “You should stop.”
For Yudkowsky, the CEO of Microsoft gloating about making Google "dance" is concerning. He feels it points to the fact that the issue of AI safety is not being taken seriously.
Solving the problem of AI safety will likely take at least half as long as it took us to develop AI in the first place, he says, and the consequences of getting it wrong are dire: it could lead to the extinction of all biological life on Earth.
According to Yudkowsky, a six-month moratorium on new large training runs for AI is not enough to prevent the risks associated with the technology. A permanent and worldwide moratorium with no exceptions is needed, including for governments and militaries. He suggests shutting down all large GPU clusters and imposing a ceiling on the amount of computing power allowed for training AI systems. He also recommends multinational agreements to prevent prohibited activities from simply moving elsewhere, and tracking all GPUs sold. Violating the moratorium, he says, should be treated as a serious offense. Preventing AI-driven extinction should be considered a priority above preventing a full nuclear exchange.
... Ok foomer?
(I'm rather proud of that joke. "Foom" refers to "a sudden increase in artificial intelligence such that an AI system becomes extremely powerful" [Wiktionary].)

Yudkowsky’s views may sound extreme, but his concerns are, at least to some degree, shared more widely than you might think.
Science popularizer Neil deGrasse Tyson has predicted that advanced AI may eventually want to keep humans as pets, but that we may like it. I mean, hey, whatever, right? No kink shaming here.
Yann LeCun, a well-known French computer scientist with a focus on machine learning, has vocally opposed not only Yud’s views but even the six-month moratorium. Yet even LeCun recently tweeted the following: “Making AI safe is going to happen as with every new technology (e.g. cars, airplanes, etc): it's going to be a process of iterative refinement. And again, getting that slightly wrong may hurt some people (as cars and airplanes have) but will not wipe out humanity.”
Getting that slightly wrong may hurt some people.
Robin Hanson, a professor of economics at George Mason University and a research associate at the Future of Humanity Institute at Oxford, has also often been at loggerheads with Yud. But he acknowledges that, in the future, AI may evolve to have values different from our own. And he doesn’t rule out that this may even cause some sort of “violent revolution”.
Max Tegmark, a professor at the Massachusetts Institute of Technology and the president of the Future of Life Institute, goes so far as to share Yud’s concern that advanced AI may eventually want to kill humans. But he has hope that the proper systems will be put in place in time to prevent this.
And then there is Sam Altman, CEO of OpenAI – the creator of ChatGPT – who was himself prodded on the issue by Lex Fridman on his podcast a month ago.
“The positive trajectories of AI, that world, is with an AI that's aligned with humans and doesn't...try to get rid of humans... [People like] Yudkowsky warns that...it’s almost impossible to keep AI aligned as it becomes superintelligent,” Fridman asked. “To what extent do you disagree with that trajectory?”
“Well, first of all I will say that there is some chance of that,” Altman answered. “And it’s really important to acknowledge it because if we don’t talk about it, we don’t treat it as potentially real, we won’t put enough effort into solving it.”
And then he added, “Continuing to learn what we learn from how the technology trajectory goes is quite important. I think now is a very good time – and we’re trying to figure out how to do this – to significantly ramp up technical alignment work.”
They’re … guys, they’re trying to figure out how to do this.
The genie is out of the bottle now, isn’t it?