The existential risks of AI

Earlier this year, several doctors and public health experts stated that AI could harm the health of millions and could pose an existential threat to humanity. They called for a halt to the development of artificial general intelligence until it is properly regulated. Even Geoffrey Hinton, ‘the godfather of AI’, has warned the world about the potential dangers of his own creation.

So what are the most important and potentially gravest existential risks of advanced and poorly regulated AI technology? And how can we control and mitigate these risks? Read on to find out.


what are existential risks?

Before we delve a little deeper into the existential risks of AI, we should define what exactly an existential risk or threat is. The term existential risk refers to scenarios that have the potential to bring about the extinction of the human race or the collapse of global civilization. They are distinguished from other global catastrophes by their scope (they affect the entirety of humanity and all future generations) and severity (they cause death, destruction or a reduction in the quality of life on an almost unprecedented scale). Prime examples of existential threats are asteroid impacts (such as the one that wiped out the dinosaurs 66 million years ago), a global nuclear war, pandemics, or the dangerous cocktail of climate change and global biodiversity loss.

existential threat number one

Research by Toby Ord and other existential risk analysts, including the Amsterdam-based Existential Risk Observatory, suggests that new technology, and especially unaligned AI that equals or exceeds human cognitive capabilities, is the biggest threat to humanity’s survival and long-term potential. It might take AI a while to wrap its tentacles around a particular skill, but once it succeeds in doing so, it can rapidly outpace us.

To quote Elon Musk: “AI doesn’t have to be evil to destroy humanity. If AI has a goal and humanity happens to be in the way, it will destroy humanity as a matter of course, even without thinking about it. No hard feelings. It’s like when you are building a road and an anthill happens to be in the way: although we don’t hate ants, we’re building a road. So it’s goodbye anthill.”

where are we now?

Current AI research is geared towards the creation of general-purpose AI, also known as artificial general intelligence (AGI). This is the type of artificial intelligence that is capable of quickly learning high-quality behavior in “any” task environment. If we succeed, the technology could lift the living standards of everyone on Earth to a respectable level and spur advances in health, education and science.

Although we are making great strides towards the creation of real general-purpose AI, computer scientist Stuart Russell thinks we are not quite there yet. For example, ChatGPT doesn’t really “know” things in the human sense of the word. According to Russell, present-day AI still lacks:

  • A real understanding of language and the world
  • The integration of learning with knowledge
  • Long-range thinking at multiple levels of abstraction (from milliseconds to years)
  • The cumulative discovery of concepts and theories

the problem of misalignment

Although AI still lacks some important features that are at the core of human intelligence, experts in the field expect that AI will at some point develop the cognitive attributes it currently lacks. The problem is that it is nearly impossible for us to pinpoint exactly when and how these breakthroughs will take place, or which unforeseen consequences the rise of general-purpose AI will bring with it.

Alan Turing (1912-1954), generally seen as the father of artificial intelligence, painted quite a bleak picture of our ability to control the advance of AI: “It seems probable that once the machine thinking method has started, it would not take long to outstrip our feeble powers. At some stage therefore we should have to expect the machines to take control.”

The main problem in the relationship between humans and AI is misalignment. With incompletely or incorrectly defined objectives, better AI produces worse outcomes. Social media is a perfect example of this principle. The algorithms behind social media platforms were given the objective of maximizing engagement and clicks, which culminated in a commercial arms race that modified people to become more predictable. It also created a lot of unintended problems such as information overload, addiction, shortened attention spans, doomscrolling, an often toxic influencer culture, the sexualization of kids, QAnon, polarization, fake news, and deepfakes.
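To make the principle "better AI produces worse outcomes" a bit more tangible, here is a minimal Python sketch. It is a toy model, not a description of any real recommender system: the functions engagement and wellbeing, the quadratic harm term and all numbers are assumptions made up for illustration. The optimizer is only told about engagement, and a larger search range stands in for a more capable system.

```python
# Toy illustration of a mis-specified objective.
# All names and numbers are hypothetical.

def engagement(intensity: float) -> float:
    """Proxy objective the system is told to maximize."""
    return 10.0 * intensity

def wellbeing(intensity: float) -> float:
    """What we actually care about: the same benefit as the proxy,
    minus a harm term the proxy leaves out (assumed quadratic)."""
    return 10.0 * intensity - 4.0 * intensity ** 2

def optimize(objective, max_intensity: float, steps: int = 100) -> float:
    """Grid search over [0, max_intensity]. A larger range stands in
    for a more capable optimizer."""
    candidates = [max_intensity * i / steps for i in range(steps + 1)]
    return max(candidates, key=objective)

for capability in (1.0, 2.0, 4.0):
    choice = optimize(engagement, capability)
    print(f"capability={capability}: "
          f"engagement={engagement(choice):.1f}, "
          f"wellbeing={wellbeing(choice):.1f}")
```

In this toy setup, each increase in capability pushes engagement up and wellbeing down, because the part of the objective that was left out is exactly the part that gets sacrificed.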

An additional problem is that we don’t really know if advanced AI systems like GPT-4 and other large language models are actively pursuing goals. This is especially dangerous for one particular reason. When humans pursue a goal, we usually don’t pursue it to the exclusion of everything else. To give an example: we know that we want to mitigate climate change, but we realize that removing all human beings, who cause and accelerate the problem, is not the right way to go about it, because we also value being alive and (at least most of us) would have ethical concerns about killing our fellow humans. An AI system might not have such reservations and could decide that wiping out the human race is the most effective way to solve the global climate and biodiversity crisis.

how do we retain power over entities more powerful than us?

So how do we retain power over entities that are bound to become more powerful than us? What we really want are machines that are beneficial to the extent that their actions can be expected to achieve our objectives. Avoiding the mis-specification of objectives means designing machines that act in the best interests of humans and are explicitly uncertain about what those interests are.

This also means that AI systems should exhibit minimally invasive behavior and a willingness to be turned off if humans deem this necessary. Systems built around a fixed objective pull in the opposite direction. A sufficiently capable AI system that is pursuing a fixed objective has an incentive to prevent itself from being switched off, because being switched off would mean failing in its objective.
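The link between uncertainty and a willingness to be switched off can be sketched with a small expected-value calculation. The Python snippet below is a toy model under strong assumptions: the machine has a belief over how valuable its planned action is to the human, being switched off is worth 0, and the human knows the true value and only switches the machine off when the action would be harmful. The function names and numbers are hypothetical.

```python
# Toy model of deferring to a human with an off switch.
# Assumption: the human knows the true value of the action and
# switches the machine off exactly when that value is negative.

def expected(values, probs):
    """Expected value of a discrete belief."""
    return sum(v * p for v, p in zip(values, probs))

def compare_options(utilities, probs):
    """Expected payoff of three options open to the machine.

    utilities: possible values (to the human) of the planned action
    probs:     the machine's belief over those values
    """
    act_anyway = expected(utilities, probs)   # ignore the human
    switch_off = 0.0                          # do nothing at all
    # Defer: harmful outcomes are replaced by 0, because the human
    # presses the off switch in those cases.
    defer = expected([max(u, 0.0) for u in utilities], probs)
    return {"act_anyway": act_anyway, "switch_off": switch_off, "defer": defer}

# Uncertain machine: 40% chance the action is harmful, 60% chance it helps.
uncertain = compare_options([-10.0, 5.0], [0.4, 0.6])

# Machine that is certain its objective is right.
certain = compare_options([5.0], [1.0])

print(uncertain)
print(certain)
```

Under these assumptions the uncertain machine does best by leaving the off switch in human hands, while the machine that is certain about its objective gains nothing from deferring. That is the intuition behind wanting machines that are explicitly uncertain about what humans want.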

Therefore, future AI systems should adhere to the following principles:

  • They should be well-founded. This means that they are semantically well-defined and consist of individually checkable components.
  • There should be a rigorous theory of composition for complex agent architectures. This means that developers and users should have a clear understanding of how AI systems work, and that proofs of safety should be built in.
  • The digital ecosystem should be certified. The existing model adheres to the principle “everything runs unless known to be unsafe.” But the chief guiding principle should really be “nothing runs unless known to be safe.”
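
The difference between the two guiding principles in the last point can be shown in a few lines of Python. This is only a sketch: the program names and the two sets are hypothetical, and real certification would involve far more than a lookup in a list.

```python
# Hypothetical lists, for illustration only.
KNOWN_UNSAFE = {"malware.bin"}
CERTIFIED_SAFE = {"verified_planner"}

def may_run_default_allow(program: str) -> bool:
    """'Everything runs unless known to be unsafe.'"""
    return program not in KNOWN_UNSAFE

def may_run_default_deny(program: str) -> bool:
    """'Nothing runs unless known to be safe.'"""
    return program in CERTIFIED_SAFE

for program in ("verified_planner", "malware.bin", "untested_agent"):
    print(f"{program}: "
          f"default-allow={may_run_default_allow(program)}, "
          f"default-deny={may_run_default_deny(program)}")
```

The two policies only disagree on the program that nobody has examined yet. Under the first principle it runs, under the second it does not, and unexamined systems are exactly where the risk sits.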

want to know more?

The existential threat of AI is a hot topic these days. If you want to know more, the Existential Risk Observatory has a recording of a debate held in April this year.