14 min read

The Risks of Transformative AI with... Children of Time

Disclaimer: The 'reflections on' series is better appreciated if you've read the source material, though it isn't necessary. You can find the book here or here, the audiobook here, and a summary here. For Children of Time, I recommend the audiobook, played at the fastest speed that you can follow...it's very long.

"Life is not perfect, individuals will always be flawed, but empathy – the sheer inability to see those around them as anything other than people too – conquers all, in the end."
Adrian Tchaikovsky, Children of Time

Our ideal future vision is conditional on the creation of transformational Artificial Intelligence (AI). We've seen how the technology can bring about vast improvements in society and enable our vision, but haven't yet discussed the risks inherent in the technology. And there are huge risks. Far from the fictional AI-dystopias imagined in Ex Machina or The Matrix, we are now seeing such headlines as The case for taking AI seriously as a threat to humanity and AI experts are increasingly afraid of what they’re creating. While the level of risk is very much debated, it is fair to say that to ensure we achieve our ideal future vision we need to better understand the risk from transformative AI and govern its development to ensure not only that the risk is avoided, but that we steer our path to our eutopian ideal.

I introduce the first of these concepts - understanding the risk of transformative AI - through what I believe is one of the best science fiction novels to provide a realistic explanation of how it could all go wrong: Children of Time (CoT) by Adrian Tchaikovsky. If you've read the book you might be confused - "CoT is a book about nightmare space-spiders, not AI" you might exclaim, and you'd be right. However it is also a fable about the perils of AI, and how poor governance and alignment of the technology could end human society as we know it. CoT might be 1,000x longer and more detailed than any of Aesop's fables, but they're distant cousins.

Here I will attempt to explain the largest risks associated with transformative AI through the metaphor that Tchaikovsky uses in Children of Time. I will try to make it as accessible as possible, but if you are starting from zero - say you aren't entirely comfortable with the concept of artificial intelligence - I highly recommend reading this amazing primer from the Wait But Why blog first.

We could develop an intelligence superior to humans

Children of Time opens with Dr Avrana Kern (Kern) and her crew attempting to enable human expansion from a collapsing Earth by terraforming the planet they are in orbit around using monkeys injected with a nano-virus used to accelerate their evolution, allowing them to be intelligent yet slaves to the human crew. The virus works, but on spiders instead of monkeys. They do evolve - rapidly - but do not stop and eventually form a collective intelligence superior to that of humans.

To deconstruct the metaphor; the virus-boosted spiders - simply 'spiders' hereafter - are the artificial intelligence. While the definition of artificial intelligence is debated, it is a fair description to use as the spiders are a human creation that develop capabilities equal and greater to that of humans. For example, they have the ability to hold information indefinitely with only a constraint on their total memory capacity (meaning old memory must be wiped to make space for new information) which mimics that of computer hardware and memory and highlights a distinct superior capability over 'forgetful' human brains. While the spiders are 'biological', there are many parallels between their organic or chemical composure and the technical composure of our computers and AI systems, and it could well be true that organic material is superior than silicon and metal for building supercomputers - our brains are organic after all!

Having a superior intelligence to humanity, also known as being a 'superintelligence', isn't inherently a bad thing, but as we'll see it magnifies the risks significantly.

It could have goals that are different from our own, and destroy humanity

Kern plays the AI 'creator' role - the decision-maker for the development team that creates transformational AI, which we will see is a pivotal role in AI risk management. She initially sets the AI to have the goal of serving humans - back when the intention was to have intelligent virus-monkey slaves - but the spiders don't seem aware of this goal as they don't even have a concept of 'humanity' for a long time. They instead spend most of the story seeking their purpose, and gaining power and resources to progress their investigation into purpose at an ever-faster rate. Once the spiders develop the capability to communicate with Kern they seem to remember their initial programming and follow her commands.

While this doesn't occur in the book, it isn't difficult to imagine a world where AI develops its own goals, and those goals pose an existential threat to humanity. Imagine that Kern isn't able to reconnect and communicate with the spiders, perhaps her space capsule eventually breaks, not an unrealistic possibility. The spiders are left not knowing their ultimate goal and stay stuck with the objective: determine our ultimate goal. They determine that learning, science and development get them closer to this objective and that it takes resources to fuel this development, in order to grow their population - a driver of their collective intelligence, akin to, say, the number of transistors in a supercomputer chip - and as an input for their research. They consume ever more resources in becoming ever more intelligent, yet as it is impossible for them to learn their ultimate goal - Dr Kern is no longer available - they simply don't stop consuming resources. They develop space travel and the technology needed to harvest resources in their galaxy and beyond, with no care for the impact on any other life form. Eventually our solar system and humankind become collateral damage in their never-ending striving for an unsolvable objective. The only good that comes of this is my possible move into Children of Time fan fiction...

This may seem farfetched, but it is a scenario considered possible by a large portion of the AI research community and is the driving fear behind the push for AI alignment - ensuring the a transformative AI has goals that are aligned with those of its creator. This example shows how badly things can go if we get that completely wrong, now we'll see that things can go just as badly even if we get it just *slightly* wrong.

AI could misinterpret the goals that we set them, and destroy humanity

Returning to the actual story, while the spider artificial intelligence is taking thousands of years to develop and Kern is floating above that world in stasis in her space capsule, human society collapses on an ecologically failed earth. The last remaining humans escape on the spaceship Gilgamesh and are left floating in the cosmos in search of a new home planet. 'Kern's world' - the planet earmarked for terra-forming - is unsurprisingly the best candidate given it was the outcome of a scan of the  possible life-sustaining planets closest to earth. They reach it to find it is defended both by Kern in her high-tech space capsule and a spider colony growing in intelligence and with the capabilities to defend themselves. Not a great prospect for the last remaining humans - and I won't yet get to how it plays out - but it's clear how it would be a different experience entirely if the AI were 'aligned' with the goals of humanity. One could imagine the spiders using their communication technology to inform the Gilgamesh that it is safe for their arrival, possibly giving them an ideal approach and landing coordinates. The spiders could learn about the last remaining humans and go about preparing the world to make it as safe and comfortable for them as possible. Even with their advancing intelligence it would be easy for them to live symbiotically. Sadly that is not what happens...

We return to the speculation of fan fiction first to explore the risk of misinterpreted goals. Imagine that the specific objective that Kern programmed into the virus was simply ensure the survival of the human race. Maybe they took the thoughtful position that they didn't know how best the monkey-slaves could serve them and wanted to avoid the humanity-destroying risk of the previous example. Seemingly innocuous. Sadly, the spiders have little room for nuance and don't share the same mental model of the world as their human creators. They break this objective down to pure logic, the ones and zeros needed to identify whether actions progressed towards achieving the objective or not. Humans would likely understand the concept of the survival of the human race as continued procreation for the continuation of the species. The spiders could take it to mean the state of there being at least one living human being in the universe. It also leaves open questions of how they might define living or even human being. All of which is dangerous.

The spiders quickly identify that it is much easier to focus their resources on keeping one human being alive than it is to manage a whole society - especially when the humans on the Gilgamesh appear to be hostile. They realise they have one non-hostile human being in Kern, and judge that while her biological form may no longer be effective, the upload of her mind into her space capsule's computer still represents her, and so they focus on keeping that system alive no matter what. They quickly resolve to destroy the Gilgamesh and the last remaining humans and keep Kern's capsule and mind-upload-system functioning, building their entire society around protecting it and taking the risk of failure to ever greater orders of magnitude probability closer to zero. They need resources to do this and eventually harvest the universe to ensure they continue to achieve their objective to ensure the survival of the human race.

This may seem like a laughable example and yet the go-to analogy used in transformative AI risk research is that of the paperclip maximiser first described in 2003 - an AI that converts the entire universe into paperclips, in clear misalignment with human desires. These examples are thought experiments and are just intended to highlight the risks of developing a transformational technology without first ensuring that it is safe. Though as we now see, even ensuring AI is aligned to human goals and deemed 'safe' isn't enough to ensure a good outcome for humanity.

Even if the AI is aligned, its creator could use the technology against the good of humanity

Thankfully/depressingly we can return to the actual story of Children of Time as this is the failure mode that we do observe. It is also the one that I find to be most neglected in transformative AI risk research so I'm glad Tchaikovsky used this mode in his fable.

Unsurprisingly, Dr. Kern's plan to develop an intelligent slave-species to terraform a planet wasn't universally loved back home on earth. There is broad disagreement on what is right and ethical standards appear to have slipped in the face of an ecologically collapsing planet and its impending existential threat. Kern believes she is acting out of the greater good and that humanity will eventually thank her for taking decisive action to save humanity. A kind interpretation would say that Kern doesn't act on behalf of the human race as a whole. Another might say her motivations echo those of any number of fictional and non-fictional villainous despots. Kern develops an AI seemingly with the objective to serve her wishes, not those of humanity as a whole.

Most interpretations would say that the AI is aligned; for a while the spiders seek their objective, knowing instinctively that it relates to the messenger (the spiders name for Kern before they understand what and who she is) and once they are able to communicate with Kern act in accordance with her wishes. It shows how the wishes can be the problem. Kern - albeit a slightly mad Kern after hibernating in a space capsule for thousands of years - prioritises her creation over the last remaining humans, updating their objective to save themselves and destroy the Gilgamesh if necessary to achieve that aim. This is effectively a direct order to eliminate the human race and is clearly not acting for the good of humanity. No matter how low ethical standards get, the survival of humanity always comes first. This may seem like an impossible case, but there are those in transformative AI risk research who believe that a greater-than-human AI would have greater moral status and would justify eliminating humanity if it were needed for the AI to grow and thrive.

That said it is still an extreme case that a human would ever give such orders. I believe the actual outcome of the book to be more realistic and just as worrisome. The spiders find a way to keep the remaining humans alive and keep them as a subservient species. The humans are inferior yet useful to the spiders and so they are kept for as long as they stay useful, treating them the same as they treat the ants within their spider society. Kern meanwhile is aligned with the spiders; they are now sufficiently intelligent that she is no longer in control of them, but she is very much on their side and not that of the last remaining humans. There is only one of Kern, but the notion of a two-tiered society is there.

In my view, the most realistic threat to humanity of 'safe' AI is from the development of a two-tiered society in which those aligned with the AI have sufficient power to maintain the structure. The 'superior class' could use the AI to develop autonomous surveillance drones for each individual to ensure there could be no organisation and potential for fighting against the power structure. Transformational AI could mean an end to scarcity and ensuring that the 'inferior class' could have a higher standard of living than 80% of the world currently experiences, yet they would have no agency and the 'superior class' could still have a standard of living multiple orders of magnitude higher yet. This is a dangerous path indeed and one that the incentives of capitalism only encourage.

This is a winner-takes-all game. The first person to develop transformative AI will accrue untold power. The second person to get there will not - there may not even be a second person as the first person may make it impossible for anyone to replicate the technology. This incentivises individuals who are attracted to power, wealth, and status, and those who will disregard alignment and safety research if it may slow down their progress. These factors increase the likelihood that whoever develops the technology will primarily use it for their own ends rather than for the social good.

The risk is that the development of transformative AI will concentrate previously unseen levels of power in the hands of very few and we currently don't have the governance and enforcement tools necessary to prevent this from being the case. We'll now explore some aspects of transformative AI development that make this governance problem all the more difficult.

We have limited time to develop effective governance of AI development, and we could get it very wrong

There is a concept of 'crunch time' in transformative AI governance; an undefined period of time where individual decisions - and the decisions of individuals - will become extremely high leverage (read: of outsized importance). CoT opens with what is effectively a fight over the deployment of transformative AI that occurs at the very moment of hitting the green-light. There are two competing philosophies and factions on earth - one who believes we should push forward - another who believes we should hit pause until we better understand capabilities. The infighting causes a release that is far from which either side would have hoped for, and once AI is out of the box, most believe it impossible to put it back in.

This highlights the importance of gaining alignment and cooperation within humanity before anybody hits 'the big red button' of deployment. If we don't, 'crunch-time' will become increasingly high-stakes, and therefore high-leverage of individual decisions, at an exponential rate. You could imagine that in the hour before the deployment of transformative AI, it would become morally justifiable to a global power to fire nuclear weapons at those deploying the technology if they have sufficient conviction that the current state of AI alignment and governance is insufficient; that there's a high risk of existential threat. That - I shouldn't have to say - would be a very bad outcome indeed. Particularly if it doesn't even prevent the deployment.

Without explainability, we won't know what capability level the AI has to know how long we have

The story paints a pivotal turning point - though it isn't known to be so significant at the time - when the spiders' decide to work 'with' the ants rather than destroy them in their 'great war'. I say 'with' as the spiders realise they are of higher intelligence and that the ants have unique capabilities and can play a role in spider society. It's a precursor for the final battle with the humans where the spiders let them live so they might work 'with' them in their society.

The crucial question here is whether the nano-virus steered the spiders to this 'empathetic' outcome, or whether the spiders themselves decided on this strategy based on their inherent goal-seeking behaviour, and in doing so trained or refined the nano-virus to take on this empathy orientation. Was this the inevitable outcome given the AI had this empathetic line of code? Was it was simply a matter of time before their self-awareness caught up with their programmed orientation? Or was the result of trial-and-error neural network development where the AI 'learns' that this strategy of cooperation is superior for achieving its objective?

This is a vitally important question being debated in the field of AI alignment: do we need to perfect the objective function of the AI - the goal it is driven to achieve, agnostic of strategy - or can we design the code to permanently bias the AI toward a humanity-serving course of action, regardless of its goal? This is a deep, dark rabbit hole, but if you're curious to dive down it I point you to this analysis of the debate on the Astral Codex Ten blog.

We will cooperate with a superintelligence at best in the way that chimps cooperate with us

It is debatable, but I interpret the story as humanity becoming a subservient species to the spiders. The individual human beings may individually be more intelligent than an individual spider, but it is nothing compared to the shared intelligence of the spiders - the true superintelligence. The spiders show a strategic ability that strongly suggests that they sufficiently control the environment such that the humans exist under the soft control of the spiders. This is much the same as the relationship between spider and ant described through most of the book.

The intelligence gap between human and spiders is most clearly demonstrated in the final showdown. The humans get stuck in a prisoners dilemma while the spider superintelligence changes the game such that they were in control of the outcome before it even began. Even a student of game-theory understands that the best strategy for playing a game with a non-optimal equilibrium (outcome) is to change the nature of the game. The spiders allow the humans to exist much like how as humans feel in control of the fate of all other species on the planet, and I shouldn't need to remind you how humanity have made so many extinct when it become more profitable to do so than keep them alive. We'd be foolish to blindly believe a superior intelligence would treat us any differently.

It is impossible to know what values a superintelligence will hold, if any

All of this begs the largest question of all: what does it mean to be human in a world where we are not the greatest intelligence? Humanity is very used to being the most intelligent being in its sphere of existence. With it seeming near-inevitable that we will be able to create an intelligence greater than our own, how should we prepare to be second best? There are many threads this question leads down, let me pose one.

How should we adjust our (humanity's) relationship with beings less intelligent than ourselves in the present to reflect how we would want an intelligence greater than us to treat us in the future? It is unclear in CoT whether the spiders keep humanity alive out of a belief it is the 'right' thing to do, or because they possess unique abilities or comparative advantages. Should we do more to support broad animal welfare as a society, or should we value non-human animals purely by what they offer us? The former seems like the smarter strategy to me...

Without knowing what future we want, it is very unclear who the good and bad actors are

The author does an incredible job of not painting the story in a positive or negative light. I am sure that this story will split readers evenly between who they think the good or bad actors are, and in fact the majority would say they don't see anyone strongly falling into either camp. I believe this is largely because the goals are unclear. Humans are good at determining whether something or someone has moved us closer or further way from a desired state, and can label them 'good' or 'bad' accordingly. For example, a thief moves us away from the desired state of 'I have all of my valuable possessions', and is therefore bad. Here, the 'desired state' for Lain  and the last remaining humans is 'survival of the species'. Dr Kern and the spiders are ultimately not in opposition to this desired state, and the human species does indeed survive to co-exist with the spiders. The relevant question is in what form is this survival desirable, and the fact that humanity has not determined the outcome to this question leaves us pondering whether the spiders - in radically altering the humans with the nano-virus and keeping them as a subservient species - are 'good' or 'bad'. We need alignment on these philosophical questions before we reach such a 'crunch time' to inform the decisions we make.

Children of Time is maybe the most brilliant foreboding-yet-realistic story of how transformative AI development might play out. It is unlikely to occur in outer space and over the timeframes set out in the book, but all of the other phenomena are likely to be there. As with all fables, the specifics of the story are of little importance compared to the lessons it teaches. We need to start learning these lessons as soon as possible. The stakes are too high.

Please share your thoughts if you have any feedback on this article, or leave a comment below.