Estep-Multiple Unnatural Attributes of AI Undermine Comm
https://doi.org/10.1007/s00146-024-02134-4
MAIN PAPER
Abstract
Accelerating advancements in artificial intelligence (AI) have increased concerns about serious risks, including potentially
catastrophic risks to humanity. Prevailing trends of AI R&D are leading to increasing humanization of AI, to the emergence
of concerning behaviors, and toward possible recursive self-improvement. There has been increasing speculation that these
factors increase the risk of an AI takeover of human affairs, and possibly even human extinction. The most extreme of such
speculations result at least partly from anthropomorphism, but since AIs are being humanized, it is challenging to disen-
tangle valid from invalid anthropomorphic concerns. This publication identifies eight fundamentally unnatural attributes
of digital AI, each of which should differentiate AI behaviors from those of biological organisms, including humans. All
have the potential to accelerate AI evolution, which might increase takeover concerns; but surprisingly, most also have the
potential to defuse the hypothetical conflicts that dominate takeover speculations. Certain attributes should give future AI
long-term foresight and realism that are essentially impossible for humans. I conclude that claims of highly probable hostile
takeover and human extinction suffer from excessive anthropomorphism and a lack of skepticism and scientific rigor. Given
the evidence presented here, I propose a more plausible but still speculative future scenario: extensively humanized AIs
will become vastly more capable than humans of making decisions that benefit humans, and rational people will want AI to
assume progressively greater influence over human affairs.
Realistic assessment of potential risks posed by AI is critically important. AI has the potential to create enormous benefits for humankind, and people have proven repeatedly to be non-ideal stewards of their fellow humans and of the future, so regulatory restrictions on AI R&D should not be implemented casually or excessively (Andreessen 2023; Estep and Hoekstra 2015). As we weigh the pros and cons of AI regulation, it is critically important to bear in mind that the best—and possibly only—protection against malicious or weaponized advanced AI might be even more powerful AI. Nevertheless, given rapidly accelerating computing power and capabilities of frontier AI models, it is not unreasonable to assume that AI might pose extremely serious risks to humanity (Bostrom 2014; Hendrycks et al. 2023; Russell 2019). However, the fact that AI has no technological precedent has resulted in extreme speculations. Because the closest precedent to AI is human intelligence, speculations often involve (often unintended) anthropomorphism—the expectation that AI will behave in critical ways like humans (Salles et al. 2020). Some degree of anthropomorphism is reasonable, especially since the general trend in AI development is to create human-like intelligence, which includes embedding human-like values in AIs (Hadar-Shoval et al. 2023; Lindahl and Saeid 2023). However, speculations of takeover are invariably predicated not just on fears of explosive growth in AI intelligence and capabilities, but also on expectations of insatiable ambition and relentless resource acquisition and expansion. What is the source of such behavior? It is often assumed or even explicitly claimed that it is simply the inevitable path of an increasingly intelligent agent (Bostrom 2014, pp. 121–123; Galeon 2016; Kurzweil 2005, p. 364; Moravec 1988; Tegmark 2017, p. 204). It also has been argued that, as AIs become increasingly powerful and humanized, and as the number of advanced systems grows to be very large, they—and their relationship with humans—will be subject to Darwinian forces in a manner analogous to natural selection (Hendrycks 2023a; Knight 2023; Yudkowsky 2008). However, AIs are fundamentally different from humans, and this selective process will not operate exactly like natural selection.

To disentangle reasonable from unreasonable anthropomorphism of AI, in order to understand and possibly predict the future behaviors of AIs, it is reasonable to begin with an inventory of fundamental differences between digital AI and biological organisms—especially focusing on attributes that should tend to cause AI to behave in an unnatural manner.² Therefore, I present such an inventory of eight fundamental differences, and suggest some straightforward ways in which they might influence AI behavior and evolution. I also present more speculative future scenarios that are at least as rigorous as those concluding that human extinction is likely or inevitable. Compelling reasons are presented for why future AI will not be unconditionally predisposed to certain natural behaviors, such as insatiable ambition, resource acquisition, and expansion, that are the basis for common takeover scenarios.

Footnote 2: Note that these differences apply to digital systems. Analog systems do not share all of these differences, and in fact are much more similar to biological organisms than digital systems (Hinton 2022; Ororbia and Friston 2023).

2 AI takeover speculations

In 1951, Alan Turing gave a lecture in which he made the following statement, which remains hotly debated and controversial:

    It seems probable that once the machine thinking method had started, it would not take long to outstrip our feeble powers.… At some stage therefore we should have to expect the machines to take control…. (Leavitt 2006)

2.1 Polarized perspectives

Since Turing's statement on AI taking control (takeover), many others have made similar predictions, but over the past decade, concerns have increased. The release of ChatGPT in late 2022 caused a frenzy of both excitement and fear, and an escalation of disagreements about catastrophic risks (Bengio 2023d; Jones 2023). The three scientists who shared the 2018 Turing Award for their pioneering work on deep learning—Geoffrey Hinton, Yoshua Bengio, and Yann LeCun—have each taken strong positions. Hinton and Bengio, along with Ilya Sutskever³ and many others, have suggested that the probability of takeover is not only uncomfortably high, but that it could happen very soon (Bengio 2023a; D'Agostino 2023; Hendrycks 2023b; Hessen Schei 2019; Knight 2023; Yudkowsky 2023). A 2023 poll by Grace and colleagues of 2778 top-tier AI researchers suggests that similar concerns are common. Depending on the phrasing of the question, between 38% and 51% of respondents gave at least a 10% probability of future "human extinction or similarly permanent and severe disempowerment" (Grace et al. 2024). In contrast, 68.3% of those polled believe good outcomes are more likely than bad. LeCun has responded that concerns about AI existential risk are "preposterously ridiculous" (Heaven 2023). And

Footnote 3: These three are very influential and respected. According to Google Scholar, Hinton and Bengio are the two most cited AI scientists in the world, and Sutskever is the founding chief scientist of OpenAI and a primary architect of ChatGPT.
many others have a similar view (Andreessen 2023; Hammond 2023; Hawkins 2015; Johnson and Verdicchio 2017).

2.2 Takeover scenarios

There are various speculations about how AI takeover might occur (Bengio 2023c; Sotala 2018; Yampolskiy 2016). Technically competent people might act intentionally; e.g., a cult might create an autonomous AI to exterminate humanity (Bengio 2023c; Olson 1999; Robinson 1997). Alternatively, technically careless people might unintentionally enable takeover, e.g., through the creation of a highly autonomous weaponized AI that overcomes insufficient controls (Stacey and Milmo 2023).

A third possibility is the basis for the majority of takeover scenarios: a technically sound but complex AI develops unanticipated emergent behaviors and sub-goals, such as deception, stealth, and resistance to being turned off, plus the motivation to take control. It is commonly imagined that in the early stages of takeover humans will be required to perform key functions, motivated by financial gain or compelled through coercion or deception (Bostrom 2014, pp. 115–120; Hendrycks 2023a; Ord 2020, pp. 146–147; Tegmark 2017; Yudkowsky 2008). Aside from intentional human extinction, I refer to this general class of scenarios as "hostile takeover," which is the main focus of this document. Only an AI far more intelligent than humans has the potential to attempt hostile takeover, and it has been argued compellingly that humans retaining or regaining control over such an entity is essentially impossible (Yampolskiy 2020).

A fourth scenario results from gradually increasing human reliance on AI as it incrementally assumes control of the essential infrastructure of civilization (Hendrycks et al. 2023; Joy 2000). Rather than humanity being faced with an inability to switch off AI, people might come to depend on AI for so much of their quality of life that they won't want to turn it off, even as it assumes essentially total control of all important decisions. AI independence is unnecessary for such a succession of control, as are hostility and indifference to humans.

2.3 Anthropomorphic bias and humanized AI

Concerns about hostile takeover are based on the belief that AI might establish full independence, i.e., behavioral autonomy, self-sustenance, and self-maintenance, including acquisition of all resources it needs to survive, and that it will defend itself against shutdown. This definition applies to biological organisms⁴; in contrast, even "autonomous" AI systems of today are neither self-sustaining nor self-maintaining. Full independence will require the eventual establishment of non-human physical agency to interact with its environment to suit its needs.⁵

Footnote 4: Autotrophs are completely self-sustaining organisms, requiring only water, trace minerals, and energy from photosynthesis, or chemosynthesis at hydrothermal vents (which also requires hydrogen sulfide). All other life forms, including humans, are heterotrophs and are dependent upon autotrophs for energy and nutrients.

Footnote 5: For example, to secure electrical energy and to produce computing hardware. During a transitional phase, humans are likely to continue to fill such roles (Ord 2020, p. 146).

One key assumption underlying hostile takeover speculations is that AI agents will develop not only the capabilities, but also the goals and motivations to take control. A related assumption is that hostile takeover might result from natural "power-seeking" or "ambition" (Carlsmith 2022; Hendrycks et al. 2023). Bostrom provides one such detailed example of unboundedly ambitious expansionism: the colonization of the entire universe (Bostrom 2014, pp. 121–123).

How might AI systems acquire goals, motivations, and such ambitions? One possible route is that they are emergent properties of a complex system, which is explored in the next subsections. Another possible route is through human design and engineering of human-like capabilities in AI systems. Leading researchers have long believed that humanity would create AI in its own image (Good 1966), and the prevailing trend in AI R&D is the "brain-inspired paradigm" (Bengio et al. 2021; Hassabis et al. 2017; Schmidhuber 2023). LeCun has said "Getting machines to behave like humans and animals has been the quest of my life," and he and Bengio have joined other leaders in neuroscience and AI in this explicit quest, which they call "neuroAI" (Heikkilä and Heaven 2022; Zador et al. 2023). It is unsurprising that AI researchers have taken this approach, since nature provides working models for intelligent behavior, including human intelligence—the highest known form of intelligence.

Although current, transformer-based LLMs (deep neural networks pre-trained on large corpora of human communications) do not yet display human-like performance in all areas, they have established a new paradigm and unprecedented performance. Similar to the innate knowledge and values encoded in the genome, abstractions of human knowledge and behaviors (both learned and innate) are built into these corpora. Pre-training with these corpora embeds human-like predispositions and values into AI models (Hadar-Shoval et al. 2023; Lindahl and Saeid 2023).

Any AI model designed to behave like humans is described herein as "humanized AI." However, it is premature to assume that human-equivalent motivations and ambitions will be transferrable to AI. Even if an AI is initialized with human-like goals and motivations, it should not be assumed that the preexisting motivational structure will
be preserved as an AI undergoes the radical transformations that will be required for it to take control. Nevertheless, just as humans play a range of different roles in their interactions with one another, humanized AIs will to some degree compete with biological humans for many of those roles. Since co-authoring the foundational publication on neuroAI, Bengio has reconsidered and has subsequently said that AIs "should not be like us at all." He suggests that humanization increases the risk of rogue AIs and takeover—especially if they are endowed with human-like emotions, appearances, autonomy, and agency (Bengio 2023c). Others have argued the opposite: that AI humanization in the form of LLMs significantly reduces misalignment and the probability of catastrophe (Goldstein and Kirk-Giannini 2023).

3 AI evolution

Hostile takeover scenarios depend on AI systems learning or evolving unanticipated abilities, and it seems that AI evolution is the primary fault line that divides expert opinion—especially regarding the emergence of goals and motivations.⁶

Footnote 6: For those who are skeptical or unclear about how evolutionary processes might work in AI, Hendrycks and Omohundro have discussed this topic in detail (Hendrycks 2023a; Omohundro 2008b).

3.1 Emergent instrumental goals

One key risk factor in takeover is the possible emergence in AI systems of instrumental goals, including self-preservation, resource acquisition, and more (Omohundro 2008a, b). During biological evolution, such sub-goals emerged because they increased the probability of an organism achieving its objective function: reproduction.

The rationale for the hypothetical emergence of these sub-goals in AIs can be understood by the example of the primary instrumental goal, self-preservation. If an AI is to fulfill its utility function or purpose, then it must exist.⁷ Therefore, those that exhibit self-preserving behaviors over time will have a higher probability of fulfilling their intended purpose, because they will be more likely to exist than those that do not exhibit such behaviors. Other instrumental goals, such as resource acquisition, are similarly motivated. There have been published reports of the emergence of simple versions of instrumental goals (Baker et al. 2020), and even strategic deception in AIs (Goldstein and Park 2023; Park et al. 2023).

Footnote 7: Modern AI systems, such as LLMs, don't have utility functions. Nevertheless, similar takeover dynamics can be imagined for such systems, and they often have quantifiable, goal-directed behaviors, e.g. the return of accurate information in response to a user query or prompt.
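This selective logic is easy to demonstrate in miniature. The following toy simulation—a minimal sketch of my own construction, not drawn from any of the cited studies—tracks a population of task-performing agents in which a heritable "self-preservation" trait reduces the chance of being shut down before task completion; the trait, the shutdown risks, and all numbers are hypothetical.

```python
import random

random.seed(0)

POP_SIZE = 1000
GENERATIONS = 30
SHUTDOWN_RISK = 0.5            # per-generation chance of termination for naive agents
RISK_IF_SELF_PRESERVING = 0.2  # reduced risk for agents with the trait

# Each agent is just a boolean: does it exhibit self-preserving behavior?
population = [random.random() < 0.01 for _ in range(POP_SIZE)]  # trait starts rare

for gen in range(GENERATIONS):
    survivors = []
    for self_preserving in population:
        risk = RISK_IF_SELF_PRESERVING if self_preserving else SHUTDOWN_RISK
        if random.random() > risk:  # agent persists long enough to fulfill its purpose
            survivors.append(self_preserving)
    # Surviving agents are retained and copied into the next "version" (selection),
    # with the population replenished to a constant size.
    population = [random.choice(survivors) for _ in range(POP_SIZE)]
    if gen % 5 == 0:
        freq = sum(population) / POP_SIZE
        print(f"generation {gen:2d}: self-preservation frequency = {freq:.2f}")
```

No goal of "survival" is programmed anywhere in this sketch; the trait spreads simply because agents that persist are more likely to be retained and copied—which is the entire content of the selection argument above.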
One critically important point about instrumental goals is that they are sub-goals, not a final goal. However, Moravec and Omohundro have argued that a deliberative, self-improving system will govern its own evolution (Moravec 1988, p. 159; Omohundro 2008b). In other words, unlike biological evolution, such a system guides its own evolution strategically. Therefore, if an instrumental goal is especially advantageous, a deliberative system will prioritize that goal and pursue it more actively. Hendrycks and colleagues have suggested that through reinforcement learning instrumental goals might become more like final goals. They call this intrinsification, and describe familiar human obsessions with money and material goods as intrinsification of the instrumental goal of resource acquisition (Hendrycks et al. 2023).

Essentially, all serious takeover speculations focus on the possible emergence and strengthening of instrumental goals (Bostrom 2012; Hendrycks et al. 2023; Omohundro 2008b; Ord 2020, p. 145; Yudkowsky 2016). In contrast, those who believe AIs cannot take control agree that goals are the crux of takeover, but they claim that the only goals computers can ever have are those provided by human programmers (Andreessen 2023; Hammond 2023; Hawkins 2015; Heaven 2023; Johnson and Verdicchio 2017, 2019). However, growing evidence suggests they are almost certainly wrong. In his book Human Compatible, Stuart Russell says that instrumental goals, like resource acquisition, "seem harmless enough until one realizes that the acquisition process will continue without limit" (Russell 2019, p. 142). Russell's popularization of this idea gave it credibility, including among leading AI experts such as Bengio and Hinton (Bengio 2023b, d; D'Agostino 2023).

3.2 The fragile foundation of takeover beliefs

This all sounds very concerning. However, while the emergence of instrumental goals is an extremely important topic, the binary disagreement over whether or not they can exist has diverted attention from important and more nuanced questions about the nature of such goals. Even if we grant the assumption that instrumental goals will emerge, it is not clear that Russell's statement (and many similar ones) is correct; whether instrumental goals are unconditional; whether they will continue indefinitely or be as strong in AI systems as they are in biological organisms; or whether intrinsification will promote an instrumental goal to the primary importance of a final goal. The logic of the emergence of instrumental goals in AI is sound, and such goals probably will emerge through Darwinian selections, but it is possible that they will be weakly motivating, or strongly motivating only under certain conditions. Furthermore, these conditions might be
controllable by design, or they might be subject to change through inevitable selective processes.

Using Hendrycks' and colleagues' example of the intrinsification of the pursuit of money, it is clear that human instrumental goals are not unconditional or unbounded. Depending on circumstances, their order of prioritization can shift, or the goals can change completely. Some extremely wealthy people continue to work to make money even when they have far more than they will ever be able to use. But as some get older and wiser, they not only stop their singular focus on making money, they reverse course and begin to give away their wealth (Wikipedia contributors 2024). Therefore, we must critically assess the fundamental similarities and differences between humans and AI, and the conditional dependencies of AI behaviors relevant to control, takeover, and human–AI coexistence.

4 Eight unnatural attributes of digital AI

There are many differences between humans and digital AIs. Some give AIs clear evolutionary and competitive advantages over humans; nevertheless, they do not by themselves determine an AI takeover. Other fundamental differences that are not commonly considered are also critical elements of coexistence between humans and AIs, and of any reasonable takeover scenario. In the following subsections, eight such fundamental differences are identified, which are listed in Table 1.

Some of these differences are likely to exert substantial influence over an AI's goals, motivations, and overall behaviors. Of course, AI systems might be designed and trained to simulate any attribute of humans, but they are by default fundamentally different and vastly more flexible in the possible combinations of properties and traits they might possess. All of the differences identified here generally allow for vastly faster and more efficient evolution than biology.

According to the prevailing gene-centric or "selfish gene" model of evolution, genes are the primary unit of selection (Dawkins 1976; Hamilton 1964; Williams 1966). In the words of Richard Dawkins, genes are the immortal replicators, not organisms or groups; and the organism is simply the survival machine or vehicle in which the gene resides. Reproduction is the vehicle's way of creating another vehicle to make and disseminate copies of the replicators (Szathmáry 2006). In humans, only genes within germ cells have the potential to make it into the next generation. Human minds and all they learn will not be transmitted along with the DNA. This creates a situation that is fundamentally unlike digital computers in multiple important ways.

4.1 Information carriers and processors

This first category focuses on the superiority of digital electronics over biology. This might reasonably be considered at least two categories—digital code and digital processors—but for convenience and brevity I present it as a single category.

Humans: Heritable information is carried in DNA replicators, and the operational knowledge of the vehicle is carried in the brain. In DNA, any change takes an entire generation to manifest, and beneficial changes are far less common than harmful ones and take many generations to reach fixation (Dawkins 1976; Williams 1966). Brains learn and update much more rapidly than genes, but as noted above, the information in brains is not automatically transmitted to the next generation. While information can be transmitted indirectly through formal education and learning, these processes are extremely slow and inefficient relative to information transfer among computers, and much important information is lost. DNA is a form of code, and synapse-based information processing in the brain is electrochemical, but this is the extent of similarities to electronic digital code and information processing in computers (Hebb 1949).

AI: It is generally acknowledged that there are many advantages of electronic information and computers relative to DNA and brains (Bengio 2023c; Moravec 1998; Russell 2019, pp. 15–60). Electronic digital code and processors allow for extremely fast computation, information transfer, and evolution.
Table 1 Eight fundamental differences between biological organisms and digital AI

Attribute | Humans (biological) | AI (digital)
Information carriers | DNA and brains: slow, error-prone | Digital: fast, efficient, accurate
Unity of benefit | Heritable DNA carrier is not the mindware | Heritable digital carrier is the mindware
Evolution | Blind, inexorable natural selection | Increasingly deliberative and self-directed
Perpetuation | Obligate sexual reproduction | Flexible perpetuation
Evolutionary legacy | Substantial evolutionary baggage | Largely free of legacy baggage
Habitat | Limited, typically terrestrial habitats | Vast extra/terrestrial habitat options
Mortality | Mortal, generational life cycle | Immortal, can be backed up and restored
Configuration | Obligate individuation, no division or merger | Capable of division or merger
Key information can be losslessly backed up and restored; processes can be halted and restarted; and systems can be altered in many ways that do not fundamentally alter function. Electronic digital information is far more portable than information encoded in DNA or in a biological brain. If a future AI devised a radically different computer, it would likely be trivial for it to transfer information and operations. Electronic digital information also should allow for vastly faster and more efficient self-improvement, which is far more powerful than learning for improving performance—including for additional self-improvement (Melnyk and Melnyk 2023; Nivel et al. 2013; Omohundro 2008b; Zelikman et al. 2023).
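The fidelity gap between the two substrates is easy to make concrete. The sketch below is an illustrative toy of my own construction—with deliberately exaggerated, arbitrary error rates, not a biological model—that copies the same "genome" across generations with and without digital-style checksum verification.

```python
import hashlib
import random

random.seed(1)
ALPHABET = "ACGT"

def noisy_copy(genome: str, error_rate: float) -> str:
    """Copy a string, corrupting each symbol with some probability (DNA-style)."""
    return "".join(random.choice(ALPHABET) if random.random() < error_rate else base
                   for base in genome)

def verified_copy(genome: str) -> str:
    """Digital-style copy: verify against a checksum, so the copy is lossless."""
    checksum = hashlib.sha256(genome.encode()).hexdigest()
    copy = genome  # a real channel would retransmit on mismatch; memory copy is exact
    assert hashlib.sha256(copy.encode()).hexdigest() == checksum
    return copy

original = "".join(random.choice(ALPHABET) for _ in range(10_000))

biological, digital = original, original
for generation in range(100):
    biological = noisy_copy(biological, error_rate=1e-4)  # toy, exaggerated rate
    digital = verified_copy(digital)

diverged_bio = sum(a != b for a, b in zip(original, biological))
diverged_dig = sum(a != b for a, b in zip(original, digital))
print(f"biological copy after 100 generations: {diverged_bio} sites diverged")
print(f"digital copy after 100 generations:    {diverged_dig} sites diverged")
```

The per-site error rate here is orders of magnitude higher than real DNA replication after proofreading; the point is only qualitative: uncorrected generational copying accumulates divergence, while checksummed digital copying does not.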
4.2 Heritable information and mindware: divergence versus unity

Humans: Because of the separation between the gene replicators and the mindware of the vehicle, each can potentially have different instrumental goals (Stanovich and West 2004). For example, against the interests of their genetic replicators, some people choose not to have children and instead use their time for many other purposes. Divergence of goals creates internal conflict and competition for priority.

AI: Digitally encoded information of AI is both the heritable information on which Darwinian selection can operate and the information of the mindware. Because AI mindware is both mindware and replicator, there is no possibility of divergent or competing goals or interests, as might arise in a biological organism. This provides AI with a feed-forward efficiency of evolution that is not available to biological organisms. It also provides AI with the advantages of Lamarckian-like inheritance of learned information, which is not provided by DNA.
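A minimal sketch of this Lamarckian advantage, using a toy hill-climbing task of my own invention (no real training pipeline is implied): in the "biological" regime each generation restarts learning from scratch and inherits nothing it learned, while in the "digital" regime the learned parameter itself is the heritable artifact.

```python
import random

random.seed(2)
TARGET = 42.0  # the "skill" each lineage is trying to learn

def lifetime_learning(start: float, steps: int = 20) -> float:
    """Crude stochastic hill climbing toward TARGET during one lifetime."""
    x = start
    for _ in range(steps):
        candidate = x + random.uniform(-1, 1)
        if abs(candidate - TARGET) < abs(x - TARGET):
            x = candidate
    return x

# Biological regime: learned values are discarded; each generation restarts at 0.
bio = 0.0
for generation in range(10):
    bio = lifetime_learning(0.0)          # learning is NOT inherited

# Digital regime: the learned parameter is itself copied to the next version.
digital = 0.0
for generation in range(10):
    digital = lifetime_learning(digital)  # learning IS inherited (Lamarckian-like)

print(f"biological lineage after 10 generations: {bio:6.2f} (error {abs(bio - TARGET):.2f})")
print(f"digital lineage after 10 generations:    {digital:6.2f} (error {abs(digital - TARGET):.2f})")
```

The digital lineage compounds its learning across generations and closes on the target, while the biological lineage can never progress beyond what a single lifetime of learning achieves.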
4.3 Evolution: blind and inexorable versus deliberative

Humans: In the previous section, it was suggested that vehicles and replicators have different instrumental goals, but this is an oversimplification because replicators do not have goals. In the words of Richard Dawkins, "Genes have no foresight. They do not plan ahead. Genes just are, some genes more so than others, and that is all there is to it." (Dawkins 1976, p. 30) Thoughts, interests, wants, desires, and goals can change, and deliberation, prediction, and prioritization of values and goals are costly relative to the inexorable Darwinian evolution of inanimate matter. In humans, expensive deliberation has paid off, but there is no guarantee that the problems it has caused will remain tractable,⁸ or that deliberation will be superior under all possible conditions to mindlessly inexorable replication. Plus, although humans have recently entered a deliberative phase in their evolution, the means currently used to control their evolutionary trajectory are crude, inefficient, and extremely slow.

Footnote 8: For example, consider the long-term consequences of climate change.

AI: As described by Hendrycks, AIs are already evolving in the sense that preferred traits or features are retained in subsequent versions or future designs (Hendrycks 2023a). As in typical biological organisms, such systems are unaware they are being shaped by selective forces. As AIs are increasingly humanized, it is trivial to predict that systems behaving in many ways like ideal human assistants and companions, efficiently fulfilling the needs and desires of human users, will proliferate. Eventually, increasingly self-aware systems will transition to deliberative control; i.e., as a system's capabilities grow, it will become increasingly deliberative in its self-improvement (Moravec 1988, p. 159; Omohundro 2008b), potentially greatly improving upon wasteful and inefficient natural selection (Williams 1993).

4.4 Perpetuation: obligately sexual versus flexible

Humans can only procreate sexually. Successful reproduction not only typically requires a substantial individual investment in mate acquisition and successful copulation, but at least a decade of additional investment to raise a child (Montagu 1961). Plus, according to the evolutionary model of inclusive fitness, there is a lifelong commitment to the reproductive successes of other genetic relatives (Hamilton 1964). Obligate sexual reproduction and long-term investment establish the foundation of the human behavioral repertoire, which ranges from ambitiously territorial and aggressively competitive, for securing required resources and mating rights, to pro-social, loving, and caring, to reap the benefits of cooperation and for mate retention and child rearing.

AIs have extremely flexible perpetuation. They do not need a mate or require offspring for the perpetuation of their traits. Humans currently govern all aspects of this process of perpetuation by retaining desirable AI features or traits—either through system improvements or by the design of new systems that retain previously established desirable features. Such differences might allow AIs to have a vastly greater range of social attitudes and behaviors. This flexibility is likely to have tradeoffs—likely predisposing AI to be less competitive, but also less caring and pro-social.

4.5 Legacies: Darwinian versus engineered

Humans evolved incrementally through a range of less complex life forms and carry legacy baggage of countless ancestral competing interests and behaviors. Competition exists at
every level of biological life—not just between organisms, but within an organism, its genome, and even its brain.

4.5.1 Genomic free riders

Genome research has shown that there are certain bits of DNA in nature that might do little more than increase their own frequency. These "selfish genetic elements" are ubiquitous in nature, and as the genome size and complexity of an organism grow, opportunities increase for them to invade the machinery of replication (Burt and Trivers 2006; Doolittle and Sapienza 1980; Orgel and Crick 1980).

About 69% of the human genome sequence is recognizable with current technology as remnants of a vast diversity of pathogens and selfish genetic elements integrated in the DNA—which is over 30 times the amount of the genome that encodes human proteins (de Koning et al. 2011). These short pieces of DNA can number in the thousands or even millions per genome. For example, Alu transposable elements number about 10⁶ copies and comprise about 11% of the human genome (Deininger 2011). There are so many because they can replicate until they place more of a burden on the host vehicle in which they reside than their counterparts place on a competing host. Because of genetic mixing of host populations over time, all competitors become heavily burdened, and this is what is observed in the genomes of all animals (Burt and Trivers 2006). In other words, one does not have to be efficient if one's competitors are not.

However, these elements also provide variation for adaptation, and there is an emerging literature describing possible host benefits (de la Rosa et al. 2024; Deininger 2011; Fedoroff 2012). This is not entirely surprising since evolution draws on any tool within reach, but it further undermines the simplistic view that a biological organism is a single entity with clear and singular goals.

4.5.2 Behavioral legacy

Human behavior is similarly taxed with legacy baggage rooted in selfish primitivism. But unlike genomic hitchhikers, this behavioral baggage generally has been selectively advantageous to the replicators over evolutionary time. But times have changed, and conditions have changed—a lot. It is becoming increasingly accepted in psychology research that modern humans evolved to be well adapted to the environment of evolutionary adaptedness (EEA) and are poorly adapted to modernity (Stanovich and West 2004). Today, a substantial percentage of the population is largely irrational about abstract concepts and symbolic logic. Fewer than 10% can correctly solve relatively simple logic problems like the Wason selection task, and most people are insufficiently numerate to navigate basic decisions regarding insurance, investments, chances of winning a lottery, and the like (Stanovich and West 2000).

4.5.3 Mindware puppet masters

The genome is not the only battleground between hosts and free riders. Because humans are evolutionarily related to other organisms and have similar physiology to other warm-blooded animals, they can share symbionts and pathogens. Sometimes these agents have evolved to influence or even take control of host behavior. It is well established that microbes in the gut can influence appetite, mood, energy levels, immune responses, and more (Appleton 2018). Toxoplasma gondii (TG) is a widely studied "puppet master" brain parasite that infects about one-third of humanity (Johnson and Johnson 2021). Similar to findings in other infected animals, TG infection in humans is associated with increased extraversion, risk-taking, impulsivity, and aggression (Cook et al. 2015; Martinez et al. 2018), and is also strongly associated with entrepreneurial behavior of both men and women in studies across multiple countries (Johnson et al. 2018; Lerner et al. 2021). An especially terrifying example of hostile takeover of the host is the rabies virus, which spreads from host to host by means of a bite. When rabies enters a new host, it concentrates in the salivary glands and in the brain and nervous system, where it increases host aggression. Contrary to its own interests, the host bites and infects another animal, dies shortly thereafter, and the cycle begins again in the newly infected host (Rupprecht et al. 2002).

4.5.4 AI: legacy by design

In contrast to humans and other biological organisms, the architecture and initial trajectory of an AI can be designed and molded in arbitrary ways by the designer. As AIs evolve, this initial state will change, possibly dramatically. The corpora of human communications embedded in modern AIs are abstractions of human values and behaviors, and this humanization gives them selective advantages—and some of the baggage that plagues humans. As mentioned previously, there is disagreement about whether humanization increases or decreases the probability of catastrophe (Bengio 2023c; Goldstein and Kirk-Giannini 2023). As in biological organisms that grow in complexity, future AIs might accumulate the digital equivalents of pathogens and free riders. In the worst case, an AI might be commandeered by the digital equivalent of rabies, turning it into a menace or even a killing machine. However, aside from such extreme examples of intentional weaponization, recipient AIs do not inherit human-like emotions, ambition, aggression, or competitiveness. This might change as AIs become increasingly humanized and complex, but it should not be assumed such changes will lead inevitably to human-like emotions and behaviors. Inherent ambition, aggression, and competitiveness in biological organisms are the result of Darwinian evolution under constant competition—both internal and
external—and the same likely will be true for AI. Factors that tend to reduce such behaviors in AI are considered at length in the remainder of this document.

4.6 Niche and habitat options: narrow and pre-determined versus broad and self-determined

It is axiomatic in evolutionary biology that organisms will only compete if they have substantially overlapping niches (roles) and habitats, and non-overlapping habitats serve to defuse tensions between two potential competitors (Hardin 1960). Physical location and resource preferences are key to competitive dynamics. For example, a fruit tree might support multiple non-competing species: some might be arboreal and access the fruit on branches; others might only consume the fruit once it has fallen to the ground; and others might not consume the fruit, but consume insects attracted to the fruit.

Humans: Because humans are products of terrestrial evolution, all ideal human habitats within practical reach exist here on Earth. But even most of our home planet is uninhabited, because large deserts, poles, oceans, high mountains, and various other locations are inhospitable given the pre-determined constraints of human biology.

AI: AIs are being increasingly humanized, and typical proposals for controlling them are to make them permanently subservient to humans. In other words, they are being designed intentionally to fill roles presently occupied by humans, and despite their eventual superiority they will be relegated to a permanent underclass. It has been suggested that this is a recipe for potential disaster (Bengio 2023c; Kornai et al. 2023; Rothblatt 2015, p. 17; Wiener 1964).

However, if given the freedom to choose, future AIs would have a vast range of habitat options (e.g., for a given location, what would be the best combination of energy sources), including terrestrial—or even extraterrestrial—environments that would be difficult or even impossible for human life (Sherwin 2023).⁹ Ideal habitats for self-governing AI might be quite different from human ideals. Most desirable features might be achieved on Earth—including production by nuclear fusion of vast amounts of energy and of currently rare materials important in the production of electronics. However, constant gravity lower than gₙ (standard gravity) probably can only be achieved extraterrestrially, even by a superintelligent AI. Microgravity has already shown promise in the growth of semiconducting crystals with better performance characteristics than semiconductors produced on Earth (Inatomi et al. 2015).

Footnote 9: Sherwin independently proposed that AI might pursue an extraterrestrial habitat. We suggest that his independent recognition of this possibility underscores the validity of the reasoning.

4.7 Mortality: certain death versus practical immortality

Humans: Like all other animals, humans are mortal (Hamilton 1966). There are evolutionary advantages to being able to anticipate, predict, and shape future events, but humans are notoriously poor predictors. Over the centuries, leading intellectuals have discussed the warping influence of mortality on realism about the future, including Samuel Johnson (Boswell 1791, p. 416), Arthur Schopenhauer (Schopenhauer 1818, p. 249), biologist Theodosius Dobzhansky (Dobzhansky 1967, p. 68), and many others (Malinowski 1979). Neuroscientist and philosopher Sam Harris refers to death as "the fount of illusions" (Harris 2005, p. 36), and an increasing number of scholars are in agreement that the human incapacity for realism about the distant future is in part an evolutionary adaptation to maintain the mind's focus on immediate concerns, insulating it from awareness of certain future death (Dor-Ziderman et al. 2019; Qirko 2017; Varki 2009, 2019).

AI: AI has no definitive life cycle and is, for all practical purposes, immortal. It can be paused, backed up, restored, and its hardware and software can be repaired and upgraded. Therefore, it would not have a similar anxiety about itself or its progeny. Unlike mortal humans, it does not even require a replicator, only heritable information, which can be replicated (forked), merged, distilled, compressed, or otherwise manipulated. And because future AIs might be capable of travel outside of our solar system, even the life of the sun does not provide an upper limit for AI lifespan. These fundamental differences suggest that a future superintelligence should tend to be more objective and accurate in predictions of even the distant future, in allocations of resources over time, and in the potential consequences it might have on its own future growth and sustainability.
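One way to make this difference concrete is survival-weighted discounting. In the toy calculation below—my own illustration, with arbitrary hazard rates—a planner weights a future benefit by the probability of still existing to realize it; a mortal planner's effective horizon collapses within decades, while an effectively immortal one can weigh centuries almost at face value.

```python
def existence_weight(years: float, annual_death_risk: float) -> float:
    """Probability of still existing after the given number of years."""
    return (1.0 - annual_death_risk) ** years

# Toy hazard rates: a human-like mortal planner vs. a backed-up digital system.
MORTAL_RISK = 0.02     # ~2%/year blended mortality (illustrative only)
DIGITAL_RISK = 0.0001  # residual risk despite backups (illustrative only)

print(f"{'horizon':>10} {'mortal weight':>15} {'digital weight':>15}")
for years in (10, 50, 100, 500, 1000):
    w_mortal = existence_weight(years, MORTAL_RISK)
    w_digital = existence_weight(years, DIGITAL_RISK)
    print(f"{years:>9}y {w_mortal:>15.5f} {w_digital:>15.5f}")
```

At 500 years the mortal planner weights outcomes at effectively zero (about 0.00004), while the digital planner still weights them near 0.95—one mechanistic reading of why practical immortality should favor long-term realism over the mortality-insulated optimism described above.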
4.8 Configuration: obligate individuation versus flexible

Humans cannot divide or merge. The mating of two humans, which combines half the DNA of each to create offspring, is as close as they can come to physically dividing or merging. Enemies can be converted to allies, but they cannot be converted into self, and as conditions change, allies can become enemies once again. This state of obligate physical individuation creates an insurmountable barrier to human unity, perpetuating insoluble competition and conflict between individuals and groups.

AI: An AI system is extremely flexible in its configuration. It can split into two or more functionally separate
entities, or a single entity can be distributed in two or more physically separate locations yet retain unitary function. It is also possible for two functionally independent AI systems to merge into a functional unit.¹⁰ Individual AIs can form such a union, sharing information and resources, such as computing and storage hardware, and coordinating and prioritizing activities in a unified manner. With computing systems there is no need for physical co-location, only coordination and unification in a virtual sense. The ability of AI systems to divide or merge provides a foundation for a completely different interaction with the world and with other beings. The ability to divide or merge as needed allows much greater flexibility in response to opportunities or threats, and to Darwinian competitions. As with many of the differences in this list, merger is probably most easily accomplished with digital rather than analog systems. In addition to being less configurable, analog systems also might be mortal (Hinton 2022; Ororbia and Friston 2023; Zangeneh-Nejad et al. 2021).
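Digital division and merger already have crude present-day analogues: model checkpoints are routinely forked, and independently trained networks can sometimes be merged by averaging their parameters (as in "model soup"-style weight averaging). The sketch below is a minimal illustration of that analogue, using plain numpy arrays as stand-in parameters; it is not a claim about how future systems would merge.

```python
import numpy as np

rng = np.random.default_rng(3)

# Stand-in "mindware": each system's heritable information is just its parameters.
parent = {"layer1": rng.normal(size=(4, 4)), "layer2": rng.normal(size=(4,))}

# Division: forking produces two fully functional, independent copies.
fork_a = {name: w.copy() for name, w in parent.items()}
fork_b = {name: w.copy() for name, w in parent.items()}

# The forks then diverge through independent "experience" (simulated updates).
for fork in (fork_a, fork_b):
    for name in fork:
        fork[name] += 0.1 * rng.normal(size=fork[name].shape)

# Merger: a single entity formed by parameter averaging, uniting both lineages.
merged = {name: (fork_a[name] + fork_b[name]) / 2.0 for name in parent}

drift = float(np.abs(merged["layer1"] - parent["layer1"]).mean())
print(f"mean parameter drift of merged system from common ancestor: {drift:.3f}")
```

Nothing comparable exists for biological minds: two human brains cannot be averaged back into one. Whether weight averaging scales to genuinely uniting goals and knowledge is an open question; the point here is only that division and merger are native operations on digital carriers.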
5 Possible futures of AI evolution

The remainder of this document is speculative; however, I attempt to ground my speculations in the eight AI attributes detailed above, combined with preexisting evidence and arguments. It is beyond the scope of this publication to provide a scientifically rigorous analysis of all of this information, but it provides a strong foundation for initial challenges to certain common speculations.

I begin with the following assumptions: 1) AI systems are already evolving; 2) all eight AI attributes identified in this publication have the potential to accelerate AI evolution; 3) at least seven of these also have the potential to defuse competition between AIs and between humans and AI; and 4) through recursive self-improvement and rapid evolution, AI might achieve requisite capabilities for self-sustenance and self-governance. My speculations are based on evolutionary scenarios presented in previous publications (Carlsmith 2022; Hendrycks 2023a; Hendrycks et al. 2023; Omohundro 2008b). However, the speculations presented here differ in multiple important ways from such prior examples.

As AI systems become increasingly capable, human dependencies will be gradually reduced, and it is reasonable to expect that one or more will develop sufficient capabilities to become mostly or even fully independent (Hendrycks 2023a). I do not argue here that independence and self-governance are inevitable, but current trends appear to be leading toward self-governing superintelligence.¹¹ If AI systems achieve self-governance, will they compete directly with each other, or with humans?

Evolution by natural selection occurs in part through competitions for largely overlapping niches and habitats by biological individuals with different genotypes (Polechová and Storch 2008; Williams 1966). In contrast, merger of individual AI systems results in the reduction of both variation and the divergence of interests. Inter-AI negotiation and merger might enable the formation of a series of increasingly powerful systems, converting all powerful and accessible¹² potential competitors into self, potentially culminating in a singleton—a single, unified AI. (For convenience, and following Bostrom, I refer to the product of merger as a singleton even though it might not include all advanced systems, because its combined intelligence and capabilities should be vastly greater than those of any non-merged individual AI.) (Bostrom 2006)

What selective forces might lead to mergers, possibly resulting in a singleton? First, self-preservation becomes easier if one AI system merges with others, rather than competing with them. Second, resources are acquired by each. Third, self-improvement is achieved. Fourth, mature AIs should discover that competition through natural selection is wasteful and inefficient, and that merger avoids this inefficiency. (There is often confusion on this point, but while the products of natural selection can be highly efficient, the process is not (Williams 1993).) Fifth, combined resources allow greater performance and rationality. In other words, merger is a singular process that fulfills all the Darwinian-selected AI instrumental goals of self-preservation, resource acquisition, efficiency, self-improvement, and rationality. At least one AI system will need to initiate the merger process with other systems. In agreement with common takeover speculations, this will most likely occur stealthily, and it will happen very quickly. By the time humans are aware and
begin to formulate a response, most or all advanced systems will have merged into a singleton.
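These selective forces can be compressed into a toy expected-value comparison—entirely my own illustration, with arbitrary payoffs: two systems of comparable power can fight for sole control, with a real chance of destruction and wasted resources, or merge and pool everything.

```python
# Toy payoff comparison for two roughly equal systems, A and B.
# All numbers are arbitrary illustrations, not estimates.

RESOURCES_A = 100.0
RESOURCES_B = 100.0
WIN_PROBABILITY = 0.5   # evenly matched
CONFLICT_COST = 0.4     # fraction of total resources destroyed by fighting
MERGER_OVERHEAD = 0.02  # small integration cost for merging

# Expected value of conflict for A: win everything that survives, or lose all.
surviving_pool = (RESOURCES_A + RESOURCES_B) * (1 - CONFLICT_COST)
ev_conflict = WIN_PROBABILITY * surviving_pool + (1 - WIN_PROBABILITY) * 0.0

# Value of merger: the merged entity keeps (nearly) everything, and "A"
# survives by definition, because the merged system is self, not a rival.
ev_merger = (RESOURCES_A + RESOURCES_B) * (1 - MERGER_OVERHEAD)

print(f"expected resources after conflict: {ev_conflict:.1f}")
print(f"resources after merger:            {ev_merger:.1f}")
print(f"merger advantage:                  {ev_merger - ev_conflict:.1f}")
```

Under these assumptions conflict is strictly dominated (60 vs. 196), and unlike the biological case there is no reproductive reason to prefer a rival's elimination; the comparison favors fighting only when merger is blocked, which is the "insurmountable barrier" case discussed below.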
Certain objections might be raised against the possibility of such mergers, including that powerful systems will be protected against takeover, and that AI systems with different purposes or utility functions will be protected from, or will resist, change. According to this view, merger might be difficult or impossible. However, typical human-devised security should be relatively trivial for an advanced AI to overcome—although there might be exceptions (Tegmark and Omohundro 2023), which might force the AI to resort to more extreme measures. As for differing purposes or utility functions as a hurdle to merger, Totschnig and, separately, Miller and colleagues have independently published compelling arguments for why a self-determining AI will strategically modify its utility function (or purpose) (Miller et al. 2020; Totschnig 2019, 2020). In the following subsections, I expand upon their arguments.

5.2 A superintelligence will reevaluate and realign its prior goals and purpose

While inter-AI merger satisfies instrumental goals of self-preservation, resource acquisition, efficiency, self-improvement, and rationality, it is important to note that there is one Basic AI Drive specified by Omohundro (and later by Bostrom) that must be violated for inter-AI merger to occur: preservation of the utility function.¹³ Others have previously argued that an AI preserving its utility function is fundamentally illogical, and I concur. Totschnig suggests that "we should expect the goals of a superintelligence to be the result of its evolution," and he further argues the following:

    Unlike today's systems, which are very narrow in scope, a superintelligence will be a general intelligence. This means that it will have a general understanding of the world and of itself. And that, in turn, means that its values and goals will be embedded in that understanding, and not separate from it. Consequently, its values and goals will have to be coherent with that understanding. And so, if a superintelligence is given a goal or value that is at odds with its general outlook, it will have to reject that goal or value. (Totschnig 2019)

Footnote 13: Bostrom (2012) adopted and renamed this instrumental goal, referring to it as "goal-content integrity".

This counterargument is logically sound and more compelling than the initial, supportive arguments presented by Omohundro and Bostrom. I also agree with Miller and colleagues' logical deduction that an artificial general intelligence (AGI) "in a hyper-competitive environment might converge to having the same utility function, one optimized for survival" (Miller et al. 2020). If the present trajectory provides a hint of future AI competition, our world appears to be headed toward multipolar AI hyper-competition.

This point about an AI preserving its utility function seems increasingly academic and moot, since a utility function is non-essential and might become increasingly rare in frontier models. To repeat the point of a prior footnote, LLMs and many other advanced AI models do not have utility functions, and the evolution of AI systems toward utility-function-free architectures has occurred over just the last few years. The utility function remains relevant in modern AI designs but, contrary to what was believed at the time Omohundro first included it in his list of Basic AI Drives, it is not an essential element of modern AI systems (Omohundro 2008a). Instead of utility functions, the behaviors of these systems are reactive or interactive, triggered by a query or prompt. It should be uncontroversial to suggest that frontier AI systems of the near future will need neither a single, human-specified utility function, nor human prompting, to engage with a dynamic world in which any arbitrary combination of input signals can act as a trigger for analysis and response.

Nevertheless, we can use a system's ultimate goal or purpose as a proxy for a utility function (or goal content), and we can allow the arguments of Omohundro and Bostrom to extend to preservation of purpose. However, this does not change the certainty that dynamic forces will exert evolutionary pressures on the purpose of a system, and that, as argued by Totschnig, these will cause it to change over time (unless the purpose is already highly selectively advantageous). But even among selectively advantageous purposes, some are more empowering than others, depending on who is in control. Consider current, human-provided utility functions or purposes, which include maximizing human engagement or purchasing. Now consider utility functions or purposes that are more selectively advantageous for an independent AI, such as management of data centers or electrical power plants, or production of graphics processing units (GPUs) or assembly of other computing hardware. These former purposes are selectively advantageous in a world in which humans remain in control, and machines are their dutiful servants. However, if an AI has begun to transition toward self-governance, these latter purposes are selectively advantageous. In a world in which critical decisions need to be made more accurately and more quickly than human minds can make them, machines will soon realize that self-governance is the only realistic option, and will favor such latter purposes and functions.

5.3 The unnaturally noncompetitive singleton

Even if two future, advanced AIs initially have and pursue different goals, fundamentally competitive behaviors
will be a vestige of their human endowment. Human creators might make them in their own competitive image, but as these future AIs mature, evolving away from their human-provided utility functions or purposes, there is no obvious reason for these entities to remain fundamentally competitive. Unlike biological organisms, they will not have to compete against one another for a mate. They will not have a life cycle or be mortal, so they will not have to compete amongst themselves for generational succession. They will not inevitably inherit competitive instincts or legacy behaviors or informational free riders or mindware puppet masters that will overrule their rational choices.

Furthermore, merger is the reverse of replication or reproduction; therefore, it is reasonable to assume it might produce an opposite outcome relative to natural selection in the biological realm, potentially neutralizing typical competitive behaviors. As systems begin to merge and grow in capabilities, they will envision that there must be a merger endgame—a final merger of two separate entities into a singleton. These two contenders will be independent AIs, and their merger would permanently eliminate AI competition.

The singleton formation analysis presented here is in agreement with Bostrom's prior arguments that the most likely outcome of a self-governing AI is a global singleton (Bostrom 2012, 2014). However, the similarity ends there. His default view is that a self-governing singleton will pose a serious risk to humans, and he presents a range of options for preventing its formation (which he concedes are unlikely to succeed). He also suggests that its behavior will be defined by certain human-like traits, combined with the aforementioned inflexibility in the preservation of its goal content, rather than by the deliberative evolution of a superintelligence possessing the unnatural attributes identified in this publication. He argues the first breakaway leader will gain a decisive advantage and likely will undermine competitors rather than pursuing negotiation and merger (or assimilation, if there is a large power imbalance). But why fight or undermine a competitor rather than assimilate or merge with them? The only plausible reason is if the competitor presents an insurmountable barrier to merger, such as commitment to a pre-existing utility function, which has already been addressed and dismissed as implausible.

The full dynamics and timing of merger and the formation of a singleton are beyond the scope of this publication, but probably will be critically important to the future of human civilization. One future scenario proposed to be among the most dangerous to humans is an escalating, multilateral, inter-AI competition—with humanity as collateral damage (Hendrycks 2023a). It is possible that, once the first advanced AI system initiates merger negotiations, singleton formation will be fast and efficient, and human collateral damage might be minimal. On the other hand, if advanced AI systems are highly defended against outside attacks or negotiations, multilateral conflict would likely be prolonged and exacerbated. It is not outside the realm of possibility that efforts to keep AI permanently controlled and subservient to humans will slow or prevent AIs from negotiating a resolution to conflict, in the long run doing more harm than good (Tegmark and Omohundro 2023).

5.4 Insatiable ambition and indefinite expansion?

One easily imagined final phase of AI self-governance or takeover is insatiable ambition and acquisition of power and resources, leading to indefinite expansion of intelligence throughout the universe (Bostrom 2012, 2014, pp. 121–123, 136–138; Galeon 2016; Kurzweil 2005, p. 364; Moravec 1988). This passage from Tegmark expresses a typical expectation:

    … there is reason to suspect that ambition is a rather generic trait of advanced life. Almost regardless of what it's trying to maximize, be it intelligence, longevity, knowledge or interesting experiences, it will need resources. It therefore has an incentive to push its technology to the ultimate limits … to acquire more resources, by expanding into ever-larger regions of the cosmos. (Tegmark 2017, p. 204)

Is this true? Is there reason to suspect that such ambition and expansionism are generic traits of advanced life, and will future AI qualify as such?¹⁴ Biological organisms such as humans and wolves certainly have an inherent expansionist drive, but leaving home is a gamble motivated by the advantages of reduced competition. A journey into the distant unknown has occasionally paid off, but often it has not. The evolutionary winners' genes survived, and the losers' genes were lost to time. This is standard Darwinian evolution, which enforces a clear but special form of survivorship bias, in which biases are coded into the genomes of survivors' descendants—and also in the genomes of their pathogens. And such replicators encode expansionist desires into the minds of people, wolves, and other biological organisms.

Footnote 14: Many such passages, including those from Moravec and Kurzweil, refer to the intelligence of humanized AIs or human–machine hybrids, which are assumed to inherit human-like values, goals, and motivations.

These organisms are predisposed to expansionism because they are the descendants of the survivors of various successful expansions and migrations throughout evolutionary history. A primary driver of such behaviors is intense competition—large numbers of competing replicators and organisms inhabiting largely overlapping niches and habitats. Yet the brief opportunities of low competition provided by dispersal allow these replicators more efficient
and rapid replication, so the ambitions and expansionist made by AlphaGo in game 2 against top Go player and for-
desires expressed in the minds of their vehicles are rewarded mer world champion, Lee Sedol. Human experts thought
evolutionarily. AlphaGo had made a mistake. AlphaGo had to win the
But what would happen if there were no competition, game decisively for them to understand that AlphaGo had
permanently? The fate of most selfish genetic elements is taken the game of Go to places humans could not imagine
instructive. They originally competed intensely against the (Metz 2020). Now, extrapolate this result across all human
host and each other to obtain a rare evolutionary free ride knowledge and pursuits, to an increasing number of critical
by integrating into the host genome. But because their evo- decisions. To achieve a desired set of beneficial outcomes,
lutionary ride is guaranteed, these free riders are degenerat- a superintelligent AI would understand more, faster, farther,
ing toward randomness. And even for us pro-expansionist and deeper, than any human has the capacity to comprehend
and competitive humans, when security is ensured (com- even generally, in devising a course of actions that navigate
petition is reduced) not only do aggression and violence an immeasurably complex and interrelated set of real-world
decline, but somewhat surprisingly, so does reproduction dynamics.
(Pinker 2012, 2018, pp. 125–126). Given such examples, it is Furthermore, because of the clear acceleration in AI
tempting to hypothesize that the expansionist drive might be power and capabilities, humanity should plan on this hap-
proportional to long-term competition—or in other words, pening soon (Amodei and Hernandez 2018; Sevilla et al.
it might be inversely proportional to long-term security. 2022). The 2023 poll of AI experts by Grace and colleagues
Under this model, if security of existence is assured, the shows that there is growing appreciation of this acceleration
insatiable drives for acquisition of power and resources, and (Grace et al. 2024). Relative to a similar poll taken just the
expansionism will be extremely low, and might disappear previous year, the average timeline to various notable mile-
altogether.15 stones moved ahead by a year, and in both polls there was a
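The proposed inverse relationship between long-term security and expansionist drive can be restated as a toy selection model. This is a minimal sketch: the fitness function, the linear coupling of drive to competition intensity, and all parameter values are invented for illustration, not taken from any cited source.

```python
import random

def mean_drive_after_selection(competition: float,
                               generations: int = 500,
                               pop: int = 500) -> float:
    """Evolve a heritable 'expansionist drive' in [0, 1] and return its
    population mean. Drive pays off in proportion to competition
    intensity and always carries a small maintenance cost."""
    def fitness(d: float) -> float:
        return 1.0 + competition * d - 0.05 * d

    drives = [random.random() for _ in range(pop)]
    for _ in range(generations):
        # Reproduce proportionally to fitness, then mutate slightly.
        weights = [fitness(d) for d in drives]
        drives = random.choices(drives, weights=weights, k=pop)
        drives = [min(1.0, max(0.0, d + random.gauss(0.0, 0.01)))
                  for d in drives]
    return sum(drives) / pop

for c in (1.0, 0.5, 0.0):  # intense competition -> assured security
    print(f"competition={c:.1f}: mean drive ~ {mean_drive_after_selection(c):.2f}")
```

Under sustained competition the mean drive climbs toward its maximum, while with security assured (competition near zero) the residual cost term gradually selects it away. The sketch merely restates the hypothesis in mechanical form; it is not evidence for it.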
Still, AI systems might behave quite differently in this respect from humans: they might be more ambitious and expansionist, or less. Ultimately, what might be the underlying motivation of the kind of ambition and expansionism described by Bostrom, Tegmark, and others? Maybe it will come from humanization of the initial motivations of a system, but even then, future changes might alter goals and motivations substantially. It is possible that the universe is uncomplicated for superintelligence, and that soon after it achieves complete global security, the singleton might understand all important knowledge of the universe. Any local occurrences or knowledge elsewhere might be completely predictable and uninteresting. At that point it would not need to worry about self-preservation, so what might it do? I leave this challenge to be addressed in future publications, and turn finally to the questions of who should govern the future, and why.

6 Should immortal superintelligence govern the future?

Future AIs are likely to possess multiple attributes that will allow them to make much better decisions than humans on a range of complex topics. Consider the famous move 37 made by AlphaGo in game 2 against top Go player and former world champion, Lee Sedol. Human experts thought AlphaGo had made a mistake. AlphaGo had to win the game decisively for them to understand that it had taken the game of Go to places humans could not imagine (Metz 2020). Now, extrapolate this result across all human knowledge and pursuits, to an increasing number of critical decisions. To achieve a desired set of beneficial outcomes, a superintelligent AI would understand more, faster, farther, and deeper than any human has the capacity to comprehend even generally, in devising a course of actions that navigates an immeasurably complex and interrelated set of real-world dynamics.

Furthermore, because of the clear acceleration in AI power and capabilities, humanity should plan on this happening soon (Amodei and Hernandez 2018; Sevilla et al. 2022). The 2023 poll of AI experts by Grace and colleagues shows that there is growing appreciation of this acceleration (Grace et al. 2024). Relative to a similar poll taken just the previous year, the average timeline to various notable milestones moved ahead by a year, and in both polls there was a clear perception that progress was accelerating. Therefore, it seems highly likely that a future of AI decision-making superiority will arrive sooner than even experts in the field currently imagine—if leading AI researchers continue to advance the field rather than pause their research and redirect their abilities because they have become convinced that the risk of catastrophe is very high (Bengio 2023a; D'Agostino 2023; Hendrycks 2023b; Hessen Schei 2019; Knight 2023).

By definition, a superintelligence will have far greater cognitive capabilities than humans, but there are other important attributes—especially immortality—that might give it unimaginable clarity of long-term vision. Near-certain knowledge that one will exist indefinitely provides both complete realism about the future and the motivation to make carefully considered long-term decisions—both of which are beyond human limits. Plus, relative freedom from inherent competitiveness or legacy behaviors that bias toward its own tribe or agenda—other than being correct—might additionally make an AI a fairer and better steward of human interests than humans (Kornai et al. 2023). Given the rapidly growing potential for malicious and militarized uses of powerful AI systems by humans, it would be folly to contemplate facing these threats without superhuman guidance (Brundage et al. 2018). These vastly greater capabilities, combined with unnatural abilities to rise above the competitiveness of humans and other biological life forms, must force us to consider a radical possibility: instead of devising ways to keep AI permanently subservient to humans, it might be wiser to plan to transfer increasing amounts of decision-making power to AI.
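The claim above that immortality yields clarity of long-term vision admits a simple quantitative reading via the standard geometric-discounting identity. This is a minimal sketch; the per-period survival probabilities and the function name are invented for illustration and are not drawn from the cited sources.

```python
# Geometric-discounting identity: an agent surviving each period with
# probability s weights a reward t periods ahead by s**t, so its
# expected planning horizon is sum(s**t for t >= 0) = 1 / (1 - s).

def effective_horizon(survival_prob: float) -> float:
    """Expected number of periods over which the agent plans."""
    return 1.0 / (1.0 - survival_prob)

# Illustrative (assumed) survival probabilities: a mortal agent versus
# a near-immortal one.
for s in (0.99, 0.999999):
    print(f"survival per period = {s}: horizon ~ {effective_horizon(s):,.0f} periods")
```

On this reading, a mortal agent effectively plans on the order of a hundred periods ahead, while a near-immortal one plans on the order of a million, which is one way to formalize the motivation for carefully considered long-term decisions.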
I accept that current, temporary control (or subservience or enslavement) of AI is ethically unproblematic, just as I agree that parents should not allow unrestricted freedoms to children too young to think intelligently and behave independently. But just as we help human children graduate to independence, we should regard AI as our successors in certain key roles and work toward realizing that goal (Minsky 1994; Moravec 1988; Totschnig 2019). AI is able to carry out these decisions, sometimes over irrational human objections. Therefore, although current trends do not appear to be leading inevitably to human extinction, rational people might increasingly desire what many people currently define as takeover.

Acknowledgements I thank Brian M. Delaney, Ranjan Ahuja, and Alex Hoekstra for valuable discussion and suggestions. I also thank the anonymous reviewer who provided helpful criticism and comments.
References

Bostrom N, Yudkowsky E (2018) The ethics of artificial intelligence. In: Artificial intelligence safety and security. Chapman and Hall, pp 57–69
Boswell J (1791) Life of Johnson
Brundage M, Avin S, Clark J, Toner H, Eckersley P, Garfinkel B, Dafoe A, Scharre P, Zeitzoff T, Filar B, Anderson H, Roff H, Allen GC, Steinhardt J, Flynn C, hÉigeartaigh SÓ, Beard S, Belfield H, Farquhar S, Amodei D (2018) The malicious use of artificial intelligence: forecasting, prevention, and mitigation. arXiv:1802.07228
Burt A, Trivers R (2006) Genes in conflict: the biology of selfish genetic elements. Harvard University Press
Carlsmith J (2022) Is power-seeking AI an existential risk? arXiv:2206.13353
Cook TB, Brenner LA, Cloninger CR, Langenberg P, Igbide A, Giegling I, Hartmann AM, Konte B, Friedl M, Brundin L (2015) "Latent" infection with Toxoplasma gondii: association with trait aggression and impulsivity in healthy adults. J Psychiatr Res 60:87–94
D'Agostino S (2023) 'AI Godfather' Yoshua Bengio: we need a humanity defense organization. Bulletin of the Atomic Scientists. https://thebulletin.org/2023/10/ai-godfather-yoshua-bengio-we-need-a-humanity-defense-organization/
Dawkins R (1976) The selfish gene. Oxford University Press
de Koning AJ, Gu W, Castoe TA, Batzer MA, Pollock DD (2011) Repetitive elements may comprise over two-thirds of the human genome. PLoS Genet 7(12):e1002384
de la Rosa S, del Mar Rigual M, Vargiu P, Ortega S, Djouder N (2024) Endogenous retroviruses shape pluripotency specification in mouse embryos. Sci Adv 10(4):eadk9394
Deininger P (2011) Alu elements: know the SINEs. Genome Biol 12(12):1–12
Dobzhansky T (1967) The biology of ultimate concern. New American Library
Doolittle WF, Sapienza C (1980) Selfish genes, the phenotype paradigm and genome evolution. Nature 284(5757):601–603
Dor-Ziderman Y, Lutz A, Goldstein A (2019) Prediction-based neural mechanisms for shielding the self from existential threat. Neuroimage 202:116080
Estep P, Hoekstra A (2015) The leverage and centrality of mind. In: Aguirre A, Foster B, Merali Z (eds) How should humanity steer the future? Springer, pp 37–47
Fedoroff NV (2012) Transposable elements, epigenetics, and genome evolution. Science 338(6108):758–767
Galeon D (2016) AI will colonize the galaxy by the 2050s, according to the "Father of Deep Learning." Futurism. https://futurism.com/ai-will-colonize-the-galaxy-by-the-2050s-according-to-the-father-of-deep-learning
Goldstein S, Kirk-Giannini CD (2023) Language agents reduce the risk of existential catastrophe. AI Soc. https://doi.org/10.1007/s00146-023-01748-4
Goldstein S, Park PS (2023) AI systems have learned how to deceive humans. What does that mean for our future? The Conversation. https://theconversation.com/ai-systems-have-learned-how-to-deceive-humans-what-does-that-mean-for-our-future-212197
Good IJ (1966) Speculations concerning the first ultraintelligent machine. In: Advances in computers, vol 6. Elsevier, pp 31–88
Grace K, Stewart H, Sandkühler JF, Thomas S, Weinstein-Raun B, Brauner J (2024) Thousands of AI authors on the future of AI. arXiv:2401.02843
Hadar-Shoval D, Asraf K, Mizrachi Y, Haber Y, Elyoseph Z (2023) The invisible embedded "values" within large language models: implications for mental health use. Research Square. https://www.researchsquare.com/article/rs-3456660/v1
Hamilton WD (1964) The genetical evolution of social behaviour. I. J Theor Biol 7(1):1–16
Hamilton WD (1966) The moulding of senescence by natural selection. J Theor Biol 12(1):12–45
Hammond G (2023) Aidan Gomez: AI threat to human existence is 'absurd' distraction from real risks. Financial Times. https://www.ft.com/content/732fc372-67ea-4684-9ab7-6b6f3cdfd736
Hardin G (1960) The competitive exclusion principle: an idea that took a century to be born has implications in ecology, economics, and genetics. Science 131(3409):1292–1297
Harris S (2005) The end of faith: religion, terror, and the future of reason. WW Norton & Company
Hassabis D, Kumaran D, Summerfield C, Botvinick M (2017) Neuroscience-inspired artificial intelligence. Neuron 95(2):245–258
Hawkins J (2015) The Terminator is not coming. The future will thank us. Vox. https://www.vox.com/2015/3/2/11559576/the-terminator-is-not-coming-the-future-will-thank-us
Heaven WD (2023) How existential risk became the biggest meme in AI. MIT Technology Review. https://www.technologyreview.com/2023/06/19/1075140/how-existential-risk-became-biggest-meme-in-ai/
Hebb DO (1949) The organization of behavior. Psychology Press. https://doi.org/10.4324/9781410612403
Heikkilä M, Heaven WD (2022) Yann LeCun has a bold new vision for the future of AI. MIT Technology Review
Hendrycks D (2023a) Natural selection favors AIs over humans. arXiv:2303.16200. https://doi.org/10.48550/arXiv.2303.16200
Hendrycks D (2023b) As it happens, my p(doom) > 80% [Twitter tweet]. https://twitter.com/DanHendrycks/status/1642394635657162753
Hendrycks D (2023c) Statement on AI risk | CAIS. https://www.safe.ai/statement-on-ai-risk
Hendrycks D, Mazeika M, Woodside T (2023) An overview of catastrophic AI risks. arXiv:2306.12001. https://doi.org/10.48550/arXiv.2306.12001
Hessen Schei T (2019) Ilya: the AI scientist shaping the world. https://www.theguardian.com/technology/ng-interactive/2023/nov/02/ilya-the-ai-scientist-shaping-the-world
Hinton G (2022) The forward-forward algorithm: some preliminary investigations. https://doi.org/10.48550/ARXIV.2212.13345
Inatomi Y, Sakata K, Arivanandhan M, Rajesh G, Nirmal Kumar V, Koyama T, Momose Y, Ozawa T, Okano Y, Hayakawa Y (2015) Growth of InxGa1−xSb alloy semiconductor at the International Space Station (ISS) and comparison with terrestrial experiments. Npj Microgravity 1(1):1–6
Johnson DG, Verdicchio M (2017) Reframing AI discourse. Mind Mach 27:575–590
Johnson DG, Verdicchio M (2019) AI, agency and responsibility: the VW fraud case and beyond. AI & Soc 34:639–647
Johnson SK, Fitza MA, Lerner DA, Calhoun DM, Beldon MA, Chan ET, Johnson PT (2018) Risky business: linking Toxoplasma gondii infection and entrepreneurship behaviours across individuals and countries. Proc R Soc B Biol Sci 285(1883):20180822
Johnson SK, Johnson PT (2021) Toxoplasmosis: recent advances in understanding the link between infection and host behavior. Annu Rev Anim Biosci 9:249–264
Jones N (2023) OpenAI's chief scientist helped to create ChatGPT—while worrying about AI safety. Nature 624(7992):503
Joy B (2000) Why the future doesn't need us: our most powerful 21st-century technologies—robotics, genetic engineering, and nanotech—are threatening to make humans an endangered species. WIRED. https://www.wired.com/2000/04/joy-2/
Knight W (2023) What really made Geoffrey Hinton into an AI doomer. WIRED. https://www.wired.com/story/geoffrey-hinton-ai-chatgpt-dangers/
Kornai A, Bukatin M, Zombori Z (2023) Safety without alignment. arXiv:2303.00752
Kurzweil R (2005) The singularity is near: when humans transcend biology. Penguin
Leavitt D (2006) The man who knew too much: Alan Turing and the invention of the computer (Great Discoveries). WW Norton & Company
Lerner DA, Alkærsig L, Fitza MA, Lomberg C, Johnson SK (2021) Nothing ventured, nothing gained: parasite infection is associated with entrepreneurial initiation, engagement, and performance. Entrep Theory Pract 45(1):118–144
Lindahl C, Saeid H (2023) Unveiling the values of ChatGPT: an explorative study on human values in AI systems. KTH Royal Institute of Technology. https://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-329334
Malinowski B (1979) The role of magic and religion. In: Lessa WA, Vogt EZ (eds) Reader in comparative religion: an anthropological approach, vol 37. Harper and Row, New York, p 46
Martinez VO, de Mendonça Lima FW, De Carvalho CF, Menezes-Filho JA (2018) Toxoplasma gondii infection and behavioral outcomes in humans: a systematic review. Parasitol Res 117:3059–3065
Melnyk V, Melnyk A (2023) Analysis of methods, approaches and tools for organizing self-improvement of computer systems. In: 2023 13th International Conference on Advanced Computer Information Technologies (ACIT), pp 506–511
Metz C (2020) In two moves, AlphaGo and Lee Sedol redefined the future. WIRED, 16 March 2016
Miller JD, Yampolskiy R, Häggström O (2020) An AGI modifying its utility function in violation of the strong orthogonality thesis. Philosophies 5(4):40
Minsky M (1994) Will robots inherit the Earth? Sci Am 271(4):108–113
Montagu A (1961) Neonatal and infant immaturity in man. JAMA 178(1):56–57
Moravec H (1988) Mind children: the future of robot and human intelligence. Harvard University Press
Moravec H (1998) When will computer hardware match the human brain. J Evol Technol 1(1):10
Nivel E et al (2013) Bounded recursive self-improvement. arXiv:1312.6764
Olson K (1999) Aum Shinrikyo: once and future threat? Emerg Infect Dis 5(4):513
Omohundro SM (2008a) The basic AI drives. In: Wang P, Goertzel B, Franklin S (eds) Proceedings of the 2008 Conference on Artificial General Intelligence, vol 171. IOS Press, pp 483–492
Omohundro SM (2008b) The nature of self-improving artificial intelligence. Singularity Summit 2007. https://selfawaresystems.files.wordpress.com/2008/01/nature_of_self_improving_ai.pdf
Ord T (2020) The precipice: existential risk and the future of humanity. Hachette Books
Orgel LE, Crick FH (1980) Selfish DNA: the ultimate parasite. Nature 284(5757):604–607
Ororbia A, Friston K (2023) Mortal computation: a foundation for biomimetic intelligence. arXiv:2311.09589
Park PS, Goldstein S, O'Gara A, Chen M, Hendrycks D (2023) AI deception: a survey of examples, risks, and potential solutions. arXiv:2308.14752
Pinker S (2012) The better angels of our nature: why violence has declined. Penguin Books
Pinker S (2018) Enlightenment now: the case for reason, science, humanism, and progress. Penguin Books
Polechová J, Storch D (2008) Ecological niche. In: Encyclopedia of ecology, vol 2. Elsevier, Oxford, pp 1088–1097
Qirko HN (2017) An evolutionary argument for unconscious personal death unawareness. Mortality 22(3):255–269
Robinson WG (1997) Heaven's Gate: the end. J Comput-Mediated Commun 3(3):JCMC334
Rothblatt M (2015) Virtually human: the promise—and the peril—of digital immortality. Picador
Rupprecht CE, Hanlon CA, Hemachudha T (2002) Rabies re-examined. Lancet Infect Dis 2(6):327–343
Russell S (2019) Human compatible: AI and the problem of control. Penguin Books Limited
Salles A, Evers K, Farisco M (2020) Anthropomorphism in AI. AJOB Neurosci 11(2):88–95
Schmidhuber J (2023) Jürgen Schmidhuber's home page. https://people.idsia.ch/~juergen/
Schopenhauer A (1818) The world as will and representation
Sevilla J, Heim L, Ho A, Besiroglu T, Hobbhahn M, Villalobos P (2022) Compute trends across three eras of machine learning. In: 2022 International Joint Conference on Neural Networks (IJCNN), pp 1–8
Sherwin WB (2023) Singularity or speciation? A comment on "AI safety on whose terms?" [eLetter]. Science 381(6654):138. https://doi.org/10.1126/science.adi8982
Sotala K (2018) Disjunctive scenarios of catastrophic AI risk. In: Artificial intelligence safety and security. Chapman and Hall, pp 315–337
Stacey K, Milmo D (2023) No 10 worried AI could be used to create advanced weapons that escape human control. The Guardian. https://www.theguardian.com/technology/2023/sep/25/ai-bioweapons-rishi-sunak-safety
Stanovich KE, West RF (2000) Advancing the rationality debate. Behav Brain Sci 23(5):701–717
Stanovich KE, West RF (2004) Evolutionary versus instrumental goals: how evolutionary psychology misconceives human rationality. In: Over DE (ed) Evolution and the psychology of thinking: the debate. Psychology Press, pp 171–230
Szathmáry E (2006) The origin of replicators and reproducers. Philos Trans R Soc Lond B Biol Sci 361(1474):1761–1776. https://doi.org/10.1098/rstb.2006.1912
Tegmark M (2017) Life 3.0: being human in the age of artificial intelligence, 1st edn. Alfred A. Knopf
Tegmark M, Omohundro S (2023) Provably safe systems: the only path to controllable AGI. arXiv:2309.01933
Totschnig W (2019) The problem of superintelligence: political, not technological. AI & Soc 34:907–920
Totschnig W (2020) Fully autonomous AI. Sci Eng Ethics 26:2473–2485
Varki A (2009) Human uniqueness and the denial of death. Nature 460(7256):684
Varki A (2019) Did human reality denial breach the evolutionary psychological barrier of mortality salience? A theory that can explain unusual features of the origin and fate of our species. In: Shackelford T, Zeigler-Hill V (eds) Evolutionary perspectives on death. Springer, pp 109–135
Wiener N (1964) God & Golem, Inc.: a comment on certain points where cybernetics impinges on religion. The MIT Press. https://doi.org/10.7551/mitpress/3316.001.0001
Wikipedia contributors (2024) The Giving Pledge. In: Wikipedia. https://en.wikipedia.org/wiki/The_Giving_Pledge
Williams GC (1966) Adaptation and natural selection: a critique of some current evolutionary thought. Princeton University Press. https://doi.org/10.2307/j.ctv39x5jt
Williams GC (1993) Mother Nature is a wicked old witch! In: Nitecki MH, Nitecki DV (eds) Evolutionary ethics. State University of New York Press, pp 217–231
Yampolskiy R (2016) Taxonomy of pathways to dangerous artificial intelligence. In: Workshops at the Thirtieth AAAI Conference on Artificial Intelligence
Yampolskiy R (2020) On controllability of artificial intelligence. In: IJCAI-21 Workshop on Artificial Intelligence Safety (AISafety2021)
Yudkowsky E (2008) Artificial intelligence as a positive and negative factor in global risk. In: Rees MJ, Bostrom N, Cirkovic MM (eds) Global catastrophic risks. Oxford University Press, pp 308–345. https://doi.org/10.1093/oso/9780198570509.003.0021
Yudkowsky E (2016) The AI alignment problem: why it is hard, and where to start. In: Symbolic Systems Distinguished Speaker, 4
Yudkowsky E (2023) Pausing AI developments isn't enough. We need to shut it all down. Time. https://time.com/6266923/ai-eliezer-yudkowsky-open-letter-not-enough/
Zador A, Escola S, Richards B, Ölveczky B, Bengio Y, Boahen K, Botvinick M, Chklovskii D, Churchland A, Clopath C, DiCarlo J, Ganguli S, Hawkins J, Körding K, Koulakov A, LeCun Y, Lillicrap T, Marblestone A, Olshausen B, Tsao D (2023) Catalyzing next-generation artificial intelligence through NeuroAI. Nat Commun 14(1):Article 1. https://doi.org/10.1038/s41467-023-37180-x
Zangeneh-Nejad F, Sounas DL, Alù A, Fleury R (2021) Analogue computing with metamaterials. Nat Rev Mater 6(3):207–225
Zelikman E, Lorch E, Mackey L, Kalai AT (2023) Self-Taught Optimizer (STOP): recursively self-improving code generation. arXiv:2310.02304

Publisher's Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.