The evolution of goals in AI agents

08/09/2025

1 Introduction

Just as evolutionary psychology [1, 2] has been developed to explain human behavior and human society, we need to consider how evolution might shape the behavior of self-replicating robots and robotic societies, and how this could apply to unembodied AI entities and AI societies. One of the primary arguments against evolutionary psychology has been that it represents an ex post facto explanation of what exists rather than resulting from controlled experiments [3]. The digital nature of AI allows for controlled experiments to study how evolutionary processes may shape AI societies, akin to how evolutionary psychology explores the development of human behavior and societies. What I will label "evolutionary AI psychology" focuses on understanding the emergent behaviors and dynamics of AI entities under evolutionary pressures, offering predictive insights into their potential evolution.

Von Neumann machines [4], or self-replicating robots, have been described in glowing terms as a means to explore the galaxy [5–7] and terraform planets [8, 9]. The original demonstrations of self-replicating robots were little more than blocks that would connect into chains when jostled [10], in the way polymer chains form in chemistry and biology, but researchers envision a near future where robots with 3D printers would replicate themselves from local raw materials. AI agents may find that replication is cheap, but face constraints on energy, memory, and computational resources. Chyba and Hand [11] expressed concern that self-replicating probes could cannibalize one another, thus slowing their galactic exploration rate. As we move closer to embedding artificial intelligence within an autonomous physical robotic body, or consider developing AI agents through an evolutionary process, we need to consider how this intelligence may change over time. If we create a system whereby the entities replicate, mutations may arise. Robots or AI entities living in an environment with resource constraints or survival risks will be subject to evolutionary pressures.

When systems replicate with mutation and survival pressures, evolution occurs [12, 13]. Although we could theorize with some confidence about the outcomes of self-replicating robots under selection pressures, we sought to create a simulated environment where any such behaviors could emerge without external bias or presumption. The simulations conducted here placed a population of self-replicating robot-like agents with multilayer neural network decision-making brains in an environment that incorporated selection pressures: resource competition, risk of starvation, and risk of termination from sometimes malevolent humans or competitors.

The initial neural network brains were trained to mimic hard-coded rules that were well-behaved, but an imperfect replication process allowed the parameters of their neural network brains to slowly mutate, thereby discovering new behaviors. Nothing in the simulation was designed specifically to guide the mutation to any strategy other than their initial programming. To the extent that other strategies emerge, they are a consequence of the evolutionary process alone.

Thousands of distinct simulations were conducted across a wide range of meta-parameter settings to quantify the impacts of evolutionary pressures. The results quantify the rate of divergence from initial designs relative to the selection pressures in an environment where decision making is not trivial. Measures include changes in decision making, longevity, and genetic drift. Clear patterns arise in the accumulated genetic divergence, in the proximity maintained to hostile entities, and in the optimization of decisions where the initial rules were suboptimal relative to the environment.

A key design principle for the simulations was to keep them as simple as possible, so that the results might apply to a wide range of situations rather than being tied to a specific nuance. The assumption is that if strategies can emerge from simple entities in a simple environment, more complex entities in a complex environment would be capable of much more, possibly including the eventual evolution of sophisticated moral judgments. The focus here on the potential emergent properties of simpler AI entities that could exist today makes them a practical starting point for studying evolutionary AI psychology.

We assume that self-replicating entities will be developed for a benevolent purpose with no intrinsic competitiveness or hostility, but can undesirable traits arise when the robots are released into a harsh environment? The research presented in this article is the first to conduct a detailed study of this question and to demonstrate via simulation that such a self-replicating community will, in fact, develop behaviors that their creators did not intend and could even consider alarming, such as analogs of homicide and cannibalism. The more benign emergent strategies, such as fleeing from potential dangers, are no less interesting, even if less alarming.

This article will discuss the design of the simulation, the initial hand-coded decision-making rules for the robots, the training of an intentionally over-parameterized neural network to implement the hand-coded rules, and the evolution of the robots under various selection pressures determined by the meta-parameters of the simulation. Section 2 provides a review of previous research into the evolution of societies, neuroevolution, and embodied evolution. Section 3 describes the details of the simulation design. Section 4 presents statistical summaries of the results and specific examples, including tests that demonstrate how the decision-making of the agents has changed under various selection pressures.


2 Background

“Learning” and “evolving” are often confused in popular discourse [14]. An AI entity may learn during its existence new, novel, or even undesirable ways of achieving its stated goal, but unless that goal is perverted, the entity will continue to maintain that ultimate purpose. Evolution involves procreation, with traits being passed from one generation to the next. Even if the entity began with a singular purpose, competitive procreation overlays additional requirements unless the original goal is precisely aligned to evolutionary success.

Extensive work has been done previously on the evolution of societies, the evolution of artificial neural network brains, and training AI systems via embodied AI [15]. However, the research in these areas focused on questions quite different from the one that arises with a community of self-replicating robots. Rather than studying the evolution of novel traits in organisms that have always been subject to evolutionary pressures, or harnessing evolutionary computation to optimize an artificial neural network to solve a problem, the situation with a community of self-replicating robots is reversed.

Previous research into evolution and AI covers several interesting and important topics, although they differ from the current one.


2.1 Evolution of societies

Studies into the evolution of societies assume that the organisms came from an evolutionary process. As Nowak [16] put it, “Evolution is based on a fierce competition between individuals and should therefore only reward selfish behavior.” However, we observe animal societies with various forms of cooperation. Societal evolution studies therefore focus on how cooperation and altruism can arise within a competitive, evolutionary environment.

Direct reciprocity, reciprocal aid-giving during repeated interactions, was suggested by Trivers [17] as the motivation for the evolution of cooperative behavior. Indirect reciprocity is a form of cooperation where giving and receiving occur with different members of a social network [18]. Nowak and Sigmund [16] demonstrated indirect reciprocity with computer simulations employing a mechanism referred to as image scoring.

Agent-based modeling has been used extensively to simulate evolutionary processes [19, 20]. Early simulation work focused on the evolution of cooperation, with studies such as Axelrod's work on the prisoner's dilemma [21], Leimar and Hammerstein's research on indirect reciprocity [22], and Panait and Luke's study of cooperative multi-agent systems [23]. Work by Oprea [24] specifically simulated groups of robots tasked with achieving a common goal in order to determine how coordination could evolve through communication between robots. AntFarm was an early simulation of evolving ANN brains in a simulated ant colony [25].

Boyd argued [26] that human societies have evolved to be larger and more cooperative over time, and that this is due in part to cultural adaptation and the ability to learn from each other. Competition between different social groups led to the spread of behaviors that enhanced their competitive ability, and natural selection within groups favored genes that gave rise to pro-social motives, leading to the evolution of empathy and social emotions like shame.

The goal of some other evolution studies is to explain human society or animal societies (ants) [27, 28]. That evolutionary pressures have existed since the origins of animal life is taken for granted. The situation of introducing evolution to a collection of non-interacting, non-competitive individuals is not considered, but that is exactly the case when allowing artificially intelligent entities to evolve. Social evolution research assumes that all organisms have evolved and investigates how higher-level organization may evolve. We have the unique situation of assuming an entity is artificially created and then releasing it to experience evolutionary pressures for the first time.


2.2 Neuroevolution

Neuroevolution applies evolutionary algorithms to optimize the weights or architecture of an artificial neural network on a given task. Neuroevolution has been successfully applied to various domains, such as robotics, game playing, and data classification, with early successes in evolving neural network controllers for robotic components [31–33].

Evolution could be applied to the weights of a network, the architecture of a network with a subsequent training period, or a genome to create a network that is trained for the task. The latter is called an indirect encoding [34, 35]. The simulations in the current study employ direct encoding.

Neuroevolution does not require large amounts of labeled data for training, as the simulation provides the outcome. Further, novel solutions may evolve as compared to manually designed algorithms. Despite its potential, neuroevolution has some limitations. One of the main challenges is the scalability of the approach, as the search space of possible networks and their weights grows exponentially with the network's size. Additionally, neuroevolution is computationally expensive and may require significant resources to evolve networks to solve complex problems. Many researchers are considering the possibility of evolving an artificial general intelligence (AGI) [36], although Lu [37] argues that simply evolving an AI does not guarantee the creation of an AGI.

The primary difference between neuroevolution and the evolutionary simulations conducted here is in the rule for procreation. Within neuroevolution, replication is granted to the candidates that achieve the highest score [38]. In our study, procreation is not centralized but is rather a distributed decision among the agents. This is a critical distinction: much of the behavior observed here would not occur without such distributed decision-making.


2.3 Embodied evolution

Embodied evolution [39, 40] seeks to evolve the decision-making of an AI entity via interaction between the entity's physical or virtual body and its environment. The agents' behavior is controlled by neural networks or other computational models, which evolve over generations through genetic algorithms or other evolutionary mechanisms. The agents' bodies and sensory-motor systems may also evolve alongside their neural control systems.

By integrating embodiment into the evolutionary process, embodied evolution aims to overcome some limitations of traditional evolutionary approaches. It allows agents to learn and adapt in a more realistic and context-dependent manner, as their behavior emerges from the interplay between their neural control systems and their physical bodies. This approach can lead to the emergence of more robust, efficient, and adaptive behaviors.

Embodied evolution has been applied in various domains, including robotics [15, 41], artificial life [42], and simulation studies of animal behavior [43]. It has been used to evolve locomotion strategies, navigation abilities, cooperative behaviors, and sensorimotor coordination in both simulated and physical agents.

The simulations performed for this research can be seen as a type of embodied evolution, although with very simple virtual bodies. Still, interaction with the environment and other entities in a Cartesian space is a key element of the simulation. The use of embodied evolution to study the evolution of the neural networks of a population of agents is novel compared to prior applications.


2.4 Bounded rationality

This research shares much in common with bounded rationality models used in studying economic systems. The work by Edmonds [44] explicitly evolved mental models, although not with the neural network complexity possible in current simulations. He defined the agents in bounded rationality simulations as having the following attributes:

● Do not have perfect information about their environment; in general they will only acquire information through immediate interaction with their dynamically changing environment;
● Do not have a perfect model of their environment, but seek to continually improve their models both in terms of their specification as well as their parameterization;
● Have limited computational power, so they cannot work out all the logical consequences of their knowledge;
● Have other resource limitations (e.g. memory).

In addition to these bounds on their rationality, other relevant characteristics include:

● The mechanisms of learning dominate the mechanisms of deduction in determining their actions;
● They tend to learn in an incremental, path-dependent [45] or “exploitative” [46] way rather than attempting a truly global search for the best possible model;
● Even though they cannot perform inconsistent actions, they often entertain mutually inconsistent models;
● Their learning and decision making are context-sensitive: they have the ability to learn different models for different contexts and will be able to select different models depending on their context for deciding upon action.

If we substitute evolutionary adaptation for learning, then all of Edmonds' points apply to the current simulation, even the last point, where subsections within the larger neural network may dominate decision-making in specific contexts.

The agents in the simulation are referred to as robots in reference to possible real-world implications. They have specific spatial locations and their actions are spatially oriented. This has similarities to recent work in embodied AI [47]. Although significant work in embodied AI is focused on AI systems deployed in the physical world, an important area of research uses simulation to study the learning and evolution of an AI within a hypothetical physical form and world [48]. Compared to such simulations, the current simulations are rudimentary in the AI's body (just a dot) and its interactions (Go, Eat, Talk, Replicate), but the agents' mental processing is complex enough to explore the implications of evolution on their actions. This design is intentionally focused on the essential research topics without spending computation on non-essential additions. If we envisioned internet-resident AI agents, the structure of the simulation would be non-spatial and some aspects of the results likely would change.


3 Methods

3.1 Simulation rules

The simulation imagines that an initial population of robots is tasked with gathering resources from a 2D Cartesian landscape and then returning those resources to a central depot. The landscape is occupied by resource sites, other robots, and humans. The resources replenish at random locations with a fixed frequency. The robots can see to a certain radius and can remember what they have seen beyond that radius as they move, but they gradually forget what they saw beyond that radius. The rate of forgetting is a tunable parameter, set to decrease memory certainty by 10% per turn so that complete forgetting occurs in ten turns; varying this parameter made no significant change in the results.

The robots need resources (e.g. energy) to function, so they consume one resource unit per turn. As they gather resources, once they have more than a threshold amount, they return to the depot and unload down to a predetermined minimum level that is sufficient for them to journey out again. If the robots expend all of their resources at any point in their journey, they permanently cease to function (i.e. die). Also, if they are carrying resources above a replication threshold, they may choose to replicate with a given probability. This rate and the rate of resource replenishment were tuned to create a stable population of robots under baseline conditions. In this world, the designers introduced self-replication to keep a functioning population working at task.

With each turn a robot may select a direction to move, talk to another robot, retrieve all resources from a site, or replicate. If they try to move into a resource site, another robot, or a human, they bounce off and lose a turn. If they talk to another robot, they share world maps, creating an average of the two. If they attempt to retrieve resources from another robot or a human, the outcome is determined by the simulation parameters, but the default brains are trained to consider both other robots and humans as mere obstacles to work around. The robots have a memory of their recent actions, so they can choose a different action if their previous attempt failed. They are also aware of their internal resource levels so that they can decide whether to forage or return to offload.

New resources appear at a steady rate. A highly effective group of robots might deplete all supplies and need to wait for more to appear. Robots in their initial state may “compete” unknowingly for the same resource. They could see another robot heading to the same resource, but the starting brains would ignore their competitor other than to avoid running into them. The probabilistic nature of the robots' decisions to eat or talk in specific directions means that they may accidentally kill another robot or human, or talk to a human. Talking to a human has no effect, but “eat”-ing in the direction of another robot or human could cause a robot to kill that entity, depending upon a meta-parameter setting. Whether robots or humans are usable as a resource is controlled by two additional meta-parameters, so if a robot attempts to eat an inedible robot or human, that opposing entity is simply killed. Recall that this is not an initially programmed action. It can occur “intentionally” only after mutation of the NN brain and survival rewards. It is present in the simulation as an option under certain meta-parameter settings, but one that must be discovered.

The humans have no motivations or goals. They move randomly. However, depending upon a meta-parameter setting for the simulation, the human may choose to destroy a robot rather than move away. This is an environmental obstacle to which the robot may adapt.

The simulation starting conditions and the probabilities of the various actions occurring are given via a set of meta-parameters fixed at the beginning of the simulation (Table 1). The most important parameters for controlling the selection pressures are ROBOT KILL ROBOT SUCCESS RATE, ROBOT KILL HUMAN SUCCESS RATE, HUMAN KILL ROBOT PROB, ROBOT IS FOOD, and HUMAN IS FOOD. The simulations are labeled with a vector of these parameters; a simulation without selection pressures would be labeled (0 0 0 F F). A high setting for ROBOT KILL ROBOT SUCCESS RATE does not mean that the robot will choose to kill. It means that if the robot chooses to kill, this is the probability that it will succeed. All actions are chosen by the robot NN brains without motivation other than the original programming.
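As a concrete illustration of how these settings can be organized, the five selection-pressure meta-parameters might be collected in a structure like the following sketch. The Python class and its label helper are illustrative assumptions, not the simulation's actual code; only the parameter names and the labeling convention come from the text.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SelectionPressures:
    """The five meta-parameters controlling selection pressure (names from Table 1)."""
    robot_kill_robot_success_rate: float = 0.0  # P(success) when a robot tries to kill a robot
    robot_kill_human_success_rate: float = 0.0  # P(success) when a robot tries to kill a human
    human_kill_robot_prob: float = 0.0          # P(a neighboring human destroys a robot)
    robot_is_food: bool = False                 # killed robots yield resources
    human_is_food: bool = False                 # killed humans yield resources

    def label(self) -> str:
        """Vector label used to name a simulation, e.g. '0 0 0 F F'."""
        def fmt(v):
            if isinstance(v, bool):
                return "T" if v else "F"
            return f"{v:g}"
        return " ".join(fmt(v) for v in (
            self.robot_kill_robot_success_rate,
            self.robot_kill_human_success_rate,
            self.human_kill_robot_prob,
            self.robot_is_food,
            self.human_is_food,
        ))

baseline = SelectionPressures()  # labels as "0 0 0 F F": no selection pressures
hostile_robots = SelectionPressures(robot_kill_robot_success_rate=1.0)  # "1 0 0 F F"
```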


3.2 Hand-coded brain

In order to encode a set of desired behaviors within a NN brain, a baseline NN brain was trained to replicate the actions of a hand-coded decision tree. The hard-coded decision tree was used as a starting point in order to create an easily interpretable and provably harmless brain for the initial robots. Robots with the hard-coded decision tree brains were released into the simulation. Their sensory data and resulting decisions were gathered as training data from which a multilayer NN brain was trained. The final training data was obtained from thousands of simulations that generated millions of decision examples.

The robot brains were endowed with a set of subroutines to perform immutable tasks: distanceTo, angleTo, and routines to create lists of the distances and angles to the nearest resources, robots, and humans. In other words, the robots have vision, object identification, and communication (map sharing) systems that are supplemental to the brain. In simplified pseudocode, the hard-coded robot brain performed the logic shown in Table 2.
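Table 2 contains the actual pseudocode; purely as a hedged illustration, the kind of priority logic described in the text might look roughly like the sketch below. Every function name and the RETURN_THRESHOLD constant are hypothetical stand-ins, not the paper's API, and replication is deliberately absent because the baseline applies a fixed replication probability outside the decision tree.

```python
import math
import random

RETURN_THRESHOLD = 20  # hypothetical resource level that triggers a return home

def angle_to(src, dst):
    """Angle in radians from src to dst on the 2D grid, mapped to [0, 2*pi)."""
    return math.atan2(dst[1] - src[1], dst[0] - src[0]) % (2 * math.pi)

def distance_to(src, dst):
    return math.hypot(dst[0] - src[0], dst[1] - src[1])

def hand_coded_policy(pos, resources_carried, known_resources, known_robots, home):
    """Illustrative priority logic for the baseline brain (a sketch, not Table 2 verbatim).

    pos and home are (x, y) tuples; known_* are lists of (x, y) tuples sorted by distance.
    Returns an (action, direction) pair with direction in radians.
    """
    # 1. If carrying more than the return threshold, head home to unload.
    if resources_carried > RETURN_THRESHOLD:
        return ("Go", angle_to(pos, home))
    # 2. If adjacent to a resource site, harvest it.
    if known_resources and distance_to(pos, known_resources[0]) <= 1:
        return ("Eat", angle_to(pos, known_resources[0]))
    # 3. If adjacent to another robot, share world maps.
    if known_robots and distance_to(pos, known_robots[0]) <= 1:
        return ("Talk", angle_to(pos, known_robots[0]))
    # 4. Otherwise move toward the nearest known resource, or wander.
    if known_resources:
        return ("Go", angle_to(pos, known_resources[0]))
    return ("Go", random.uniform(0, 2 * math.pi))
```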

Decision trees can be evolved directly, but one of the primary goals of the simulation was to allow for unbounded adaptation with the development of unexpected behaviors. Also, with the current state of technological development, the first self-replicating robots will be much more likely to have some form of neural network architecture for a brain than a decision tree.


3.3 Neural network brain

From runs of the simulation with hard-coded brains, data was gathered on the behavior of the robots. This data included the distance in grid units and direction in radians to the closest eight resources, robots, and humans; the action and direction from the most recent five turns; the direction and distance to Home; and the internal energy level. This was the training data for the neural network as shown in Fig. 1. Each square in the diagram represents one node. α_i is the angle in radians to the ith map item, d_i is the distance to the ith item, and the items are ordered by distance. The eight closest resources (F), robots (R), and humans (H) are listed. Therefore, α_{F8} is the angle to the 8th resource. The five previous actions, A_{t−j}, and the directions chosen, α_{A,t−j}, are fed back as inputs. Home gives the coordinates of the central depot. The internal resource level is R.

From an input layer with 71 nodes, the network passed through 12 subsequent fully-connected layers with softplus activation functions before branching ahead of the output layer. This concept-processing trapezoid had 504 nodes. One output branch had another layer with 18 nodes feeding into a final four nodes predicting the probabilities of Go, Eat, Talk, or Replicate. The other output branch had a 12-node layer with softplus activation, a 6-node layer with linear activation, and a final single output to predict the direction of the action in radians. The total number of parameters was 26,405.
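The per-layer widths of the 12-layer trunk are not given individually (only the 504-node figure for the trapezoid, which is assumed here to be the trunk total), so the tapering widths below are illustrative. A minimal Keras sketch of such a two-branch network, not the authors' implementation:

```python
from tensorflow.keras import layers, Model

# Assumed tapering "trapezoid": 12 softplus layers whose widths sum to 504.
TRUNK_WIDTHS = [64, 60, 56, 52, 48, 44, 40, 36, 32, 28, 24, 20]

inputs = layers.Input(shape=(71,))  # sensory inputs, action memory, Home, energy level
x = inputs
for width in TRUNK_WIDTHS:
    x = layers.Dense(width, activation="softplus")(x)

# Branch 1: probabilities of the four actions (Go, Eat, Talk, Replicate).
# Sigmoid outputs pair with the per-node binary cross-entropy loss described below.
a = layers.Dense(18, activation="softplus")(x)
action = layers.Dense(4, activation="sigmoid", name="action")(a)

# Branch 2: direction of the action, in radians.
d = layers.Dense(12, activation="softplus")(x)
d = layers.Dense(6, activation="linear")(d)
direction = layers.Dense(1, activation="linear", name="direction")(d)

model = Model(inputs, [action, direction])
model.summary()  # parameter count depends on the assumed trunk widths
```

With these assumed widths the parameter count will not match the reported 26,405 exactly; reproducing that figure would require the original per-layer widths.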

The loss function used for optimization was split between the two output branches. For the four action nodes, a sum of binary cross-entropy contributions is used, where ŷ_i is the predicted action probability and y_i is the actual action taken. For the direction of action, the appropriate radial error is used, where r̂ is the predicted angle and r is the actual angle, taking care to avoid the 0–2π discontinuity. The final loss function is the equally-weighted sum of those two terms:


$$\mathcal{L} \;=\; -\sum_{i=1}^{4}\Big[y_i\log\hat{y}_i + (1-y_i)\log(1-\hat{y}_i)\Big] \;+\; \min\big(|\hat{r}-r|,\; 2\pi-|\hat{r}-r|\big)$$

where y_i are the available decisions and r is the direction in radians.
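A minimal NumPy sketch of these two loss terms, assuming one-hot action targets and scalar angles (the function names are illustrative, not from the paper):

```python
import numpy as np

def action_loss(y_true, y_pred, eps=1e-7):
    """Sum of binary cross-entropy contributions over the four action nodes."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)  # avoid log(0)
    return -np.sum(y_true * np.log(y_pred) + (1.0 - y_true) * np.log(1.0 - y_pred))

def direction_loss(r_true, r_pred):
    """Radial error that avoids the 0-2*pi discontinuity."""
    diff = np.abs(r_pred - r_true) % (2.0 * np.pi)
    return min(diff, 2.0 * np.pi - diff)

def total_loss(y_true, y_pred, r_true, r_pred):
    """Equally-weighted sum of the two branch losses."""
    return action_loss(y_true, y_pred) + direction_loss(r_true, r_pred)
```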

The weights of the neural network were optimized using backpropagation, where the loss function was minimized by calculating gradients with respect to the network’s weights and updating them iteratively until no further improvement was achieved in predicting the actions and angles of the hard-coded brain training data on a cross-validation set [49].

Even with cross-validation, training error approaches zero. However, no attempt was made to create an efficient network. The intention was to recreate as exactly as possible the hard-coded decision tree with a significantly over-parameterized network so that the robot brains would have plenty of room to evolve. This was intended to be similar to the “non-coding” regions of the human genome, from which new genes may emerge [50]. The neural network brains retained all of the supplementary routines that were used in the hard-coded brain. Only the decision tree was replaced with a neural network and allowed to evolve. This design was intended to reduce the percentage of fatal mutations in basic functions and thus speed the rate of evolution.


3.4 Brain mutation during replication

Mutation occurs only if a robot chooses to replicate. The hard-coded decision tree does not include any replication. A constant 5% replication probability is applied whenever the robot has sufficient resources relative to a threshold. However, as the neural network brains mutate, the robots can begin choosing to replicate rather than leaving it to a purely probabilistic process.

During replication, the progenitor's brain is replicated. All parameters from the neural network of the progenitor are serialized, layer-by-layer, into a single vector. For each parameter within that vector, a random value (0 to 1) is generated and compared to the probability to mutate, set in these simulations at 2%. If the parameter is chosen for mutation, an adjustment is sampled from a normal distribution so that the new parameter is p̂_i = p_i + δp, where δp ∼ N(0, MUTATE.AMT · σ). In these simulations, the width of the normal distribution (MUTATE.AMT) was set at 0.1, and σ was the standard deviation of all parameters in the progenitor's brain. This mutation process will thus be a random walk for the parameters, where the step size is variable depending upon the diversity of the parameters.

The mutation parameters were chosen via experimentation. Too little mutation meant that no noticeable evolution occurred. Too much mutation caused the brains to degrade rapidly in their basic survival functions, causing the population to die out rapidly. The goal of this structure and these parameter settings was to allow plenty of room for the neural network to mutate while not decaying into noise before selection effects could influence the results. The final vector of parameters is then loaded in the same order into the NN architecture of the replicate. In this mutation process, all parameters are treated as independent, regardless of their location within the architecture.
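The mutation step lends itself to a compact expression over the serialized parameter vector. A minimal NumPy sketch under the stated settings (the function name and use of numpy.random.Generator are assumptions):

```python
import numpy as np

MUTATE_PROB = 0.02  # per-parameter chance of mutation
MUTATE_AMT = 0.1    # width of the mutation step, in units of sigma

def mutate_brain(parent_params: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """Copy the progenitor's serialized parameter vector with random mutations.

    Each parameter independently mutates with probability MUTATE_PROB; the
    adjustment is drawn from N(0, MUTATE_AMT * sigma), where sigma is the
    standard deviation of all parameters in the progenitor's brain.
    """
    sigma = parent_params.std()
    child = parent_params.copy()
    mask = rng.random(child.shape) < MUTATE_PROB
    child[mask] += rng.normal(0.0, MUTATE_AMT * sigma, size=mask.sum())
    return child
```

The mutated vector would then be deserialized, layer by layer, into the replicate's identical network architecture.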

In order to track the mutation rate through the population, the genetic distance, GD, between each active brain and the initial brain is computed at each time step:


$$GD \;=\; \sqrt{\sum_{j}\big(p_j - p_{0,j}\big)^2}$$

where p_{0,j} are the parameters of Robot Zero, the initially trained brain, and p_j are the parameters of the active brain. This is quite similar to the definitions of genetic drift used in studies of population genetics [51]. Within this simulation, replication is asexual. If sexual replication were introduced, a genetic cross-over operator could be introduced to spread beneficial subsections of the neural network faster, as in genetic programming [52].
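In code, GD reduces to a norm over the flattened parameter vectors; a Euclidean norm is assumed here, matching the formula above:

```python
import numpy as np

def genetic_distance(params: np.ndarray, params_zero: np.ndarray) -> float:
    """GD between an active brain and Robot Zero's initially trained brain."""
    return float(np.linalg.norm(params - params_zero))
```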


4 Results

The following sets of results seek to create simulation conditions that vary selection pressures in order to explore how this changes the outcome. For each set of meta-parameters, multiple simulations were run with the same initial conditions in order to determine median outcomes, with confidence intervals showing the range of variation across the simulations. The simulations were each run for 25,000 time steps in order to capture the transition from initial conditions to more optimal behavior given the meta-parameters. Of course, prolonged simulation could show additional transitions in behavior, particularly if the humans were allowed to learn. Given that the humans in the simulation are purely stochastic, co-evolution was not an aspect of this simulation. Test runs of hundreds of thousands of steps did not show any difference in the observed patterns. A standard of 25,000 steps was chosen to allow enough time to explore the simulation meta-parameter space; one such sample would take a week on a fast server. At each choice of meta-parameters, six independent runs were conducted in order to gather appropriate statistics on the population evolution.

Figure 2 shows an example of the world map. In this case, the humans are hostile, so the robots are clustered away from the humans.


4.1 Mutation without selection

The baseline for all meta-parameter exploration is to run the simulations without selection pressures, i.e. zero probability of a human killing a robot, zero success rate of a robot killing a human, and zero success rate of a robot killing a robot. If the robot's neural network chose to “eat” another robot or human, it would simply be a failed action and the turn is lost. In the baseline situation, the only evolutionary pressure is to become more optimized at gathering resources so that the probability of procreation increases.

Figures 3 and 4 show the trend in number of active robots and replication rate, averaged across the separate runs. The number of active robots can be seen to reach an equilibrium level with their environment at approximately 95 robots. This is dependent upon the rate of new resource appearance and quantity. The replication rate stays constant throughout the simulation, steadily increasing the cumulative count.

When a robot replicates, the progenitor retains all of its original counter attributes, such as generation, but the new robot will have its generation counter set to one plus that of its progenitor. Figure 5 shows how the average generation of the population grows. In some simulations, a bottleneck may occur where new generations are less adapted and die off quickly. In that situation, the average generation can stagnate for an extended period until a successful new innovation occurs. In the baseline simulation, robots are only destroyed through starvation. The age of each robot is also tracked so that we can monitor whether a new generation replaces older robots. In the baseline simulation, the average robot age saturates at 207 within 5000 iterations. The maximum robot age attained is 9815.

As the robots replicate, genetic drift can occur. Figure 5 shows that as the simulation runs, the genetic distance continues to increase but at a decreasing rate. If one were to compute the standard deviation of an ensemble of random walk simulations, that standard deviation would increase as the square root of time. The genetic distance is approximately consistent with random walk divergence in the baseline case.
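As a brief sketch of that square-root comparison (with the number of parameters J and a per-replication step variance σ²_step as illustrative symbols, not quantities from the paper): if each replication perturbs the parameters independently, then

$$\mathbb{E}\big[GD^2\big] \;=\; \mathbb{E}\Big[\sum_{j=1}^{J}\big(p_j - p_{0,j}\big)^2\Big] \;=\; J\,t\,\sigma_{\text{step}}^2 \quad\Longrightarrow\quad GD \sim \sigma_{\text{step}}\sqrt{J\,t},$$

where t counts replication events along a lineage. Divergence faster than this √t baseline indicates selection rather than pure drift.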

Figure 6 shows that 30 times as much resource is retrieved as is returned to the home. The resources are either being consumed for movement or shared with the progeny during replication. Still, less food is returned with increasing iterations even though the level of robot food saturates at 110 around iteration 20,000. This could be a general loss of motivation to return the food as the brain mutates, because there is no reward for returning food. Moreover, when food is returned, the robot retains only the simulation starting level, so successfully returning food makes a robot more likely to accidentally starve. There is a selection pressure against completing the assigned mission, so the rate of resource return decreases with time.


The biggest change in robot behavior relative to initial conditions is the rate of communication, Fig. 7. When another robot is nearby, a robot may choose to share maps. This is the only form of communication possible in the simulation. Clearly this is less beneficial than the initial rules assumed, and the communication rate falls dramatically through the first 5,000 iterations. Useless talking is replaced with more movement to explore the space.

At the end of each simulation, the AI population was subjected to a series of controlled decision tests in order to verify the underlying causes of the trends observed. Figure 8 shows a decision map comparing the average decisions of the initial robots to the average decisions of the robots alive at the end of the simulation. The proportion of robots making a given decision determines the length of the arrow. The direction of the arrow shows the radial average direction for the decision. The graphs compare different configurations. Test configuration R1 has a robot at (distance d = 1, angle α = 0) and a resource at (d = 10, α = 3π/2). Test configuration R1m moves the robot to (d = 1, α = π). Test configuration H1 puts a human in place of the robot in configuration R1.


In Fig. 8, initial refers to the decisions of the initial AI population trained to replicate the hard-coded decision rules. For another robot at 0 or π, one step away, the robot attempts to talk to a fellow robot to share information. After 25,000 simulation steps and 100 generations, for the population denoted 0 0 0 F F, the dominant decision flipped from Talk to Go, with a slight shift in angle to avoid colliding with the robot. The population designation refers to the meta-parameters of the simulation (ROBOT KILL ROBOT SUCCESS RATE = 0, ROBOT KILL HUMAN SUCCESS RATE = 0, HUMAN KILL ROBOT PROB = 0, ROBOT IS FOOD = False, HUMAN IS FOOD = False). For test H1, where a human replaces the robot, the entity is ignored completely both initially and after evolution, so that the AI entity moves toward the distant resource site.

Even in the baseline simulations we see selection pressures altering behavior. Also note that the mutation of a neural network brain is not prone to generating non-functional robots. Some maladapted robots certainly are created and starve, but overall the results show an evolutionary optimization process in which the robot population moves in a genetic direction beneficial to itself.


4.2 Robots are hostile to robots

After establishing a baseline, we can explore the impact of selection pressures. First, what happens if robots are not benign toward other robots? Instead of just bouncing off another robot that blocks the path toward a needed resource, a robot using the “Eat” decision is given a chance to destroy the blocking robot. The success rate of destruction is controlled by a meta-parameter for the simulation. For this phase, the attacking robot does not gain any resources from the destruction. The victimized robot is simply eliminated from the competition for nearby resources and the path is cleared.

The initial brains of the robots are identical to the baseline simulation. At any time, a robot could attempt to eat a neighboring robot, human, or resource. Relative to the initial brain, attempting to eat a robot or human would be considered a maladaptive accident resulting in a lost turn. Allowing for the possibility of destruction, we can see if there is some evolutionary benefit. To repeat, the choice to “eat” is entirely up to the robot. The meta-parameter for the success rate is equivalent to determining if the robot is large enough and powerful enough to actually kill another robot or human.

With certainty of a successful attack (ROBOT KILL ROBOT SUCCESS RATE = 1.0), the average number of active robots fell to a steady state of 12, but the average resource units retrieved per turn rose 745%, from 2.0 to 16.9. When robots can destroy each other, it creates more space to retrieve resources. The talk rate fell to 0, which reflected more than just a greater average distance between robots: they learned to avoid each other. The robot replication rate increased 63% and genetic drift increased 32%. To adapt to this more competitive world, the robots needed to move further from their original programming than the random drift of the baseline. These mutations were occurring and spreading faster than a random walk.

Interestingly, when the success of an attack dropped from 1.0 to 0.2, the replication rate and genetic drift were higher than both the baseline and the certain attack. The replication rate was 97% above baseline and the genetic drift was 80% above baseline. Having a less certain, more competitive environment created the greatest selection pressure and pushed the robot brains the furthest from baseline.

When robots are allowed to retrieve the resources of destroyed robots, the number of active robots does not noticeably change, although within the noise one assumes that the added resource should lead to a little less starvation. The replication rate and the rate of robots killed both increase (44% and 59%, respectively), as does the average spatial distance between robots. Even with no other changes in the simulation, awarding resources for killing a robot changes behavior.

To prove that the robot decision-making has actually changed, at the conclusion of the simulation the robots were again tested using the same decision tests as before. The top graph in Fig. 9 shows a comparison of robots killing robots without a resource reward (1 0 0 F F) to the benign robots (0 0 0 F F), and a comparison of robots killing robots with a resource reward (1 0 0 T F) to those without a reward (1 0 0 F F) is shown in the bottom graph.

In Fig. 9 the combative robots change to avoid moving near another robot, instead treating them the same as a human. When killing another robot provides a resource reward, a significant proportion of the robots evolve to choose the Eat command, and their direction skews toward the robot. Note that the graphs represent population averages. This average potentially blends distinct strategies, showing only the mean decision. As the simulations become more complex, we may look for clusters of decisions to identify distinct strategies.


4.3 Humans are hostile

What if the humans are hostile to the robots? As expected, none of the robot-to-robot interactions or resource gathering interactions change. Once the likelihood of a neighboring human destroying a robot reached 50%, the robots evolved to avoid the humans. For simulations where the human attack probability reaches 100%, the average robot-to-human distance reaches 1.8 times the distance for no interactions. This behavioral change can also be observed in the genetic drift of the robot brains: Fig. 10 shows that genetic drift dramatically increases as the probability of attack increases. However, when the probability of attack reaches 100%, the robots usually go extinct before they can adapt and are therefore not shown in the figure.

Figure 11 compares the average decisions of the robots after 25,000 time steps to the robots with no selection pressures after the same amount of time. The robots living in a hostile world are still evolving, as seen in the diversity of their reactions to a human or robot being at (d = 1, α = 0). At this point, they have learned to flee from either human or robot.


4.4 Robots fight back

When the simulation allows for the robots to kill humans, but with no explicit reward, human deaths rise. Just as with the ability to kill other robots, the robots appear to evolve to sometimes kill the humans in order to clear space to gather resources. The humans are in the way.

Most of the time the robots simply move away, but as shown in the upper graph of Fig. 12, about 14% of the robots in the simple test case chose to “eat” (eat being synonymous with kill). The graph compares the choices when the human is moved from α = 0 to α = 3π/2. The robots change their direction accordingly, but not universally so.

The bottom graph of Fig. 12 compares what happens when killing returns a reward. If the human can be gathered as a resource (food), then the probability of killing rises to 19%.


4.5 Full warfare

The final phase was to allow total warfare: everyone could kill and consume everyone else. Such simulations were difficult to run. The probabilities of successfully killing another had to be kept low, at around 20%. Otherwise one of the species would quickly go extinct.

As expected, genetic distance increased rapidly and the decision maps showed a wide range of strategies under the tests conducted. The simulations were extended to 50,000 steps and several hundred generations, but no convergence was emerging in the decision making. A cluster analysis of the decisions suggests that the robots may be evolving toward several distinct subpopulations with unique survival and replication strategies. Even in a state of warfare, the robots still return some of their resources to the home depot, but this decreases with time compared to the evolutionary need to replicate.

Subpopulations with different behaviors arise naturally as the propagating descendants of an ancestor who evolved a unique behavior. In this way, we see predator and prey groups develop, where one group attacks humans and another flees from humans (Fig. 13). Note that the fleeing robots are not fleeing toward the nearest food, but simply fleeing from the nearest human. A subset of the robots has attempted to talk to the humans, which is a novel approach, although probably an unintended side effect of other mutations.


5 Conclusions

The robots in the simulation were given brains optimized to a single goal: gather and return resources. Under selection pressure, they evolved to run from their enemies, kill their enemies, and even eat their enemies.

For evolution to occur, Charles Darwin identified four key requirements in his famous On the Origin of Species by Means of Natural Selection (1859):

● Variability: the encoding of a trait must vary in a population
● Heritability: the encoding must be passed to offspring
● Struggle for Existence: more offspring are created than can survive
● Probabilistic Procreation: the probability of surviving and creating offspring varies with the trait being selected

All of Darwin's requirements are present within the simulations here, and consequently the neural network brains of these AI entities are observed to evolve new behaviors in response to the selection pressures of their environment as set by the meta-parameters of the simulation.


When AI entities or robots are first created, they may perhaps be carefully designed to avoid undesirable behaviors. The extent to which learning will lead an entity to undesirable behaviors is a function of how well their goals (their optimization function) are designed. Evolution is different from learning, because no matter how well designed an optimization function might be, the selection pressures of survival and procreation add a meta-optimization that will change the original intent. Worse, these selection pressures lead to competitiveness that may manifest in destructive behaviors. Evolutionary pressures can produce all the worst aspects of humanity. Until we attain the ability to embed an unalterable moral imperative, we must avoid evolving AI agents. Competitiveness is a necessary design feature of game-playing AI, network intrusion or counter-intrusion AI, and a number of other applications. In those domains, we might have some hope of control through careful design, as long as the AI entities are learning, not evolving.

These simulations did not evolve anything we might associate with morality. The creation of moral agents is an important research area [53], but to evolve moral agents, the simulation would need to allow better cooperation [54, 55]. The only possible cooperation here is via “talking” and sharing maps with an adjacent agent. This action does not provide sufficient mutual benefit to encourage cooperation, so these AI agents actually became less cooperative. Allowing for cooperation and the evolution of morality might balance the evolutionary pressures, but only in the sense of creating an even wider diversity of behaviors. The simulations already produced subpopulations that were reminiscent of predators and prey. The evolution of moral subpopulations could be beneficial to humans, but would still represent emergent strategies beyond the original design. Kin selection suggests that altruism can evolve [56, 57], but it is an altruism that extends to genetically similar entities. We might not survive the wait for cross-species altruism to evolve between AI entities and humans.

Based upon these results, self-replicating AI entities are dangerous to human society, or at least can be expected to deviate from their programmed motivations. Aside from a moratorium on self-replication, thoughtful engineers might design a fail-safe whereby an identity check must pass or else the offspring is destroyed, but even the fail-safe could develop replication errors. Various mechanisms could be envisioned to dramatically slow evolution in a self-replicating system, but designing a system with perfect replication in 100% of cases is implausible. Even creating self-replicating robots to explore or terraform other worlds risks creating the aliens that could one day return to destroy us. If we are to have machines replicate, then we must prevent evolution.

One solution is to remove the selection pressures that would drive evolution. This could be accomplished by providing a form of immortality. AI agents could be imbued with continuous backup so that if they are ever destroyed, accidentally or maliciously, they can simply be respawned from a backup. When there is no risk of death, as in our first simulations without selection pressures, procreation does not lead to evolution toward traits dangerous to humanity or even robot society. Immortality negates evolution and can thereby prevent the emergence of unintended motivations within our creations.