Artificial selection methods from evolutionary computing show promise for directed evolution of microbes

Alexander Lalejini
Emily Dolson
Anya E Vostinar
Luis Zaman

Curated by eLife

Evaluation Summary:

This paper addresses a notable gap between evolutionary computing and experimental evolution. While artificial and computational approaches have long been used as an analogy for biological systems (with studies that have produced findings relevant for evolutionary theory), few studies have used methods and results from evolutionary computing to directly inform the shape and structure of experimental evolution studies. This study's approach is creative, and its methods and results may be of use to both computational and experimental audiences. Lastly, this study could spawn future ones that draw even more connections between evolutionary computation/artificial life and evolutionary theory.

(This preprint has been reviewed by eLife. We include the public reviews from the reviewers here; the authors also receive private feedback with suggested changes to the manuscript. Reviewer #2 agreed to share their name with the authors.)

Abstract

Directed microbial evolution harnesses evolutionary processes in the laboratory to construct microorganisms with enhanced or novel functional traits. Attempting to direct evolutionary processes for applied goals is fundamental to evolutionary computation, which harnesses the principles of Darwinian evolution as a general-purpose search engine for solutions to challenging computational problems. Despite their overlapping approaches, artificial selection methods from evolutionary computing are not commonly applied to living systems in the laboratory. In this work, we ask whether parent selection algorithms—procedures for choosing promising progenitors—from evolutionary computation might be useful for directing the evolution of microbial populations when selecting for multiple functional traits. To do so, we introduce an agent-based model of directed microbial evolution, which we used to evaluate how well three selection algorithms from evolutionary computing (tournament selection, lexicase selection, and non-dominated elite selection) performed relative to methods commonly used in the laboratory (elite and top 10% selection). We found that multiobjective selection techniques from evolutionary computing (lexicase and non-dominated elite) generally outperformed the commonly used directed evolution approaches when selecting for multiple traits of interest. Our results motivate ongoing work transferring these multiobjective selection procedures into the laboratory and a continued evaluation of more sophisticated artificial selection methods.

Version published to 10.7554/elife.79665 on eLife
Aug 2, 2022
eLife
Jul 5, 2022

Author Response

Reviewer #2 (Public Review):

The authors develop a computational framework for the in silico evolution of "digital organisms" -- in short, programs capable of executing instructions (reading inputs, performing operations with them and producing outputs) and replicating, potentially generating variation ("mutations") in the set of instructions of their offspring. They use this framework to compare the success of various selection algorithms in producing populations of digital organisms capable of carrying out a set of functions (Boolean logic and basic math operations). They study whether different treatments yield different results, focusing on whether selection algorithms from evolutionary computing could outperform strategies typically applied in artificial selection experiments in the laboratory.

The authors' idea is original and intriguing. Their framework of "digital organisms directed evolution" could represent a powerful tool to further explore the potential transfer of strategies from the field of evolutionary computing to the field of microbial evolution. The inclusion of a "no selection" and a "random selection" control is very valuable (and not common in other studies on artificial selection at the population level). The sharp differences they find between selection schemes commonly used in the laboratory (elite, top-10% selection) and algorithms from evolutionary computing (lexicase, non-dominated elite, tournament selection) are interesting and could support the claim that the latter might be well suited for application to microbial evolution. However, I think there are some confounding factors that could be biasing these results, and these should be addressed so that the specific claims of the paper can be fully supported by the data.

My main concern has to do with the observation that some selection protocols (elite, top-10% and tournament) are unable to maintain diversity in the task profiles. I am left wondering whether this is truly a limitation of those protocols, or if it is a (perhaps a bit trivial) consequence of the more general experimental design. Specifically, when selecting the populations to propagate into the next "meta-generation", a sample of organisms is taken. This sample contains only 10 individuals (1% of the maximum population size of 1000). In my mind, this could mean that populations where all (or most) organisms can perform multiple functions (say, populations of "generalists") are favored over populations of "specialists" where, even if all (or most) functions were covered at the population level, this coverage relied on the coexistence of multiple "strains" that performed only a few functions each with little overlap across strains. In other words, the experimental design could be introducing a (perhaps unacknowledged) selective pressure favoring populations of generalists. In fact, the observation that the lexicase and non-dominated elite selection schemes seem to be able to overcome this potential bias and maintain a high diversity and spread of task profiles is interesting. However, I am not sure whether the relatively modest performance of the elite, top-10% and tournament protocols could be improved by lifting the selective pressure introduced at sampling.
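The sampling concern above can be made concrete with a small calculation. The sketch below is not taken from the paper's model: it assumes a hypothetical population made up of distinct specialist strains with fixed abundances, and asks how likely a sample drawn without replacement is to contain at least one organism from every strain. The function name and the strain counts are illustrative assumptions.

```python
from fractions import Fraction
from itertools import combinations
from math import comb


def prob_sample_covers_all(strain_counts, sample_size):
    """Probability that a sample drawn without replacement contains
    at least one organism from every strain (inclusion-exclusion).

    strain_counts: hypothetical number of organisms in each strain.
    sample_size:   number of organisms sampled from the population.
    """
    total = sum(strain_counts)
    denom = comb(total, sample_size)
    prob = Fraction(0)
    # Sum over every subset of strains that the sample misses entirely.
    for k in range(len(strain_counts) + 1):
        for missing in combinations(strain_counts, k):
            remaining = total - sum(missing)
            # math.comb returns 0 when sample_size > remaining.
            prob += (-1) ** k * Fraction(comb(remaining, sample_size), denom)
    return float(prob)


if __name__ == "__main__":
    # Hypothetical populations of 1000 organisms split evenly across strains.
    for n_strains in (1, 2, 5, 10):
        counts = [1000 // n_strains] * n_strains
        print(n_strains, prob_sample_covers_all(counts, 10))
```

Under these assumptions, a 10-organism sample can never cover more than 10 strains, and full coverage becomes less likely as the number of coexisting strains grows, which is the direction of the bias the reviewer describes.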

As a more minor comment, I think the paper could be made more easily accessible to readers outside of the field of evolutionary computing. I think a clearer analogy should be established early on between the behavior of the "digital organisms" in this work and that of real microbes. Although some aspects are straightforward (organisms are born, "execute a genetic program" and divide more or less efficiently depending on the instructions within that program), some details were difficult for me to understand. There are two problems with this. First, it is hard to build an intuition for what it means to "perform a function" or "mutate" in the context of a digital organism evolutionary process. It was also unclear to me whether the choice of giving functions a benefit at the population vs. at the individual level was arbitrary, or if it was somehow related to the intrinsic dynamics of the system. The meaning of "the environment" is also somewhat obscure: what exactly are "inputs"? Are the same inputs provided to every organism in every population and in every generation/meta-generation? How can the same program perform multiple functions? The answers to these questions were not obvious to me, and I had to carefully go through sections of the Supplementary Material to gain a sense of how these digital organisms behaved in practice. I think providing a more general intuition in this regard, even if at the expense of some details and technicalities, would help make the text more accessible to a broad audience.

The second problem with this is that it makes it difficult to extrapolate the conclusions to a microbial evolution context. The authors themselves acknowledge multiple limitations, particularly the lack of ecological interactions and the simplicity of the environment. While these are reasonable minimal assumptions, they most likely affect the results. In microbial populations, interactions are common even in the simplest environments. The environment itself is modified by the organisms, leading to the creation of new niches into which additional species can be selected or evolve. These processes are critical for the diversity and function of microbial populations -- and in fact, it could be argued that many collective functions emerge from individuals' interspecific interactions and are not necessarily present at any single organism level. I understand that including these more complex mechanisms falls out of the scope of this work, and I believe that the simpler model presented here is a valuable starting point. However, I do think that specific claims in the text such as "our experiments suggest that steering evolution at the population-level is more challenging than steering at the individual-level" should be avoided, since one could easily imagine that this is a result of the assumptions of this specific model. And, again, I think establishing a clearer analogy between digital organisms and microbes would make it easier for a broader audience to understand these limitations.

Thank you for your detailed summary and kind remarks. We very much appreciate all of your constructive feedback. In particular, thank you for identifying areas of our manuscript that could be made more accessible for a broader audience.

In addition to the changes we made to address your specific recommendations (below), we made edits throughout the manuscript to address the general feedback/concerns from your summary:

● I think a clearer analogy should be established early on between the behavior of the 'digital organisms' in this work and that of real microbes.

We made a number of edits to the description of digital organisms to help clarify this connection (see below). We also made edits throughout the manuscript in an effort to make it more easily accessible to audiences outside of evolutionary computing.

● …what it means to "perform a function" or "mutate" in the context of a digital organism evolutionary process.

We edited our description of how digital organisms perform functions, adding an example to improve clarity ("Digital Organisms" subsection of the Digital Directed Evolution section). We expanded our description of how digital organisms mutate. We also included a reference to [Wilke and Adami, 2002], which nicely overviews the "biology" of the type of digital organisms (self-replicating computer programs) used in our model.

● It was also unclear to me whether the choice of giving functions a benefit at the population vs. at the individual level was arbitrary, or if it was somehow related to the intrinsic dynamics of the system.

We agree that this was unclear. Roughly, the individual-level functions are simpler (i.e., require fewer instructions to encode) than the population-level functions. We clarified this in the caption for Table 1.

● What exactly are "inputs"? Are the same inputs provided to every organism in every population and in every generation/meta-generation? How can the same program perform multiple functions?

When a digital organism is "born", we randomly generate a set of numeric values that the organism can access by executing an 'input' instruction, which loads an input value into one of the digital organism's memory registers. The same inputs are not provided to every organism in every generation/population. Programs perform multiple functions by performing the requisite computations (Table 1) on the values they received via 'input' instructions and then executing an 'output' instruction. Each time an organism produces an output (by executing the 'output' instruction), we check whether that output is the correct result for one of the 22 designated functions (Table 1) given the set of inputs available to the organism. We further clarified how inputs work and how programs can perform multiple functions in the "Digital Organisms" subsection of the Digital Directed Evolution section.
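The input/output check described above can be sketched roughly as follows. This is an illustration only, not the authors' implementation: the 32-bit value width, the two-input setup, the function names, and the six functions listed are assumptions standing in for the 22 functions of Table 1.

```python
import random
from itertools import permutations

MASK = 0xFFFFFFFF  # assumption: inputs and outputs are 32-bit unsigned values

# Hypothetical stand-ins for a few of the designated functions (not Table 1).
FUNCTIONS = {
    "NOT": lambda a, b: ~a & MASK,
    "NAND": lambda a, b: ~(a & b) & MASK,
    "AND": lambda a, b: a & b,
    "OR": lambda a, b: a | b,
    "XOR": lambda a, b: a ^ b,
    "ADD": lambda a, b: (a + b) & MASK,
}


def make_inputs(rng, n=2):
    """Generate the random input values an organism receives at birth."""
    return [rng.getrandbits(32) for _ in range(n)]


def functions_matched(output, inputs):
    """Return the names of functions whose correct result equals `output`
    for some ordered pair of distinct inputs."""
    matched = set()
    for a, b in permutations(inputs, 2):
        for name, fn in FUNCTIONS.items():
            if fn(a, b) == output:
                matched.add(name)
    return matched


def run_organism(outputs, inputs):
    """Collect every function performed across all of an organism's outputs."""
    performed = set()
    for out in outputs:
        performed |= functions_matched(out, inputs)
    return performed


if __name__ == "__main__":
    rng = random.Random(1)
    a, b = inputs = make_inputs(rng)
    # An organism that outputs a & b and then a ^ b performs two functions.
    print(run_organism([a & b, a ^ b], inputs))
```

Because each output is checked independently against the organism's own inputs, one program can be credited with several functions simply by producing several different correct outputs during its lifetime.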

● The second problem with this is that it makes it difficult to extrapolate the conclusions to a microbial evolution context. The authors themselves acknowledge multiple limitations, particularly the lack of ecological interactions and the simplicity of the environment. While these are reasonable minimal assumptions, they most likely affect the results.

We absolutely agree that our simplifications influence our results and that adding the capacity for more interactions is a critical next step for this work. We would not be surprised if more sophisticated artificial selection protocols were even more useful in the context of more complex ecological interactions than in the simple environments we evaluated. For example, if we had a measure for community stability, we could directly select on stability as an independent objective while simultaneously selecting on community functions.

● I do think that specific claims in the text such as "our experiments suggest that steering evolution at the population-level is more challenging than steering at the individual-level" should be avoided.

This is fair. We intended to argue that steering evolution at the population-level (as is often done in directed microbial evolution) is more challenging overall than steering evolution in conventional evolutionary computing systems, where each individual in a population can be independently evaluated and the selection protocol has access to high-resolution information about the individual's phenotype/genome. We narrowed the scope of this statement to the following: "While results across these two contexts are not directly comparable, we found steering evolution at the population-level to be more challenging than steering at the individual-level (as in conventional evolutionary computing)."

Read the original source
eLife
Jun 23, 2022

Reviewer #1 (Public Review):

In directed microbial evolution, separate populations of microbes are evolved in the laboratory and evaluated for their ability to exhibit one or more desirable properties. High-performing populations are then selected and subdivided into new populations, which are allowed to further evolve. This process is generally costly in both time and laboratory expenses, making it difficult to optimize aspects of the process, such as the type of selection that is employed. In contrast, evolutionary computation is a type of optimization where solutions to computational problems are evolved in silico, through imperfect reproduction of selected individual candidate solutions over multiple generations. Because evolutionary computation is much cheaper and faster than microbial evolution, there has been considerable research studying how different types of selection impact the evolutionary process. The goal of the current study is to see if selection mechanisms that have been shown to perform well in evolutionary computation experiments may also improve directed microbial evolution.

To date, microbial evolution experiments have used forms of truncation selection, where one or more of the "best" performing populations are selected for subdivision and further evolution. However, truncation selection, where some percent of the best performing individuals are selected, is known to result in rapid loss of diversity and poor performance in evolutionary computation. In this paper, the authors compare various types of individual selection methods from evolutionary computation to simulations of multi-objective microbial selection at the population level, where 22 distinct binary (pass/fail) objectives are evaluated and contribute to the fitness of a population in various ways. Specifically, they compare 5 selection methods: (1) elitism (where only the best population is selected); (2) truncation selection (where the top 10% of populations are selected); (3) tournament selection (in each of multiple tournaments, the best population of 4 random populations is selected); (4) lexicase multi-objective selection (where each of the objectives is evaluated sequentially, in a randomized order, and only those populations that can solve the current objective are retained and evaluated on the next objective); and (5) non-dominated multi-objective elitism (where any population that is not Pareto dominated by another population is selected). The first two of these are the methods commonly employed in directed microbial evolution, and the last three are simple versions of selection methods known to perform well in evolutionary computation. For controls, they also compare to random population selection and no selection (where all populations are retained).
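The five selection schemes listed above can be sketched in a few lines each. This is a minimal illustration under stated assumptions, not the paper's code: each population is represented by a list of per-objective scores, aggregate performance is assumed to be the sum of those scores, each function returns one parent-population index per offspring slot, and all function and parameter names are hypothetical.

```python
import random


def elite(scores, n_parents):
    """Use the single best population (by summed score) for every slot."""
    best = max(range(len(scores)), key=lambda i: sum(scores[i]))
    return [best] * n_parents


def top_fraction(scores, n_parents, fraction=0.10):
    """Cycle through the top `fraction` of populations by summed score."""
    ranked = sorted(range(len(scores)), key=lambda i: sum(scores[i]), reverse=True)
    top = ranked[:max(1, int(len(scores) * fraction))]
    return [top[slot % len(top)] for slot in range(n_parents)]


def tournament(scores, n_parents, rng, size=4):
    """For each slot, keep the best of `size` randomly drawn populations."""
    parents = []
    for _ in range(n_parents):
        entrants = rng.sample(range(len(scores)), size)
        parents.append(max(entrants, key=lambda i: sum(scores[i])))
    return parents


def lexicase(scores, n_parents, rng):
    """For each slot, filter populations objective by objective in random order,
    keeping only those tied for best on the current objective."""
    parents = []
    for _ in range(n_parents):
        order = list(range(len(scores[0])))
        rng.shuffle(order)
        pool = list(range(len(scores)))
        for obj in order:
            best = max(scores[i][obj] for i in pool)
            pool = [i for i in pool if scores[i][obj] == best]
            if len(pool) == 1:
                break
        parents.append(rng.choice(pool))
    return parents


def dominates(a, b):
    """True if `a` is at least as good as `b` everywhere and better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def non_dominated_elite(scores, n_parents, rng):
    """Draw parents at random from the set of non-dominated populations."""
    front = [i for i, s in enumerate(scores)
             if not any(dominates(t, s) for t in scores)]
    return [rng.choice(front) for _ in range(n_parents)]


if __name__ == "__main__":
    rng = random.Random(0)
    # Four hypothetical populations scored on four pass/fail objectives.
    scores = [[1, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1], [1, 0, 0, 0]]
    print("elite:", elite(scores, 4))
    print("lexicase:", lexicase(scores, 4, rng))
    print("non-dominated:", non_dominated_elite(scores, 4, rng))
```

In this toy example the aggregate-score methods always favor population 0, whereas lexicase and non-dominated elite selection can also choose the two specialist populations that each cover an objective population 0 lacks. How selected populations are actually subdivided into the next generation is not modeled here.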

The authors clearly explain the methods for simulating microbial evolution, how population fitness and diversity are evaluated, how the various forms of selection are implemented, and how results are compared through rigorous and appropriate statistical methods. The results are clearly displayed in informative graphs, which are also textually described to help the reader understand what the graphs are showing.

The results convincingly show that the multi-objective selection methods, in particular lexicase selection, out-perform the other selection and control methods tested in simulated directed microbial evolution of populations evolved to successfully perform multiple objectives. Although these results are not particularly surprising, they are an important demonstration that multi-objective selection mechanisms known to perform well in selecting individuals in evolutionary computation also work well when used to select populations of individuals in simulated microbial evolution, and may thus be strong candidates for helping to optimize evolutionary processes in real directed microbial evolution.

The authors candidly acknowledge limitations of the current study and describe future research that will address these limitations (e.g., using more sophisticated versions of the selection mechanisms tried here, and ultimately transferring successful methods to laboratory experiments of directed microbial evolution).

The paper is well-written and well-organized, including sufficient details for the reader to conceptually understand what was done, while including additional nitty-gritty details needed for reproducibility in the supplement and in open-source code.

This paper will be of interest to researchers working in directed microbial evolution as well as those in evolutionary computation. The authors compare various selection methods from the field of evolutionary computation to simulations of directed microbial population-level evolution. They convincingly demonstrate that multi-objective population-level selection outperforms the truncation selection of populations that is currently the norm in directed microbial evolution.

Read the original source
Version published to 10.1101/2022.04.01.486727 on bioRxiv
Apr 2, 2022
