Sequence-free landscape inference for directed evolution
Abstract
Directed evolution is a method for engineering biological systems or components, such as proteins, wherein desired traits are optimised through iterative rounds of mutagenesis and selection of fit variants. The process of protein directed evolution can be envisaged as navigation over high-dimensional landscapes with numerous local maxima. The performance of any strategy in navigating such a landscape depends on the ruggedness of that landscape. However, this information is generally unavailable at the outset of an experiment, and cannot currently be computed using analytical methods. Here we propose SLIDE, Sequence-free Landscape Inference for Directed Evolution, which consists of two parts. First, SLIDE provides an estimate of landscape ruggedness from a mutating population using only population-level phenotypic data and an estimate of the mutation rate. Ruggedness information in itself is valuable in protein design, for instance in predicting evolutionary stability. Second, SLIDE offers a framework for using the estimated ruggedness metric to select high-performing parameters for directed evolution control. Using theoretical NK landscapes and four real-world protein fitness landscapes, we demonstrate improvement upon the performance of standard selection strategies, particularly on rugged landscapes, using a pipeline that could also be combined with emerging AI-based methods for driving directed evolution.
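For readers unfamiliar with NK landscapes, the following minimal Python sketch illustrates how the parameter K controls ruggedness. It is not the SLIDE estimator, which infers ruggedness from population-level phenotypic data without sequences. Instead it builds a small textbook-style NK landscape and counts local maxima by exhaustive enumeration, which is feasible only for small N. The function names (`make_nk_landscape`, `count_local_maxima`), the cyclic neighbourhood, and the uniform random contributions are assumptions made for illustration.

```python
import itertools
import random


def make_nk_landscape(n, k, seed=0):
    """Build a fitness function over binary genotypes of length n.

    Assumption: site i interacts with itself and the next k sites
    (cyclic neighbourhood), and each contribution is drawn uniformly
    from [0, 1). Other NK conventions exist.
    """
    rng = random.Random(seed)
    neighbours = [[(i + j) % n for j in range(k + 1)] for i in range(n)]
    # One lookup table per site: local configuration -> contribution.
    tables = [
        {key: rng.random() for key in itertools.product((0, 1), repeat=k + 1)}
        for _ in range(n)
    ]

    def fitness(genotype):
        total = 0.0
        for i in range(n):
            key = tuple(genotype[j] for j in neighbours[i])
            total += tables[i][key]
        return total / n  # mean contribution, so fitness lies in [0, 1)

    return fitness


def count_local_maxima(n, k, seed=0):
    """Count genotypes that are strictly fitter than all single-site mutants."""
    fitness = make_nk_landscape(n, k, seed)
    values = {g: fitness(g) for g in itertools.product((0, 1), repeat=n)}
    count = 0
    for genotype, value in values.items():
        is_peak = True
        for i in range(n):
            # Flip site i to obtain a one-mutation neighbour.
            mutant = genotype[:i] + (1 - genotype[i],) + genotype[i + 1:]
            if values[mutant] >= value:
                is_peak = False
                break
        if is_peak:
            count += 1
    return count


if __name__ == "__main__":
    n = 8
    for k in (0, 2, 4, 7):
        print(f"N={n}, K={k}: {count_local_maxima(n, k)} local maxima")
```

With K = 0 the sites contribute independently, so the landscape is smooth and has a single peak. As K grows, interactions between sites create more local maxima, which is the sense in which the landscape becomes more rugged.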