Accelerating protein directed evolution via reinforcement learning
Abstract
Advances in artificial intelligence have made the protein fitness landscape increasingly predictable, providing reliable guidance for selecting advantageous mutations in the directed evolution of proteins. In practice, however, simply combining a small number of advantageous single mutations is unlikely to produce globally superior variants, while exhaustive exploration of the astronomical number of mutational combinations is highly challenging. In this study, we introduce a virtual directed evolution pipeline, RelaVDEP, to facilitate in silico functional optimization of a target protein. The pipeline combines a reward model that balances the efficiency and accuracy of protein function prediction with a model-based reinforcement learning framework that explores the vast combinatorial space of protein mutations. As evidenced by experimental validation, it automatically identifies diverse multi-mutation variants with notable improvements in the desired functional properties across a broad spectrum of proteins, including highly engineered targets such as eGFP and PETase.
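To give a sense of scale for the combinatorial space described above, the following minimal sketch counts the distinct variants that carry exactly k substitutions in a protein of length L. This count is C(L, k) · 19^k: choose k positions, then one of the 19 alternative canonical amino acids at each. The function name `count_variants` and the example length of 250 residues are illustrative assumptions, not values taken from the study.

```python
from math import comb

# The 20 canonical amino acids minus the wild-type residue at a position.
NUM_ALTERNATIVES = 19


def count_variants(length: int, k: int) -> int:
    """Count distinct variants with exactly k substitutions in a protein of the given length."""
    if k < 0 or k > length:
        return 0
    # Choose which k positions mutate, then one of 19 alternatives at each.
    return comb(length, k) * NUM_ALTERNATIVES ** k


if __name__ == "__main__":
    length = 250  # hypothetical protein length, chosen only for illustration
    for k in range(1, 6):
        print(f"k={k}: {count_variants(length, k):.3e} variants")
```

Under these assumptions there are already more than eleven million double mutants, and the count grows by several orders of magnitude with each additional substitution, which is why a guided search is attractive compared with enumeration.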