Accelerating protein directed evolution via reinforcement learning

Haipeng Gong
Tianyu Mi
Yu-Xiang Wang
Wanze Wang
Jingyu Zhao
Yunhao Shen
Nan Xiao
Ligong Chen
Guo-Qiang Chen
Shuyi Zhang
Wen-Bin Zhang

Abstract

With the advancement of artificial intelligence, the protein fitness landscape has become predictable, providing reliable guidance for selecting advantageous mutations in the directed evolution of proteins. In practice, however, simply combining a small number of advantageous single mutations is unlikely to produce globally superior variants, while exhaustive exploration of the astronomical number of mutational combinations is highly challenging. In this study, we introduce a virtual directed evolution pipeline, RelaVDEP, to facilitate functional optimization of a target protein in silico. By developing a reward model that balances the efficiency and accuracy of protein function prediction, and by designing a model-based reinforcement learning framework to explore the vast combinatorial space of protein mutations, the pipeline can automatically identify diverse multi-mutation variants with notable improvements in the desired functional properties for a broad spectrum of proteins (including highly engineered targets such as eGFP and PETase), as evidenced by experimental validation.
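To give a sense of scale for the "astronomical" combinatorial space mentioned in the abstract, the short sketch below counts the variants that carry exactly k amino-acid substitutions. It is an illustrative calculation only and not part of RelaVDEP: the function name `count_variants`, the sequence length of 250 residues, and the 20-letter alphabet are assumptions chosen for the example. The count is the standard C(L, k) * 19^k (choose k positions, then one of 19 alternative residues at each).

```python
from math import comb


def count_variants(length: int, k: int, alphabet: int = 20) -> int:
    """Count sequences that differ from a reference at exactly k positions.

    length   : number of residues in the reference protein (assumed value)
    k        : number of substituted positions
    alphabet : number of possible residues per position (20 standard amino acids)
    """
    # Choose which k positions mutate, then pick one of the (alphabet - 1)
    # non-reference residues at each chosen position.
    return comb(length, k) * (alphabet - 1) ** k


if __name__ == "__main__":
    # Hypothetical 250-residue protein; the length is an assumption for illustration.
    for k in range(1, 6):
        print(f"k={k}: {count_variants(250, k):.3e} variants")
```

Even at modest mutation counts the number of candidates grows by several orders of magnitude with each additional substitution, which is why exhaustive enumeration quickly becomes impractical and a guided search is attractive.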

Version published to 10.21203/rs.3.rs-8907793/v1 on Research Square
Mar 4, 2026

Related articles

GrAdaBeam: Combining model gradients with evolutionary search for generalizable nucleic acid design

This article has 1 author:
1. Joel Shor
This article has no evaluations. Latest version: Apr 8, 2026

Constructing the ensemble of representative structures for a protein via neural-surrogate-guided MSA recombination

This article has 4 authors:
1. Haipeng Gong
2. Hanyang Zhou
3. Hongyu Yu
4. Stephen Yau
This article has no evaluations. Latest version: Mar 5, 2026

Machine Learning-Based Prediction of Base Editor sgRNA fitness score

This article has 11 authors:
1. Alessandro Orro
2. Arianna Consiglio
3. Maria Ilaria Curci
4. Martina Scichilone
5. Faiza Hasin
6. Michele Minervini
7. Corrado Mencar
8. Gianluca De Bellis
9. Cinzia Cocola
10. Paride Pelucchi
11. Tommaso Selmi
This article has no evaluations. Latest version: Apr 10, 2026
