Integrating Machine-learning and Ultra-high-throughput Screening for Enzyme spaces exploration

Yitao Ke
Yanzhe Zhang
Minchao Fang
Jingyang Zhao
Hongli Zhu
Zehui Xu
Longxing Cao

Abstract

The systematic navigation of biocatalyst space is constrained by elusive structure-activity rules and a lack of evolutionary history. Here, we present IMUSE, a strategy integrating machine learning with ultra-high-throughput screening. By screening millions of droplet-encapsulated de novo enzymes, we generated massive synthetic sequence-structure datasets to train models that capture their complex fitness landscapes and biophysical principles. These models effectively guide functional exploration across both sequence and novel structure spaces. IMUSE identified synergistic triple mutations yielding ∼5-fold activity improvements and discovered active second-generation designs with novel catalytic pockets, boosting the experimental success rate >4.9-fold (∼30%). This work demonstrates how synthetic fitness landscapes bridge the data gap in de novo enzyme space, transforming stochastic search into deterministic navigation to unlock highly proficient biocatalysts beyond natural boundaries.

Version published to 10.64898/2026.06.23.733994 on bioRxiv
Jun 24, 2026

Related articles

Learning the structural diversity in random protein sequence space

This article has 9 authors:
1. Filip Buchel
2. Tereza Neuwirthova
3. Theodora Tureckiova
4. Gustavo Fuertes
5. Ales Benda
6. Dalibor Panek
7. Matus Fricek
8. Mohammed AlQuraishi
9. Klara Hlouchova
This article has no evaluations. Latest version: May 5, 2026

SynFit: Synergistic Contrastive Learning for Multi-Objective Protein Fitness Prediction and Optimization

This article has 6 authors:
1. Tony Tu
2. Wei Huang
3. Ziang Li
4. Kerr Ding
5. Yang Yang
6. Yunan Luo
This article has no evaluations. Latest version: May 26, 2026

Genuine Directed Evolution In Test Tube (GENie)

This article has 3 authors:
1. Lilin Feng
2. Maochao Mao
3. Ulrich Schwaneberg
This article has no evaluations. Latest version: May 7, 2026
