Integrating Machine-learning and Ultra-high-throughput Screening for Enzyme spaces exploration

Abstract

The systematic navigation of biocatalyst space is constrained by elusive structure-activity rules and a lack of evolutionary history. Here, we present IMUSE, a strategy integrating machine learning with ultra-high-throughput screening. By screening millions of droplet-encapsulated de novo enzymes, we generated massive synthetic sequence-structure datasets to train models that capture their complex fitness landscapes and biophysical principles. These models effectively guide functional exploration across both sequence and novel structure spaces. IMUSE identified synergistic triple mutations yielding ∼5-fold activity improvements and discovered active second-generation designs with novel catalytic pockets, boosting the experimental success rate >4.9-fold (∼30%). This work demonstrates how synthetic fitness landscapes bridge the data gap in de novo enzyme space, transforming stochastic search into deterministic navigation to unlock highly proficient biocatalysts beyond natural boundaries.
