Mechanism-informed rules tunably balance novelty and feasibility of predicted enzymatic reactions

Abstract

Enzymes catalyze reactions with remarkable specificity and can unlock recalcitrant feedstocks that are dilute, complex, and variable in their constituent molecules. While characterized enzymatic reactions cover a wide range of chemistries, there is an undetermined number of cryptic activities for every known one. These cryptic activities can be elicited through rational design, adaptive laboratory evolution, and, increasingly, generative models of proteins. However, before tuning a catalyst, one must efficiently predict viable novel reactions. In this work we leverage the growing body of mechanistic enzyme information, specifically the Mechanism and Catalytic Site Atlas, to construct a set of reaction rules that can meet this demand. By explicitly using mechanistic information, the rule sets developed here identify the molecular structures required for catalysis more accurately than existing curated and heuristically constructed rules. The 899 Distilled rules are constructed directly from characterized mechanisms and cover 62.5% of reactions from Rhea. The Learned rule set is generated from a classifier trained on mechanistic data, allowing full coverage of Rhea and precise identification of mechanism-required atoms (ROC-AUC = 0.98). Additionally, our Learned rules exhibit a more favorable tradeoff between novelty and feasibility and give users fine-grained control over this tradeoff. The rules are compatible with all SMARTS-based reaction network expansion and retrosynthesis software.
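
The abstract does not say how the novelty/feasibility control is exposed. One plausible reading, assumed here purely for illustration, is a cutoff on the classifier's per-atom probability of being mechanism-required: atoms at or above the cutoff are written into the rule template, so a lower cutoff gives a larger, more specific template (favoring feasibility) and a higher cutoff gives a smaller, more general one (favoring novelty). The function name, the probabilities, and the atom indices below are hypothetical and are not taken from the article.

```python
def template_atoms(atom_probs, reaction_center, threshold):
    """Return the sorted atom indices kept in a rule template.

    Assumption (not from the article): reaction-center atoms are always
    kept, and any other atom is kept when its predicted probability of
    being mechanism-required is at least `threshold`.
    """
    kept = set(reaction_center)
    kept.update(i for i, p in atom_probs.items() if p >= threshold)
    return sorted(kept)


# Hypothetical per-atom probabilities for a five-atom substrate.
atom_probs = {0: 0.99, 1: 0.97, 2: 0.80, 3: 0.35, 4: 0.05}
reaction_center = [0, 1]

# High cutoff: only the reaction center survives, giving a general rule.
general = template_atoms(atom_probs, reaction_center, threshold=0.9)

# Low cutoff: more context atoms are required, giving a specific rule.
specific = template_atoms(atom_probs, reaction_center, threshold=0.3)

print(general)
print(specific)
```

Under this assumption the general template is always a subset of the specific one, so sweeping the cutoff moves a rule monotonically between the two ends of the tradeoff.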
