RC-GNN: A predictive model of enzyme-reaction pairs
Abstract
Uncharacterized functions of enzymes represent an untapped opportunity to develop therapeutics, unlock the sustainable synthesis of materials, and understand the evolution of life-sustaining metabolic networks. Uncharacterized enzymes and reactions, generated by protein language models and computer-aided synthesis tools, respectively, make up a large part of this opportunity. Given the technical complexity of high-throughput enzymatic activity screens, predictive models are needed that can pre-screen enzyme-reaction pairs in silico. We present Reaction-Center Graph Neural Network (RC-GNN), a model capable of predicting whether an enzyme, represented by an amino acid sequence, can significantly catalyze a given reaction, represented by its full set of reactants and products. We explicitly evaluated RC-GNN's generalization to queries highly dissimilar from those present in the training dataset. In the most difficult conditions tested, our models achieve 0.88 and 0.84 ROC-AUC on classification tasks featuring globally selected and synthetic negatives, respectively. On a time-based split, an RC-GNN model achieved 0.91 ROC-AUC. The ability to make successful predictions on enzymes and reactions distinct from those used during training makes RC-GNN especially useful for both metabolic engineers and evolutionary biologists who need to reason about uncharacterized enzymatic reactions.