Enhancing Protein Binding Site Residue Prediction with Graph Neural Networks: Impacts of Cutoff Distance and Feature Selection

Abstract

Identifying protein binding site residues is critical for understanding molecular interactions, but it is challenging because the features that determine binding are incompletely understood. The increasing volume and accessibility of protein sequences and structures present an opportunity to systematically decipher their intricate relationships with binding. Here, we use graph neural networks to investigate the effect of the cutoff distance between residues, and the importance of sequence-based and structure-based features, on binding in four eukaryotic model organisms. Our results indicate that sparse graphs generated by an 8 Å cutoff are most effective for binding site residue prediction, and that increasing the cutoff distance introduces noise from non-binding site residues, impairing model performance. However, hybrid cutoffs that enrich connectivity between predicted binding site residues have the potential to enhance performance. In addition, sequence-based features learned by a pre-trained protein language model carry substantial information about binding. In contrast, structure-based features derived from protein backbone geometry are inadequate because they lead to overprediction of binding site residues. These findings are consistent for the proteins across the four organisms, providing evidence for evolutionary conservation of binding site residues.
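To make the role of the cutoff distance concrete, the following minimal Python sketch builds residue-level graph edges from 3D coordinates. It is an illustration under stated assumptions, not the authors' implementation: it assumes one representative coordinate per residue (for example the Cα atom, which the abstract does not specify), and the function names `contact_edges` and `hybrid_edges` as well as the 12 Å extended cutoff are hypothetical choices made for this example. Only the 8 Å base cutoff comes from the abstract.

```python
import numpy as np


def pairwise_distances(coords):
    """Return the N x N matrix of Euclidean distances between residues."""
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    return np.linalg.norm(diff, axis=-1)


def contact_edges(coords, cutoff=8.0):
    """Undirected edges (i, j) with i < j for residues within `cutoff` angstroms."""
    dist = pairwise_distances(coords)
    # Keep only the strict upper triangle so each pair appears once
    # and self-loops are excluded.
    i, j = np.where(np.triu(dist <= cutoff, k=1))
    return list(zip(i.tolist(), j.tolist()))


def hybrid_edges(coords, predicted_binding, base_cutoff=8.0, extended_cutoff=12.0):
    """Edges under a hybrid cutoff.

    All residue pairs use `base_cutoff`; pairs in which BOTH residues are
    predicted binding site residues are also connected up to `extended_cutoff`.
    The extended value of 12.0 is an assumption made for illustration only.
    """
    dist = pairwise_distances(coords)
    mask = np.asarray(predicted_binding, dtype=bool)
    both_predicted = mask[:, None] & mask[None, :]
    adjacency = (dist <= base_cutoff) | (both_predicted & (dist <= extended_cutoff))
    i, j = np.where(np.triu(adjacency, k=1))
    return list(zip(i.tolist(), j.tolist()))


# Toy example: four residues placed along the x axis (coordinates in angstroms).
toy_coords = [[0, 0, 0], [5, 0, 0], [10, 0, 0], [30, 0, 0]]
toy_predicted = [True, False, True, False]

sparse = contact_edges(toy_coords, cutoff=8.0)
hybrid = hybrid_edges(toy_coords, toy_predicted)
```

In this toy case, residues 0 and 2 are 10 Å apart, so the 8 Å graph leaves them unconnected, while the hybrid scheme links them because both are flagged as predicted binding site residues. Raising the single cutoff for every pair would also add edges to non-binding residues, which is the kind of added noise the abstract attributes to larger cutoffs.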

This manuscript has been authored by UT-Battelle, LLC under Contract No. DE-AC05-00OR22725 with the U.S. Department of Energy. The United States Government retains and the publisher, by accepting the article for publication, acknowledges that the United States Government retains a non-exclusive, paid-up, irrevocable, worldwide license to publish or reproduce the published form of this manuscript, or allow others to do so, for United States Government purposes. The Department of Energy will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (http://energy.gov/downloads/doe-public-access-plan).
