Ligand Binding Prediction Using Protein Structure Graphs and Residual Graph Attention Networks

Abstract

Computational prediction of ligand–target interactions is a crucial part of modern drug discovery because it helps to bypass the high costs and labor demands of in vitro and in vivo screening. As bioactivity data accumulate, they create opportunities to develop deep learning (DL) models with increasing predictive power. Conventionally, such models have been limited either to very simplified representations of proteins or to ineffective voxelization of their 3D structures. Herein, we present PSG-BAR (Protein Structure Graph-Binding Affinity Regression), an approach that uses 3D structural information of proteins along with 2D graph representations of ligands. The method also introduces attention scores to selectively weight the protein regions that are most important for ligand binding. The developed approach demonstrates state-of-the-art performance on several binding affinity benchmarking datasets. The attention-based pooling of protein graphs enables identification of surface residues that are critical for protein–ligand binding. Finally, we validate our model predictions against an experimental assay on the viral main protease (Mpro), the hallmark target of the SARS-CoV-2 coronavirus.
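The attention-based pooling described in the abstract can be illustrated with a minimal sketch. It is not the PSG-BAR implementation: the function names, the single scoring vector `score_weights` (standing in for a learned scoring layer), and the toy residue features are all assumptions made for illustration. The idea shown is only the general one, in which each residue node receives a score, a softmax turns the scores into attention weights, and the weights both pool the graph into one vector and rank residues by importance.

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(
    node_features: Sequence[Sequence[float]],
    score_weights: Sequence[float],
) -> Tuple[List[float], List[float]]:
    """Pool per-residue feature vectors into one graph-level vector.

    Each residue gets a scalar score (dot product with `score_weights`,
    a hypothetical stand-in for a learned scoring layer). The softmax of
    the scores gives attention weights, which drive a weighted sum.
    """
    if not node_features:
        raise ValueError("graph has no nodes")
    scores = [
        sum(w * x for w, x in zip(score_weights, feats))
        for feats in node_features
    ]
    attention = softmax(scores)
    dim = len(node_features[0])
    pooled = [
        sum(a * feats[d] for a, feats in zip(attention, node_features))
        for d in range(dim)
    ]
    return pooled, attention


# Toy input: three residues with 2-dimensional features (made-up values).
residue_features = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
score_weights = [1.0, 1.0]  # hypothetical; a trained model would learn these

pooled, attention = attention_pool(residue_features, score_weights)

# The residue with the largest attention weight is the one the pooling
# step treats as most important for the prediction.
top_residue = max(range(len(attention)), key=attention.__getitem__)
```

In a trained model the attention weights would depend on the ligand as well as the protein; here they depend only on the fixed toy features, which is enough to show how per-residue weights can be read back out for interpretation.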

Article activity feed

  1. SciScore for 10.1101/2022.04.27.489750:

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    NIH rigor criteria are not applicable to paper type.

    Table 2: Resources

    Software and Algorithms

    Sentence: First, we retrieved all protein-ligand pairs with associated dissociation constant (Kd) from BindingDB database [37].
    Resource: BindingDB
    Suggested: (BindingDB, RRID:SCR_000390)

    Sentence: The bioassay record (AID 1706) [42] by Scripps Research Institute provides PubChem Activity Score normalized to 100% observed primary inhibition.
    Resource: PubChem
    Suggested: (PubChem, RRID:SCR_004284)

    Results from OddPub: Thank you for sharing your code.


    Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study:
    PSG-BAR alleviates these limitations by using the entire protein structural graph and learning attention scores to selectively weight useful regions of the protein based on its interaction with the drug molecule. As a result, our method outperforms state-of-the-art affinity prediction methods on several benchmarking datasets. As such, the integration of protein structures helps to achieve better predictive results. This is mainly because 3D structures contain relevant information on the actual configuration of the binding pockets, which has immediate implications for ligand binding. These methods are bottlenecked by the availability of experimentally derived protein structures; however, with the advancement of NMR, X-ray crystallography, and cryo-EM techniques, more high-resolution PDB structures are being deposited than ever before. Furthermore, as a result of AlphaFold, even more predicted protein structures have become available. These developments enable effective advancement of deep learning based approaches; in this work we validate this hypothesis by predicting experimentally determined measures of binding affinity on several protein targets across standard benchmark datasets. Particularly for the KIBA dataset, we show that augmentation with AlphaFold structures improves MSE by 11.1%. It should also be emphasized that augmentation of 3D protein structure information with 2D sequence descriptors can further improve model performance. Since protein sequences capture some level of structu...
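A figure such as "improves MSE by 11.1%" is most naturally read as a relative reduction in mean squared error against a baseline, although the excerpt above does not state the formula, so that reading is an assumption. The sketch below shows how such a percentage could be computed. The function names and all numbers are made up for illustration and are not taken from the paper.

```python
from typing import Sequence


def mean_squared_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average of squared differences between targets and predictions."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def relative_improvement(baseline_mse: float, new_mse: float) -> float:
    """Percentage reduction in MSE relative to the baseline (assumed definition)."""
    return 100.0 * (baseline_mse - new_mse) / baseline_mse


# Made-up affinity values and predictions, for illustration only.
y_true = [5.0, 6.0, 7.0, 8.0]
baseline_pred = [5.5, 6.5, 6.5, 7.5]   # every prediction is off by 0.5
augmented_pred = [5.4, 6.4, 6.6, 7.6]  # every prediction is off by 0.4

baseline_mse = mean_squared_error(y_true, baseline_pred)
augmented_mse = mean_squared_error(y_true, augmented_pred)
improvement = relative_improvement(baseline_mse, augmented_mse)
```

Because MSE is quadratic in the error, shrinking each toy error from 0.5 to 0.4 yields a much larger relative drop in MSE than the 20% drop in absolute error, which is worth keeping in mind when comparing percentage improvements across metrics.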

    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


    Results from JetFighter: We did not find any issues relating to colormaps.


    Results from rtransparent:
    • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
    • Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
    • No protocol registration statement was detected.

    Results from scite Reference Check: We found no unreliable references.


    About SciScore

    SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy to digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.