Deep Geometric Framework to Predict Antibody-Antigen Binding Affinity

Abstract

In drug development, the efficacy of an antibody depends on how it interacts with its target antigen. The strength of this interaction, measured as "binding affinity", indicates how well the antibody can neutralize the antigen and is therefore a critical aspect of antibody engineering. In theory, the higher the binding affinity, the more likely the antibody is to succeed against the target antigen. Currently, techniques such as molecular docking and molecular dynamics are used to quantify binding affinity. However, owing to their computational complexity, running simulations for large antibodies or antigens remains a daunting task. Despite commendable improvements in deep learning-based binding affinity prediction, such approaches depend heavily on the quality of the antibody-antigen structures and tend to overlook the importance of capturing the evolutionary details of proteins upon mutation. Further, most existing datasets for the task include only antibody-antigen pairs related to a single antigen variant and are thus unsuitable for developing comprehensive data-driven approaches. To address these limitations, we first curate the largest and most generalized datasets for antibody-antigen binding affinity prediction, consisting of both protein sequences and structures. We then propose a deep geometric neural network comprising a structure-based model and a sequence-based model that considers both atomistic and evolutionary details when predicting binding affinity. The proposed framework exhibited a 10% improvement in mean absolute error compared to state-of-the-art models while showing a strong correlation between the predictions and target values.
We release the datasets and code publicly ( https://drug-discovery-entc.github.io/p2pxml/ ) to support the development of antibody-antigen binding affinity prediction frameworks for the benefit of science and society.

Article activity feed

  1. This Zenodo record is a permanently preserved version of a PREreview. You can view the complete PREreview at https://prereview.org/reviews/12661726.

    The goal of the study was to develop a general deep learning model that is not confined to a specific family of antigens. The proposed network was trained on a curated dataset comprising antibody-antigen pairs for HIV, MERS, influenza virus, and others. The authors report a significant improvement in mean absolute error compared to existing state-of-the-art models.

    Major issues

    • A statistical analysis section should be added, stating the tests used for comparison and correlation, the software used to run them, and the significance level.

    • Table 2: the mean absolute error (MAE) and mean squared error (MSE) are presented, but no statistical comparison was performed to assess whether the differences between models are significant.

    • Page 3, 2nd column, last paragraph: the authors state, "From Table 2, it is evident that our final Combined-V2 model outperforms all the considered state-of-the-art approaches at least by a margin of 10.6% while improving the performance of our individual sequence-based and structure-based models by 5.6% and 6.8%, respectively." These percentages do not appear in the table.

    • Page 3, 2nd column, last paragraph: the authors should explain why they examined both the Pearson correlation coefficient and Spearman's correlation coefficient. The corresponding p-values should also be reported so that significance can be assessed.

    • Page 4, 2nd column, Table 3: the mean absolute error (MAE) is presented as an indicator of performance, but no statistical comparison was performed to assess whether the differences are significant. The same applies to Tables 4, 5, 7, and 8.
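
    As an illustration of the kind of analysis requested above, the following is a minimal sketch using only the Python standard library. It computes the relative MAE improvement between two models, a paired sign-flip permutation test on per-sample absolute errors, and Pearson and Spearman correlations with permutation p-values. All function names are hypothetical and the permutation tests are one reasonable choice among several (a Wilcoxon signed-rank test or a bootstrap would also be defensible). None of this is taken from the manuscript or its code.

```python
import random
import statistics


def mae(y_true, y_pred):
    """Mean absolute error between targets and predictions."""
    return statistics.fmean(abs(t - p) for t, p in zip(y_true, y_pred))


def relative_improvement(mae_baseline, mae_model):
    """Percentage reduction in MAE relative to the baseline model."""
    return 100.0 * (mae_baseline - mae_model) / mae_baseline


def paired_permutation_pvalue(y_true, pred_a, pred_b, n_perm=10000, seed=0):
    """Two-sided sign-flip permutation test on paired absolute errors.

    Null hypothesis: models A and B have the same expected absolute error,
    so the sign of each per-sample error difference is arbitrary.
    """
    rng = random.Random(seed)
    diffs = [abs(t - a) - abs(t - b) for t, a, b in zip(y_true, pred_a, pred_b)]
    observed = abs(statistics.fmean(diffs))
    hits = 0
    for _ in range(n_perm):
        flipped = [d if rng.random() < 0.5 else -d for d in diffs]
        if abs(statistics.fmean(flipped)) >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (hits + 1) / (n_perm + 1)


def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation: measures linear association."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation: Pearson on ranks, measures monotonic association."""
    return pearson(ranks(x), ranks(y))


def correlation_pvalue(x, y, corr=pearson, n_perm=10000, seed=0):
    """Two-sided permutation p-value for a correlation coefficient."""
    rng = random.Random(seed)
    observed = abs(corr(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(corr(x, shuffled)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)
```

    Reporting both coefficients can be justified in this way: Pearson is sensitive to how linear the relationship between predicted and measured affinities is, whereas Spearman only checks that the ranking is preserved. A paired test is used for the MAE comparison because both models are evaluated on the same test samples.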

    Minor issues

    • Page 1, Introduction section, 2nd column, 2nd paragraph: define the abbreviation IC50 as half-maximal inhibitory concentration.

    • Page 2, 1st column, 2nd paragraph: the author name for reference (6) should be added, as follows: "In the study done by Shan et al. (6)".

    • The correct citation of reference 1: Lu RM, Hwang YC, Liu IJ, Lee CC, Tsai HZ, Li HJ, Wu HC. Development of therapeutic antibodies for the treatment of diseases. J Biomed Sci. 2020 Jan 2;27(1):1. doi: 10.1186/s12929-019-0592-z.

    • The correct citation of reference 2: Ferreira LG, Dos Santos RN, Oliva G, Andricopulo AD. Molecular docking and structure-based drug design strategies. Molecules. 2015 Jul 22;20(7):13384-421. doi: 10.3390/molecules200713384.

    • The citation style shown for references 1 and 2 should be applied to all references.

    • Reference 23 is a duplicate of reference 7.

    • Reference 35 is a duplicate of reference 30.

    Competing interests

    The author declares that they have no competing interests.