Prospective evaluation of structure-based simulations reveal their ability to predict the impact of kinase mutations on inhibitor binding


Abstract

Small molecule kinase inhibitors are critical in the modern treatment of cancers, evidenced by the existence of over 80 FDA-approved small-molecule kinase inhibitors. Unfortunately, intrinsic or acquired resistance, often causing therapy discontinuation, is frequently caused by mutations in the kinase therapeutic target. The advent of clinical tumor sequencing has opened additional opportunities for precision oncology to improve patient outcomes by pairing optimal therapies with tumor mutation profiles. However, modern precision oncology efforts are hindered by lack of sufficient biochemical or clinical evidence to classify each mutation as resistant or sensitive to existing inhibitors. Structure-based methods show promising accuracy in retrospective benchmarks at predicting whether a kinase mutation will perturb inhibitor binding, but comparisons are made by pooling disparate experimental measurements across different conditions. We present the first prospective benchmark of structure-based approaches on a blinded dataset of in-cell kinase inhibitor affinities to Abl kinase mutants using a NanoBRET reporter assay. We compare NanoBRET results to structure-based methods and their ability to estimate the impact of mutations on inhibitor binding (measured as ΔΔG). Comparing physics-based simulations, Rosetta, and previous machine learning models, we find that structure-based methods accurately classify kinase mutations as inhibitor-resistant or inhibitor-sensitizing, and each approach has a similar degree of accuracy. We find that physics-based simulations are best suited to estimate ΔΔG of mutations that are distal to the kinase active site. To probe modes of failure, we investigate two clinically significant mutations poorly predicted by our methods, T315A and L298F, and find that starting configurations and protonation states significantly alter the accuracy of our predictions. 
Our experimental and computational measurements provide a benchmark for estimating the impact of mutations on inhibitor binding affinity for future methods and structure-based models. These structure-based methods have potential utility in identifying optimal therapies for tumor-specific mutations, predicting resistance mutations in the absence of clinical data, and identifying potential sensitizing mutations to established inhibitors.

Article activity feed

  1. This Zenodo record is a permanently preserved version of a PREreview. You can view the complete PREreview at https://prereview.org/reviews/14492921.

    This manuscript provides a baseline comparison between current physics-based computational methods and machine-learning (ML) methods for predicting a key thermodynamic quantity in drug discovery: the change in inhibitor binding free energy caused by a mutation (ΔΔG). Three non-ML algorithms are compared against one existing ML model, a Random Forest (RF), which was trained directly to predict this quantity in a previous study. The results suggest that physics-based simulations generally provide better estimates than the ML/structure-based methods when benchmarked against experimental measurements, especially for distal mutations. Overall, the manuscript is well written and provides sufficient detail on each methodology used.

    Here are our comments:

    1. The generalizability of the trained ML model is not immediately clear to us. It would be helpful if the authors could include an analysis of this model on its train/validation/test datasets so that readers can judge its performance. How does the dataset differ from that of Aldeghi et al., ACS Central Science 5, no. 8 (2019): 1468-1474? Would the RF model predict better if it were trained on the Platinum database? Would combining the Platinum database with the data used in this manuscript improve the model's predictive ability?

    2. Following on from the previous comment: like neural networks, deep tree-based methods can easily overfit the training data. Is this also expected here? The scatter plot of NanoBRET vs. the measurements in Hauser et al. (2018) and the plot of NanoBRET vs. RF look very similar to each other. Could that be evidence of overfitting to the dataset? Again, it would be beneficial to present some results on model training.

    3. As the authors summarize at the end, the physics-based simulation methods outperform the structural/ML-based methods. Does this mean that introducing structural descriptors into machine learning models could substantially improve their predictions? This could be tested by retraining an RF model with additional (distal) features, which would be a valuable ML-based benchmark for future studies.

    4. While the authors provide a detailed investigation of the effects of force field selection in non-equilibrium perturbation (NEQ), the free energy calculation (FEP+) method does not appear to have been examined under different force field parameters. Is there a reason why OPLS was chosen for FEP+ while GAFF/CGenFF were selected for NEQ? If so, please explain.

    5. The authors may want to comment on the utility of the training dataset from Hauser et al. (2018) given the importance of measuring the ΔΔG values for each TKI from a single measurement as demonstrated and alluded to in this paper.

    6. Can the authors comment on the structural basis for the distal mutations (Supp. Fig. 5, 6) that have a significant impact on inhibitor binding? It might also be useful to provide a separate list of these interesting mutations giving, for each one, the identity of the mutated residue, its ΔΔG values, its distance from the active site, and the RMSE and correlation values.

    7. The RF model has been trained mostly on nearly neutral point mutations. Is it expected to perform well for large-effect resistant or sensitizing mutations for which experimental ΔΔG values are not within the ±1 kcal/mol range?

    8. The authors assign a mutation as resistant if both the NanoBRET measurement and the computational approach give ΔΔG > +1 kcal/mol. What about the performance of the models on mutations for which the predicted value is nearly neutral (-1 < ΔΔG < +1 kcal/mol) but the clinically observed phenotype is resistance or sensitivity?

    9. It is unclear whether there are H-bond interactions between the water molecules within 0.4 nm of T315 and other chemical groups in the vicinity, such as backbone amides or a ligand atom. Are there features in the electron density map for the crystal structure of Abl kinase that indicate other water-mediated contacts in this site that might be disrupted by the T315A mutation?
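
To make comments 6 and 8 concrete, the following minimal Python sketch shows one way the threshold-based labelling and the per-subset RMSE could be tabulated. It is an illustration only, not the authors' analysis code: the function names, the symmetric -1 kcal/mol cutoff for "sensitizing", and the example numbers are all assumptions introduced here.

```python
import math

# Cutoffs follow the +1 kcal/mol resistance threshold quoted in comment 8.
# The symmetric -1 kcal/mol sensitizing cutoff is an assumption of this sketch.
RESISTANT_CUTOFF = 1.0     # kcal/mol
SENSITIZING_CUTOFF = -1.0  # kcal/mol


def classify(ddg):
    """Label one ddG value (kcal/mol) as resistant, sensitizing, or neutral."""
    if ddg > RESISTANT_CUTOFF:
        return "resistant"
    if ddg < SENSITIZING_CUTOFF:
        return "sensitizing"
    return "neutral"


def rmse(predicted, measured):
    """Root-mean-square error between paired ddG lists (kcal/mol)."""
    if len(predicted) != len(measured) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - m) ** 2 for p, m in zip(predicted, measured)]
    return math.sqrt(sum(squared) / len(squared))


def agreement_table(predicted, measured):
    """Count (measured label, predicted label) pairs across all mutations."""
    counts = {}
    for p, m in zip(predicted, measured):
        key = (classify(m), classify(p))
        counts[key] = counts.get(key, 0) + 1
    return counts


# Made-up illustrative values, NOT data from the manuscript.
measured = [1.8, 0.3, -1.4, 0.9, 2.5]
predicted = [1.2, 0.6, -0.7, 1.3, 2.1]

table = agreement_table(predicted, measured)
error = rmse(predicted, measured)
```

Off-diagonal entries of `table`, such as a mutation measured as sensitizing but predicted as neutral, are the cases comment 8 asks about. Running the same functions on only the distal mutations would give the per-subset values requested in comment 6.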

    Minor points:

    1. The term "singular point" in the PRAUC plots is mentioned only in the caption of Figure 5. It would be good to define it in the main text and discuss its significance.

    2. What is the shaded region around the diagonal line in Fig. 2B?

    3. What is the provenance of the "sensitizing" mutation L298F? It is unclear whether it is patient-derived or engineered.

    Competing interests

    The authors declare that they have no competing interests.