Identify Non-Mutational p53 Functional Deficiency in Human Cancers

Curation statements for this article:
  • Curated by eLife

    eLife assessment:

    This study by Li et al describes an interesting attempt to predict the functional status of the p53 tumor suppressor in tumors where no DNA mutations in p53 could be identified. To this end, the authors employed SVM models to train the algorithm for the detection of 'p53 inactivation' features contrasting normal and tumor tissues. The approach could be a valuable tool for attributing tumors with unknown p53 status. The authors provide solid evidence supporting their findings and the concept of the study is solid, but in its current formulation, some of the bioinformatic analyses are incomplete, particularly related to the selection of associated genes and the potential mechanism(s).

Abstract

An accurate assessment of TP53's functional status is critical for cancer genomic medicine. However, there is a significant challenge in identifying tumors with non-mutational p53 inactivation that is not detectable through DNA sequencing. These undetected cases are often misclassified as p53-normal, leading to inaccurate prognosis and downstream association analyses. To address this issue, we built support vector machine (SVM) models to systematically reassess p53's functional status in TP53 wild-type (TP53 WT) tumors from multiple TCGA cohorts. Cross-validation demonstrates the excellent performance of the SVM models, with a mean AUC of 0.9822, precision of 0.9747, and recall of 0.9784. Our study reveals that a significant proportion (87-99%) of TP53 WT tumors actually have compromised p53 function. Additional analyses uncovered that these genetically intact but functionally impaired tumors (termed predictively reduced function of p53, or TP53 WT-pRF) exhibit genomic and pathophysiologic features akin to p53 mutant tumors: heightened genomic instability and elevated levels of hypoxia. Clinically, patients with TP53 WT-pRF tumors experience significantly shortened overall survival or progression-free survival compared to those with TP53 WT-pN (predictive normal function of p53) tumors, and these patients also display increased sensitivity to platinum-based chemotherapy and radiation therapy.

Article activity feed

  1. Author Response

    Reviewer #1 (Public Review):

    Li et al. have designed a study that examines specific mechanisms for how different DNA sequence variants in the common cancer gene p53 (also known as TP53) influence the sensitivity of tumors to a variety of common cancer treatments. Specifically, they examine a handful of p53 variants with respect to glioblastoma and its response to platinum-based chemotherapy and to radiation therapy. The authors begin by mentioning that looking at DNA variants in cancer is useful but also incomplete: methylation, PTMs, and non-DNA sequence variants can also be critical. They then mention that they have created a model showing that nearly all cancers with p53 mutations have loss-of-function variants and that many cancers with "normal" wildtype p53 in fact have variants causing LOF. These p53 LOF tumors lead to worse patient outcomes, but the authors here show that these tumors appear to be more susceptible to radiation and platinum-based chemotherapy, which they say they have validated in glioblastoma xenografts. This potentially opens up a new avenue for precision medicine for many different sources of cancer that share common p53 LOF variants. The authors have taken a modern approach towards cancer diagnosis and shown how this can improve targeted treatments across a large array of cancer types. They have provided a reasonably convincing proof of concept of this approach for n = 35 PDXs in one cancer type. By and large, the approach and results are reasonable, although many of the exact results concerning the genes and pathways identified that covary with the various treatments and p53 variants are unclear. For instance, the feature selection seems to be somewhat ad hoc, e.g. the method used to determine p53 LOF from p53 WT in the TCGA data was not the same method used for determining p53 LOF from p53 WT in the PDX data.

    Thanks for the positive comments. In our study, we used the same method for feature selection (i.e., identification of p53 targets) and for calculating the composite expression score (CES) in different cancer types. This is described in Materials and Methods. However, the methods used to identify the LOF of WT TP53 in the TCGA and PDX data are different. For the TCGA LUNG, BRCA, COAD, and ESCA cohorts, we used SVM models built from the same cancer type to predict TP53 status. For PDX samples derived from glioblastoma patients, we used an unsupervised clustering approach. This is because:

    1. To train an SVM model, we need a large number of “normal” samples (to represent p53-normal status) and “tumor samples with TP53 truncating mutations” (to represent p53 LOF status). In this PDX cohort (n = 35), we have no “normal” samples and only one p53-truncating mutation (Fig. 4f, Table S6). Technically, it is impossible to build an SVM model from this PDX cohort.

    2. The TCGA GBM cohort also has very few “normal” samples (n = 5), which prevents us from training an SVM model for glioblastoma prediction.

    3. The TCGA pan-cancer SVM model is not a good choice since GBM was not included in the pan-cancer cohort due to its limited training sample size. Although the pan-cancer model achieved a high AUROC, its performance varied significantly across cancer types. This is most likely due to imbalanced sample sizes, since the pan-cancer model is biased toward cancer types (e.g., lung and breast) with larger sample sizes.

    4. Even if we were able to build a new SVM model from the TCGA pan-cancer cohort with GBM samples included, applying this SVM model to predict non-TCGA samples would still be very challenging because of batch effects.

    Therefore, we first used unsupervised clustering as an alternative to the SVM model to classify samples, and then we manually annotated the PDX clusters as “p53-pN” and “p53-pLOF” according to the composite expression score.
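
    The cluster-then-annotate idea described above can be sketched as follows. This is a minimal illustration only, under several assumptions that are not taken from the manuscript: the composite expression score is approximated here as the mean per-gene z-score over a set of p53 target genes (the actual CES definition is in the authors' Materials and Methods), the clustering is a plain two-cluster k-means run on that one-dimensional score rather than on full expression profiles, and the lower-scoring cluster is assumed to be the one with reduced p53 function. The sample names and expression values are made-up toy data; the function names are hypothetical.

```python
from statistics import mean, pstdev


def composite_score(expr, target_genes):
    """Mean per-gene z-score over target genes, per sample (assumed CES form)."""
    samples = list(expr)
    z = {s: [] for s in samples}
    for gene in target_genes:
        values = [expr[s][gene] for s in samples]
        mu, sd = mean(values), pstdev(values)
        for s in samples:
            z[s].append((expr[s][gene] - mu) / sd if sd > 0 else 0.0)
    return {s: mean(z[s]) for s in samples}


def two_means_1d(values, n_iter=50):
    """Plain 1-D k-means with k=2, seeded at the minimum and maximum value."""
    lo, hi = min(values), max(values)
    for _ in range(n_iter):
        low = [v for v in values if abs(v - lo) <= abs(v - hi)]
        high = [v for v in values if abs(v - lo) > abs(v - hi)]
        new_lo = mean(low)
        new_hi = mean(high) if high else hi
        if (new_lo, new_hi) == (lo, hi):
            break
        lo, hi = new_lo, new_hi
    return lo, hi


def annotate(scores):
    """Label the lower-score cluster p53-pLOF and the other p53-pN (assumption)."""
    lo, hi = two_means_1d(list(scores.values()))
    return {
        s: "p53-pLOF" if abs(v - lo) <= abs(v - hi) else "p53-pN"
        for s, v in scores.items()
    }


# Toy expression values (made up) for two well-known p53 targets.
expr = {
    "PDX_a": {"CDKN1A": 9.0, "MDM2": 8.0},
    "PDX_b": {"CDKN1A": 8.5, "MDM2": 8.2},
    "PDX_c": {"CDKN1A": 4.0, "MDM2": 3.5},
    "PDX_d": {"CDKN1A": 4.5, "MDM2": 3.0},
}
labels = annotate(composite_score(expr, ["CDKN1A", "MDM2"]))
print(labels)
```

    Because no labeled training samples are needed, this kind of approach remains usable in a cohort with no normal samples, at the cost of a manual, score-based interpretation of the clusters.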

    We agree with the reviewer that the underlying pathways/mechanisms that can potentially explain the different treatment effects and p53 non-mutational LoF are still unclear and warrant further investigation.

    The TCGA AUROCs were incredibly good - over 99% - versus more like 75% for the actual proof of concept. While any significant p-value is fine for basic research, it would be nice to know how this could be improved and bring the results in Figure 4 from ~75% to the >99% that would be necessary for use as a medical diagnostic or for treatment selection for precision medicine.

    Thanks for your suggestion. Precision cancer medicines that target TP53 mutations are currently being evaluated in clinical trials. Developing a robust model to predict p53 functional status for medical diagnosis or treatment selection is the primary goal of our study. However, there is still a long way to go to bring the model trained from external data into medical practice. To minimize the biological, clinical and technological heterogeneities and bias, the best approach is to train an SVM model from the same cancer type in the same institute; this requires:

    1. The numbers of both normal samples and tumors harboring TP53 truncating mutations should be sufficient to train the SVM model. Taking the TCGA lung cancer dataset (n_tumor = 1003) as an example, we built an excellent SVM model from 108 normal samples and 254 tumor samples with TP53 truncating mutations. A much larger sample size is needed if the TP53 truncating mutation frequency is low.

    2. Matched data including whole-exome or whole-genome sequencing (to determine TP53 mutation status), RNA-seq (for gene expression), and treatment response.
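
    As a rough illustration of the first requirement, the sketch below assigns training classes the way the response describes (normal tissue as the p53-normal class, tumors with TP53 truncating mutations as the LOF class, all other tumors held out for prediction) and checks whether a cohort has enough samples in both classes. The class names, function names, and the minimum-per-class threshold are hypothetical choices for illustration; the study does not state a specific cutoff. The two cohort counts are the ones quoted in this response.

```python
from collections import Counter

MIN_PER_CLASS = 30  # hypothetical threshold for illustration; not from the study


def training_label(tissue, tp53_status):
    """Return the training class for a sample, or None if it is held out."""
    if tissue == "normal":
        return "p53_normal"
    if tissue == "tumor" and tp53_status == "truncating":
        return "p53_LOF"
    return None  # e.g. TP53 WT or missense tumors: predicted, not trained on


def class_counts(cohort):
    """Count training samples per class for (tissue, tp53_status) pairs."""
    labels = (training_label(tissue, status) for tissue, status in cohort)
    return Counter(label for label in labels if label is not None)


def can_train(n_normal, n_truncating, min_per_class=MIN_PER_CLASS):
    """Both classes must reach the (assumed) minimum size."""
    return n_normal >= min_per_class and n_truncating >= min_per_class


# Toy cohort (made up) showing how samples are assigned.
toy = [("normal", "wt"), ("tumor", "truncating"), ("tumor", "missense"), ("tumor", "wt")]
print(class_counts(toy))

lung_ok = can_train(108, 254)  # TCGA lung: 108 normal, 254 truncating
pdx_ok = can_train(0, 1)       # PDX cohort: no normal, one truncating
print(lung_ok, pdx_ok)
```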

    If one plans to use public data such as TCGA to train the model, the major challenge is integrating data from different sources (i.e., removing batch effects arising from different patient cohorts, tumor sample storage and processing, library preparation, sequencing, and bioinformatics analyses).

    However, there are significant questions regarding the specific findings uncovered: do the gene pathways identified through bioinformatic analysis fit in with the many highly-studied mechanistic roles of p53? Do the cohort selections - which vary by an order of magnitude in sample size, and come from different locations and different tissues - make statistical sense for cross-validation?

    According to our analysis, p53 targets shared by four selected cancer types are significantly enriched in “cell cycle control” and “DNA damage response” pathways, which are the canonical functions of p53 (PMID: 9039259, PMID: 36183376).

    For the four TCGA cancer cohorts selected in our study, cross-validation was performed independently for each cancer type. For the pan-cancer cohort, we agree with the reviewer that the samples come from different locations and different tissues, and the pan-cancer SVM model could potentially be biased by a few cancer types with larger numbers of samples. Building a pan-cancer SVM model is a compromise strategy when each cancer type alone does not have sufficient samples to train its own SVM model, and more rigorous evaluations (using independent datasets) are needed. This is why we put the pan-cancer results into the supplementary materials. We have revised the manuscript to make this point clear (Page 9).
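
    The concern about a pooled model being dominated by the larger cancer types can be made concrete by reporting AUROC per cancer type next to the pooled value. The sketch below is only an illustration of that check: the records are made-up toy data, the function names are hypothetical, and AUROC is computed with the standard pairwise (Mann-Whitney) formulation rather than with any particular library.

```python
from collections import defaultdict


def auroc(labels, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return None  # undefined when a class is missing
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def auroc_by_group(records):
    """records: iterable of (cancer_type, label, score) tuples."""
    groups = defaultdict(lambda: ([], []))
    for cancer_type, label, score in records:
        groups[cancer_type][0].append(label)
        groups[cancer_type][1].append(score)
    return {c: auroc(y, s) for c, (y, s) in groups.items()}


# Toy predictions (made up): a larger, well-separated group and a smaller, noisier one.
records = [
    ("LUNG", 1, 0.9), ("LUNG", 1, 0.8), ("LUNG", 1, 0.7),
    ("LUNG", 0, 0.2), ("LUNG", 0, 0.1), ("LUNG", 0, 0.3),
    ("ESCA", 1, 0.6), ("ESCA", 1, 0.8),
    ("ESCA", 0, 0.7), ("ESCA", 0, 0.4),
]
pooled = auroc([y for _, y, _ in records], [s for _, _, s in records])
per_type = auroc_by_group(records)
print(pooled, per_type)
```

    In this toy case the pooled value sits well above the value for the smaller group, which is the pattern that per-type reporting is meant to expose.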

  2. eLife assessment:

    This study by Li et al describes an interesting attempt to predict the functional status of the p53 tumor suppressor in tumors where no DNA mutations in p53 could be identified. To this end, the authors employed SVM models to train the algorithm for the detection of 'p53 inactivation' features contrasting normal and tumor tissues. The approach could be a valuable tool for attributing tumors with unknown p53 status. The authors provide solid evidence supporting their findings and the concept of the study is solid, but in its current formulation, some of the bioinformatic analyses are incomplete, particularly related to the selection of associated genes and the potential mechanism(s).

  3. Reviewer #1 (Public Review):

    Li et al. have designed a study that examines specific mechanisms for how different DNA sequence variants in the common cancer gene p53 (also known as TP53) influence the sensitivity of tumors to a variety of common cancer treatments. Specifically, they examine a handful of p53 variants with respect to glioblastoma and its response to platinum-based chemotherapy and to radiation therapy. The authors begin by mentioning that looking at DNA variants in cancer is useful but also incomplete: methylation, PTMs, and non-DNA sequence variants can also be critical. They then mention that they have created a model showing that nearly all cancers with p53 mutations have loss-of-function variants and that many cancers with "normal" wildtype p53 in fact have variants causing LOF. These p53 LOF tumors lead to worse patient outcomes, but the authors here show that these tumors appear to be more susceptible to radiation and platinum-based chemotherapy, which they say they have validated in glioblastoma xenografts. This potentially opens up a new avenue for precision medicine for many different sources of cancer that share common p53 LOF variants.

    The authors have taken a modern approach towards cancer diagnosis and shown how this can improve targeted treatments across a large array of cancer types. They have provided a reasonably convincing proof of concept of this approach for n = 35 PDXs in one cancer type. By and large, the approach and results are reasonable, although many of the exact results concerning the genes and pathways identified that covary with the various treatments and p53 variants are unclear. For instance, the feature selection seems to be somewhat ad hoc, e.g. the method used to determine p53 LOF from p53 WT in the TCGA data was not the same method used for determining p53 LOF from p53 WT in the PDX data. The TCGA AUROCs were incredibly good - over 99% - versus more like 75% for the actual proof of concept. While any significant p-value is fine for basic research, it would be nice to know how this could be improved and bring the results in Figure 4 from ~75% to the >99% that would be necessary for use as a medical diagnostic or for treatment selection for precision medicine. However, there are significant questions regarding the specific findings uncovered: do the gene pathways identified through bioinformatic analysis fit in with the many highly-studied mechanistic roles of p53? Do the cohort selections - which vary by an order of magnitude in sample size, and come from different locations and different tissues - make statistical sense for cross-validation?

  4. Reviewer #2 (Public Review):

    The Tp53 gene is deemed as one of the most critical tumor suppressors in humans. Not surprisingly, the latter is found inactivated or mutated in the majority (if not all) of human cancers. The present study by Q. Li et al describes an attempt to predict the functional status of p53 in those tumors where no mutations on the DNA sequencing level were identified. To this end, the authors employed SVM models to train the algorithm for the detection of the 'p53 inactivation' features using normal and tumor tissues, respectively. It turned out that the 'p53 loss of function' phenotype was associated not with DNA methylation but rather with yet unknown mechanisms. Based on the fact that the p53LoF-containing tumors are similar to the p53 mutant-expressing ones with respect to platinum-based therapy, they subsequently used their SVM model on the glioblastoma samples to predict their chemosensitivity.