Individualized discovery of rare cancer drivers in global network context

Curation statements for this article:
  • Curated by eLife

    Evaluation Summary:

    In this work, Petrov and Alexeyenko present a novel network-based method, NEAdriver, aimed at identifying mutational driver genes (point mutations and copy number variants) across tumors. The authors evaluate ten large cancer cohorts and assess the overlap of their results with established cancer genes or datasets that are enriched for cancer genes. This manuscript addresses a topic of high interest in the cancer genomics community.

    (This preprint has been reviewed by eLife. We include the public reviews from the reviewers here; the authors also receive private feedback with suggested changes to the manuscript. Reviewer #2 agreed to share their name with the authors.)

This article has been Reviewed by the following groups

Abstract

Recent advances in genome sequencing have expanded the space of known cancer driver genes several-fold. However, most of this surge was based on computational analysis of somatic mutation frequencies and/or their impact on protein function. In contrast, experimental research necessarily accounted for the functional context of mutations, which interact with other genes and confer cancer phenotypes. Ultimately, it is such results that become the ‘hard currency’ of cancer biology. The new method, NEAdriver, employs the knowledge accumulated thus far, in the form of a global interaction network and functionally annotated pathways, to recover known driver genes and predict novel ones. Driver discovery was individualized by accounting for the co-occurrence of mutations in each tumour genome, as an alternative to summarizing information over whole cancer patient cohorts. For each somatic genome change, probabilistic estimates from two lanes of network analysis were combined into a joint likelihood of being a driver. Thus, the ability to detect previously unnoticed candidate driver events emerged from combining individual genomic context with a network perspective. The procedure was applied to the 10 largest cancer cohorts, and error rates were then evaluated against previous cancer gene sets. The discovered driver combinations were shown to be informative about cancer outcome. The analysis revealed driver genes with individually sparse mutation patterns that would not be detectable by other computational methods and that relate to cancer biology domains poorly covered by previous analyses. In particular, recurrent mutations of collagen, laminin, and integrin genes were observed in the adenocarcinoma and glioblastoma cancers. Considering constellation patterns of candidate drivers in individual cancer genomes opens a novel avenue for personalized cancer medicine.

Article activity feed

  1. Author Response

    Reviewer #3 (Public Review):

    The study by Petrov and colleagues examined whether rare cancer drivers can be examined in a network context. For this purpose, the authors develop a new computational tool that is based on two "channels" (MutSet and PathReg) to provide evidence on whether a gene might reflect a driver gene. Based on these channels, they evaluate ten large cancer cohorts and assess the overlap of their results with established cancer genes or datasets that are enriched for cancer genes. Based on this comparison, they find a strong enrichment for known cancer genes.

    In my opinion, the study addresses an important point. Indeed, many discovery algorithms have been based on mutational recurrence. While these strategies robustly identify the most frequently mutated cancer genes, they yield diminishing returns for rare driver genes, so that datasets several orders of magnitude larger would be required to identify them. Therefore, network-based analysis could be a useful way to identify rare driver genes, for instance, based on their interaction with canonical drivers. It could have an important impact on diagnostics and therapeutic decision making.

    While this idea is intriguing, it is not entirely novel. For more than a decade, mutation data in TCGA have been viewed in networks, and many previous studies have tried to identify driver genes based on networks. I think a critical point would be to compare the authors' method against these previous approaches and to demonstrate that it overcomes the limitations that previous studies reported in this field.

    Indeed, we included results from five network-based methods in the analysis (see the last paragraph of “Estimation of discovery rates”). While the results overlap significantly, we cannot comprehensively evaluate performance in terms of, e.g., false positive rates. Instead, it is the data context that distinguishes NEAdriver: it uses only mutation lists per sample, can work on individual samples, and does not require information on transcription, methylation, etc.

  2. Evaluation Summary:

    In this work, Petrov and Alexeyenko present a novel network-based method, NEAdriver, aimed at identifying mutational driver genes (point mutations and copy number variants) across tumors. The authors evaluate ten large cancer cohorts and assess the overlap of their results with established cancer genes or datasets that are enriched for cancer genes. This manuscript addresses a topic of high interest in the cancer genomics community.

    (This preprint has been reviewed by eLife. We include the public reviews from the reviewers here; the authors also receive private feedback with suggested changes to the manuscript. Reviewer #2 agreed to share their name with the authors.)

  3. Reviewer #3 (Public Review):

    The study by Petrov and colleagues examined whether rare cancer drivers can be examined in a network context. For this purpose, the authors develop a new computational tool that is based on two "channels" (MutSet and PathReg) to provide evidence on whether a gene might reflect a driver gene. Based on these channels, they evaluate ten large cancer cohorts and assess the overlap of their results with established cancer genes or datasets that are enriched for cancer genes. Based on this comparison, they find a strong enrichment for known cancer genes.

    In my opinion, the study addresses an important point. Indeed, many discovery algorithms have been based on mutational recurrence. While these strategies robustly identify the most frequently mutated cancer genes, they yield diminishing returns for rare driver genes, so that datasets several orders of magnitude larger would be required to identify them. Therefore, network-based analysis could be a useful way to identify rare driver genes, for instance, based on their interaction with canonical drivers. It could have an important impact on diagnostics and therapeutic decision making.

    While this idea is intriguing, it is not entirely novel. For more than a decade, mutation data in TCGA have been viewed in networks, and many previous studies have tried to identify driver genes based on networks. I think a critical point would be to compare the authors' method against these previous approaches and to demonstrate that it overcomes the limitations that previous studies reported in this field. Also, it was unclear to me whether the authors were able to achieve their goal of identifying genes based on network contexts, i.e., is there a new class of driver genes that can be identified with their approach and that could not be understood based on previous studies? Alternatively, could this method be expanded to predict phenotypes other than driver genes?

    In sum, the study provides a very interesting approach to the discovery of rare driver genes. The authors have invested a lot of work to perform many technical validation analyses of their approach.
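
The overlap with established cancer gene sets that this review refers to is often quantified with a hypergeometric (one-sided Fisher) test. The following is a minimal sketch of that generic test, not the procedure used in the manuscript; the function name and all numbers are hypothetical and chosen only for illustration.

```python
from math import comb


def hypergeom_enrichment_p(universe_size, reference_size, predicted_size, overlap):
    """Return P(X >= overlap) under random sampling without replacement.

    universe_size  -- number of genes that could have been predicted
    reference_size -- number of established cancer genes in that universe
    predicted_size -- number of genes the method reports as drivers
    overlap        -- predicted genes that are also in the reference set
    """
    max_overlap = min(reference_size, predicted_size)
    total = comb(universe_size, predicted_size)
    # math.comb returns 0 when k > n, so impossible terms drop out.
    tail = sum(
        comb(reference_size, k)
        * comb(universe_size - reference_size, predicted_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


# Hypothetical example: 20,000 genes, 500 known cancer genes,
# 200 predicted drivers, 30 of which are already known.
p_value = hypergeom_enrichment_p(20_000, 500, 200, 30)
print(f"enrichment p-value: {p_value:.3g}")
```

A small p-value here only says that the predicted list is enriched for known genes. It does not estimate a false positive rate among the novel predictions, which is the limitation the authors acknowledge in their response.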

  4. Reviewer #2 (Public Review):

    Petrov et al present NEAdriver, a network-based method aimed at the identification of mutational (point mutations and copy number variants) driver genes across tumors. This is a timely subject, which constitutes one of the main aims of cancer genomics.

    I have two main lines of criticism of the paper. The first concerns the algorithm itself, which I feel is not thoroughly explained and described. The second concerns the results, which I find insufficiently described and incompletely validated.

    My main concern about the algorithm is that, as far as I can tell, the authors don't correct for the background mutation rate of genes in the calculation of the MutSet. This may result in identifying false positive driver genes that carry an abnormally high number of mutations across cohorts because of known covariates of the mutation rate, such as replication timing or the level of transcription. Correctly estimating the background mutation rate of genes across tumors is a key tenet of methods that search for signals of positive selection in genes. For further explanation on this subject see https://doi.org/10.1038/nature12213, https://doi.org/10.1016/j.cell.2017.09.042. This may explain why known highly mutated non-driver genes such as TTN, RYR1 and others recurrently appear as significant across different cohorts. This issue is key for any method that uses mutation data to identify driver genes and must be addressed by the authors.
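
To illustrate the background-rate concern raised above, here is a minimal sketch of a simple Poisson test that compares a gene's observed mutation count with the count expected from its length and a gene-specific background rate. It is a deliberately simplified stand-in, not NEAdriver and not the models in the cited papers; the function names, rates, lengths, and counts are all hypothetical.

```python
from math import exp, lgamma, log


def expected_mutations(rate_per_base, coding_length, n_tumours):
    """Expected neutral mutation count for one gene across a cohort."""
    return rate_per_base * coding_length * n_tumours


def poisson_upper_tail(observed, expected):
    """Return P(X >= observed) for X ~ Poisson(expected), with expected > 0."""
    if observed <= 0:
        return 1.0
    # Sum the probability mass below `observed` in log space for stability.
    cdf_below = sum(
        exp(k * log(expected) - expected - lgamma(k + 1))
        for k in range(observed)
    )
    return max(0.0, 1.0 - cdf_below)


# Hypothetical long gene in a highly mutable region: many mutations,
# but close to what the background model predicts.
long_gene_expected = expected_mutations(2e-6, 100_000, 500)
p_long = poisson_upper_tail(110, long_gene_expected)

# Hypothetical short gene with a low background rate: few mutations,
# yet far more than expected.
short_gene_expected = expected_mutations(1e-6, 1_500, 500)
p_short = poisson_upper_tail(8, short_gene_expected)

print(f"long gene p = {p_long:.3g}, short gene p = {p_short:.3g}")
```

With these made-up inputs, the long gene is not significant despite its high raw count, whereas the short gene is. This is the kind of behaviour a background correction is meant to produce for long, frequently mutated genes.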

  5. Reviewer #1 (Public Review):

    Here, Petrov and Alexeyenko tackle one of the main questions in cancer genomics: given that there are tumours without any known driver mutations, how can we find novel, undiscovered cancer genes? For this purpose, they develop a method to identify these drivers that is not based on mutational frequency (which is commonly used) and instead relies on functional networks. The method seems interesting and potentially useful; however, at the moment it is hard to follow the details, as it is written very technically and, to this reader, is confusing at times. Additionally, some important details seem to be missing from the explanation of the methodology, and efforts to validate the results are lacking. In my opinion this can be a valuable addition to the literature, but it needs to be explained more clearly, and at least a few novel genes should be validated experimentally if possible (I am not saying the authors need to perform these experiments; the evidence could come from other papers in which these genes were found to play a relevant role).