Most cancers carry a substantial deleterious load due to Hill-Robertson interference

Curation statements for this article:
  • Curated by eLife

    Evaluation Summary:

    Cancers have frequently been found to show little evidence for purifying selection in their patterns of mutations. The key observation here is that tumors with low mutation burden show compelling evidence of efficient selection, but that tumors with high mutation burden do not. This is an important finding. The broader implication is that high mutation load tumors carry a substantial deleterious mutation load and may use common strategies to tolerate them, possibly providing a therapeutic target. Overall this work makes important observational and conceptual contributions to cancer genomics.

    (This preprint has been reviewed by eLife. We include the public reviews from the reviewers here; the authors also receive private feedback with suggested changes to the manuscript. Reviewer #1 and Reviewer #3 agreed to share their name with the authors.)

This article has been Reviewed by the following groups

Abstract

Cancer genomes exhibit surprisingly weak signatures of negative selection (Martincorena et al., 2017; Weghorn, 2017). This may be because selective pressures are relaxed or because genome-wide linkage prevents deleterious mutations from being removed (Hill-Robertson interference; Hill and Robertson, 1966). By stratifying tumors by their genome-wide mutational burden, we observe negative selection (dN/dS ~ 0.56) in low mutational burden tumors, while remaining cancers exhibit dN/dS ratios ~1. This suggests that most tumors do not remove deleterious passengers. To buffer against deleterious passengers, tumors upregulate heat shock pathways as their mutational burden increases. Finally, evolutionary modeling finds that Hill-Robertson interference alone can reproduce patterns of attenuated selection and estimates the total fitness cost of passengers to be 46% per cell on average. Collectively, our findings suggest that the lack of observed negative selection in most tumors is not due to relaxed selective pressures, but rather the inability of selection to remove deleterious mutations in the presence of genome-wide linkage.

Article activity feed

  2. Reviewer #1 (Public Review):

    Tilk et al. investigate why cancer genomes show weak negative selection. They set out to differentiate between two scenarios: either selective pressures are relaxed during tumor progression, or selection is inefficient because evolution without recombination creates genome-wide linkage and thus interference among mutations, known as Hill-Robertson interference. They calculate dN/dS for driver and passenger mutations in 50 cancer types with different levels of genome-wide mutational burden and show that, in low mutational burden tumors, passenger mutations exhibit negative selection and driver mutations show positive selection. The strength of selection weakens as mutational burden increases. The finding that selection on passenger mutations is weak in high mutational burden tumors is novel. The authors show that the same holds for somatic copy number aberrations containing drivers versus passengers. Clonal mutations show stronger selection than subclonal mutations. The accumulation of deleterious passenger mutations is buffered by upregulated expression of genes encoding chaperones and the proteasome. The authors conclude that Hill-Robertson interference can largely explain the weakened selection in drivers and passengers, a conclusion also supported by their evolutionary model. They estimate that drivers confer a fitness advantage of 130% and passengers a fitness cost of 40%, giving cancer cells a net fitness advantage of 90%. This is an elegant study, and the manuscript is well written and logical. However, some aspects of the analyses require clarification.

    1. Figure panels should be called out sequentially. For example, Fig. 2G is called out before Fig. 2D. This happens throughout the text, including main and supplementary figures, and should be corrected.
    2. Fig. 2G shows that mean gene expression of genes encoding chaperones and the proteasome increases with increasing mutational burden. What about protein abundance? Is this in agreement with gene expression?
    3. Fig. 2 mentions error bars in the figure legend, but no panel displays error bars. This is also true for Fig. S13 and other figures. Authors should display the error bars to which they are referring to make their analysis more convincing.
    4. Pg. 9 line 295 describes results of the analysis across genes belonging to different GO terms. However, Fig. S13 only shows 3 categories: chromosome segregation, transcription and translation. How were these categories chosen? What about other categories? Such cherry picking does not convincingly support the conclusion that no specific GO functions are enriched. Also, translational regulation shows higher dN/dS in low mutation tumors, suggesting that there is positive selection for passengers in this category. The authors should discuss in their manuscript why this is the case.
    5. Fig. S15 shows the attenuation in selection of CNAs across cancer subtypes and broad cancer groups. However, HNSC and kidney cancer appear to be the exceptions. Authors should provide an explanation for these observations in the main text.
    6. Generally, copy number variations are considered to be > 50 bp. Is there a rationale as to why authors chose 100 kb to be their cut-off in Fig. 2C? If the size of CNA is an important parameter, then authors should explain why that is.
    7. Non-allelic recombination and non-homologous recombination mechanisms involving replication accidents that lead to chromosome breakage occur with some frequency in somatic cells. How does the frequency of these events impact the selection efficiency in cancer as it relates to drivers and passengers? Can this also be incorporated in their evolutionary model?
    8. Authors mentioned that haploinsufficiency was not used in the model. What about loss of heterozygosity which is extensive in cancer genomes? Can this parameter be included in the evolutionary model and how would it impact the results?

  3. Reviewer #2 (Public Review):

    In this work, the authors aimed to investigate the dependence of estimated strengths of negative and positive selection in human cancer tumors on the mutation rate. The underlying hypothesis is that interference selection affects tumor evolution by reducing the efficacy of selection when mutation rates are high enough. The authors present analyses of DNA sequencing data from tumor whole-exome and whole-genome experiments combined with ABC modeling of tumor evolution and inference of the tumor mode of evolution.

    The authors make use of the largest existing DNA sequencing datasets for human tumors. They apply their version of an established statistic to estimate selection, dN/dS, and derive a proxy measure for tumor mutation rate, namely the total number of nonsynonymous + synonymous mutations. The authors also repeat their analysis using a different method that computes dN/dS. They find that tumors with three or fewer protein-coding-sequence mutations show a significantly reduced dN/dS ratio, compatible with negative selection. This is a rare occasion of the detection of a signal of negative selection in cancer. The authors continue to demonstrate that, according to their ABC simulations, the main cause for attenuated negative selection in all other mutation count bins is Muller's ratchet evolutionary dynamics rather than hitchhiking of deleterious passenger mutations. They then speculate that the positive selection signal is also primarily affected by deleterious passenger mutations. I think the paper contributes an important and timely piece of analysis to a topic of intensive research in cancer genomics. At the same time, some of the methodology might need to be revised, and the conclusions drawn about the tumor mode of evolution are not fully convincing.
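The stratification described above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' pipeline: the function name `pooled_dnds_by_bin`, the toy counts, the bin edges, and the fixed 3:1 neutral nonsynonymous:synonymous ratio are all assumptions made for the example. Established dN/dS methods estimate the neutral expectation from mutational-context models rather than from a constant.

```python
from collections import defaultdict

# Assumption: a fixed neutral nonsynonymous:synonymous ratio of about 3:1.
NEUTRAL_N_TO_S = 3.0


def pooled_dnds_by_bin(tumors, bin_edges):
    """Pool (n, s) counts across tumors within each n+s bin, then take the ratio.

    tumors    -- iterable of (n, s) mutation counts, one pair per tumor
    bin_edges -- ascending inclusive upper bounds on the burden n+s
    Tumors whose burden exceeds the last edge are ignored.
    """
    pooled = defaultdict(lambda: [0, 0])
    for n, s in tumors:
        burden = n + s  # proxy for mutation rate used in the review
        for edge in bin_edges:
            if burden <= edge:
                pooled[edge][0] += n
                pooled[edge][1] += s
                break
    result = {}
    for edge in bin_edges:
        n_total, s_total = pooled[edge]
        if s_total > 0:  # ratio is undefined without synonymous mutations
            result[edge] = (n_total / s_total) / NEUTRAL_N_TO_S
    return result


# Hypothetical toy data, chosen only to exercise the function.
toy_tumors = [(1, 1), (2, 1), (1, 0), (30, 10), (45, 15), (290, 100)]
print(pooled_dnds_by_bin(toy_tumors, bin_edges=[3, 100, 1000]))
```

Pooling counts before taking the ratio means that a tumor with no synonymous mutations still contributes to its bin, which is relevant to the exclusion question raised in point 3 below.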

    1. The fundamental hypothesis about cancer evolutionary dynamics that the authors put forward depends on mutation rates (i.e. number of mutations per cell division), which cannot be measured per se. Mutation rate is instead approximated by the total number of observed protein-coding mutations (nonsynonymous + synonymous), which I will denote by n+s. To first approximation, n+s is proportional to the product of the mutation rate and the number of cell divisions that took place before the tumor was sequenced. The latter is dependent on many factors, most importantly patient age and tissue-specific cell division rates per year. These confounding factors of using n+s as a proxy for mutation rate have not been accounted for. However, at least patient age is a readily available piece of information for most if not all tumor samples. Estimates for tissue-specific stem cell turnover rates (which will be highly variable across different n+s bins due to the heterogeneous cancer type composition) also exist, but might be subject to larger uncertainties.

    2. As the authors mention, not every cancer type is represented in each n+s bin. Following up on the previous point, which cancer types are enriched in the bins that exhibit the negative selection signal? Are they on average similar to the cancer types in the remainder of the n+s range with respect to expected tissue-specific mutation rate and age distribution?

    3. The authors state that they are excluding tumors with either n=0 or s=0. Since nonsynonymous and synonymous variants occur in a ratio of about 3:1, this exclusion of tumors will lead to an inflation of the signal of negative selection in the first bins.
    - First, in l. 184 they write that in each n+s bin and for a given gene set, all mutations in the two functional categories (nonsynonymous, synonymous) were first pooled across tumors and then the ratio was taken. Is this correct? In that case, I do not see the need to exclude tumors with n=0 or s=0 from the analysis.
    - Second, the authors state that in Fig. 5D they have investigated the impact of this on the main result and shown that it is negligible, yet Fig. 5D looks substantially different from Fig. 2A. Most surprisingly, the negative selection signal appears to have even increased compared to Fig. 2A. In addition, Fig. 5D shows that there is a striking signal of negative selection across the entire range of n+s, which seems to even increase for very high n+s (similarly in Fig. 2D as well as in Figs. S6, 8). This appears to be incompatible with the main result of the paper (and Fig. 2A) that negative selection is overwhelmed by interference selection effects for high n+s. Similarly, is there a good explanation for why even cancer driver genes exhibit significant negative selection for high n+s in Fig. 5D?

    4. In my opinion, the most compelling argument for the decline of dN/dS on driver genes is ascertainment bias: We never observe tumor cells that did not at least have some selective growth advantage compared to a healthy cell. If I understood correctly, the authors argue that the shape of the decline of dN/dS in drivers cannot be explained by ascertainment bias, as this would lead to a steeper decline. Instead, they conclude that the effect is driven primarily by deleterious passenger load. However, there is a multitude of factors influencing the derivation of the curve. These include, as the authors also mention, the possibility that the pan-cancer set of used driver genes contains many genes on which mutations are in fact neutral passengers for the cancer types in a given n+s bin. It is well known that what constitutes a driver is highly tissue-specific (see e.g. the COSMIC cancer gene census). In addition, several cancer types are partially driven by translocation or fusion events, which would not be picked up in an approach based on point mutations. Hence, to exclude the ascertainment bias hypothesis as the primary explanatory factor for the shape of the function dN/dS(n+s) on driver genes in favor of deleterious passenger load does not seem reasonable to me.

    5. Further related to the point above, in l. 967 the authors state that driver mutations increase with n+s and argue that this is indicative of the necessity to compensate deleterious passengers. However, e.g. in Martincorena et al. (2017), it is shown that the rate of acquisition of driver mutations decreases with increasing mutation load, i.e. their number increases sublinearly with n+s, compatible with the multi-hit model.
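The concern in point 3, that excluding tumors with n=0 or s=0 inflates the apparent negative selection in the lowest bins, can be illustrated with an exact calculation under a deliberately simple neutral model. The sketch below assumes that every mutation is independently nonsynonymous with probability 0.75 (the approximate 3:1 ratio quoted above) and that dN/dS is the pooled n/s ratio divided by 3. The function name `pooled_dnds` and both assumptions are illustrative and are not taken from the manuscript.

```python
import math


def pooled_dnds(total, p_nonsyn=0.75, exclude_zero=True):
    """Expected pooled dN/dS for neutral tumors that each carry `total` mutations.

    Assumption: n ~ Binomial(total, p_nonsyn), with no selection at all.
    If exclude_zero is True, tumors with n == 0 or s == 0 are dropped
    before the counts are pooled.
    """
    exp_n = 0.0  # expected nonsynonymous count contributed per tumor
    exp_s = 0.0  # expected synonymous count contributed per tumor
    for n in range(total + 1):
        s = total - n
        if exclude_zero and (n == 0 or s == 0):
            continue
        prob = math.comb(total, n) * p_nonsyn**n * (1 - p_nonsyn) ** s
        exp_n += prob * n
        exp_s += prob * s
    if exp_s == 0:
        return math.nan  # no tumor survives the filter (e.g. total == 1)
    neutral_ratio = p_nonsyn / (1 - p_nonsyn)
    return (exp_n / exp_s) / neutral_ratio


for total in (2, 3, 10, 50):
    print(total, pooled_dnds(total), pooled_dnds(total, exclude_zero=False))
```

Under these assumptions the expected pooled value is 1/3 for tumors with two mutations and about 0.47 for tumors with three, and it approaches 1 as the burden grows, even though no selection is modeled. Without the exclusion the value is 1 at every burden. This shows only the direction of the bias the reviewer describes; it does not say how large the effect is in the authors' data.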

  4. Reviewer #3 (Public Review):

    The authors explored the net patterns of selection in cancers as measured from tumor:normal exome and whole genome sequencing data. They found that by stratifying tumors on total mutation load, tumors with a low mutation burden exhibited net diversifying selection on previously identified oncogenic driver genes and net purifying selection on non-driver genes. Somewhat counter-intuitively both of these patterns decayed with increasing total mutation burden to the point where for tumors with the highest mutation burden, no net selection signals were identifiable. These findings were replicated using two dN/dS based approaches (with distinct means of defining the null expectation) and also using structural rearrangements as an orthogonal approach. The findings seem well demonstrated.

    The proposed explanation for these observations is that of Hill-Robertson interference, where the (almost) perfect linkage disequilibrium of the whole genome in a clonally expanding population of cells provides little opportunity to separate mutations of opposing fitness effects, leading to the accumulation of deleterious mutations without opportunity for their removal by selection. An important implication of this conclusion is that tumors, particularly those with a high mutation load, carry a high burden of deleterious mutations.

    The modelling of clonal evolution demonstrates that Hill-Robertson-like processes can in principle explain the decay of selection signals with a high mutation burden, though by the authors' own admission the models have lax parameter constraints and are gross simplifications of reality. As a proof of principle this modelling seems sufficient, and the estimated fitness effects are appropriately qualified as "highly provisional".

    The authors present the up-regulation of heat-shock/chaperone/protein-degradation pathways as a plausible mechanism through which cancers could manage the accumulation of many deleterious mutations and provide correlative evidence for increased expression of such genes in tumors with higher mutation burdens (Fig 2G). By considering only one such scenario the authors are perhaps placing too much emphasis on that one mechanistic hypothesis for (amino acid changing) mutational tolerance. Other plausible mechanisms include suppression of epitope presentation (adaptive immune evasion), replication stress etc.

    Understanding that tumors carry substantial deleterious mutation loads, together with some preliminary quantitative estimates of those loads, will be of broad interest to cancer genomics and to wider fields. The preprint is already being cited and found to be useful. The work also raises an important question: what are the main mechanisms employed to tolerate that deleterious mutation load? If there are predominant mechanisms, such as the proposed protein-misfolding response, they become interesting targets for therapeutic suppression in a broad spectrum of cancers.