Integrative analysis of large-scale loss-of-function screens identifies robust cancer-associated genetic interactions

Abstract

Genetic interactions, including synthetic lethal effects, can now be systematically identified in cancer cell lines using high-throughput genetic perturbation screens. Despite this advance, few genetic interactions have been reproduced across multiple studies and many appear highly context-specific. Here, by developing a new computational approach, we identified 220 robust driver-gene associated genetic interactions that can be reproduced across independent experiments and across non-overlapping cell line panels. Analysis of these interactions demonstrated that: (i) oncogene addiction effects are more robust than oncogene-related synthetic lethal effects; and (ii) robust genetic interactions are enriched among gene pairs whose protein products physically interact. Exploiting the latter observation, we used a protein–protein interaction network to identify robust synthetic lethal effects associated with passenger gene alterations and validated two new synthetic lethal effects. Our results suggest that protein–protein interaction networks can be used to prioritise therapeutic targets that will be more robust to tumour heterogeneity.

Article activity feed

  1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

    Reply to the reviewers

    Reviewer #1 (Evidence, reproducibility and clarity (Required)):

    **Summary:**

    Reproducibility of genetic interactions across studies is low. The authors identify reproducible genetic interactions and ask what the properties of robust genetic interactions are. They find that 1. oncogene addiction tends to be more robust than synthetic lethality and 2. genetic interactions among physically interacting proteins tend to be more robust. They then use protein-protein interactions (PPIs) to guide the detection of genetic interactions involving passenger gene alterations.

    **Major comments:**

    The claims of the manuscript are clear and well supported by computational analyses. My only concern is the influence of (study) bias on the observed enrichment of physical protein interactions among genetic interactions. 1. Due to higher statistical power, the approach described here favors genetic interactions involving frequently altered cancer genes (as acknowledged by the authors). 2. Also some of the libraries in the genetic screens might be biased towards better characterized screens. 3. PPI networks are highly biased towards well studied proteins (in which well studied proteins - in particular cancer-related proteins - are more likely to interact). The following tests would help to clarify if and to what extent these biases contribute to the described observations:

    Our response: We thank the reviewer for the positive assessment of our manuscript and have addressed the issue of study bias in response to the specific queries below.

    1. The authors should demonstrate that the PPI enrichment in reproducible vs non-reproducible genetic interactions is not solely due to the biased nature of PPI networks. One simple way of doing so would be to do the same analysis with a PPI network derived from a single screen (e.g. PMID: 25416956). I assume that due to the much lower coverage the effect will be largely reduced, but it would be reassuring to see a similar trend in addition to the networks on which the authors are already testing. Another way would be to use a randomized network (with the same degree distribution as the networks the authors are using and then picking degree matched random nodes) in which the observed effect should vanish.

    Our response: We appreciate the reviewer’s point and have now assessed both of the suggested approaches.

    The overlap with unbiased yeast two-hybrid (y2h) screens, even the recent HuRI dataset (Luck et al, Nature 2020), was too small in scale to draw any conclusions. Among the ~140,000 gene pairs tested for genetic interactions, only 51 overlap with y2h interactions. Two of the discovered genetic interactions were supported by a y2h interaction, and one of the robust genetic interactions was supported by a y2h interaction. While this is actually more than would be expected based on the overlap of interactions in the test space, the numbers are not especially convincing.
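    As a rough check of the "more than would be expected" statement, the expected number of y2h-supported hits can be approximated by multiplying the 51 overlapping pairs by the overall hit rates. The sketch below is a back-of-envelope illustration only: it assumes that the counts quoted later in this reply (142,477 pairs tested, 1,530 discovered, 229 validated) describe the same test space and that hits are spread uniformly over tested pairs. The variable names are illustrative.

```python
# Back-of-envelope expectation for y2h-supported genetic interactions.
# Assumption: hits are spread uniformly over all tested gene pairs.
tested_pairs = 142_477   # gene pairs tested (quoted as ~140,000 above)
discovered = 1_530       # significant in at least one discovery screen
validated = 229          # reproduced in a second screen
y2h_overlap = 51         # tested pairs that are also y2h interactions

expected_discovered = y2h_overlap * discovered / tested_pairs
expected_validated = y2h_overlap * validated / tested_pairs

print(f"expected discovered with y2h support: {expected_discovered:.2f} (observed 2)")
print(f"expected validated with y2h support:  {expected_validated:.2f} (observed 1)")
```

    Under these assumptions both observed counts exceed their expectations, but with counts this small the excess carries little statistical weight.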

    We therefore focused on two alternative assessments. We first compared our results with the network derived from the systematic AP-MS mapping of protein interactions in HEK293 cells (BioPlex 3.0, Huttlin et al, bioRxiv 2020). We restricted our analysis of genetic interactions to gene pairs that could conceivably be observed in the BioPlex dataset (i.e. between baits screened and preys expressed in HEK293T). We found that, although the numbers were small, the same pattern of enrichment was observed.

    This analysis has now been added to the revised manuscript as Supplementary Table S4 and Figure S3E.

    We next compared the results we observed with the real STRING protein-protein interaction network to 100 degree-matched randomisations of this network. We observed that the number of discovered and validated genetic interactions observed using the real STRING interaction network was greater than that observed using the randomised networks. With this in mind, we have now revised the manuscript to state:

    ‘Previous work has demonstrated that the protein-protein interaction networks aggregated in databases are subject to significant ascertainment bias – some genes are more widely studied than others and this can result in them having more reported protein-protein interaction partners than other genes (Rolland et al., 2014). As cancer driver genes are studied more widely than most genes, they may be especially subject to this bias. To ensure the observed enrichment of protein-protein interactions among genetically interacting pairs was not simply due to this ascertainment bias, we compared the results observed for the real STRING protein-protein interaction network with 100 degree-matched randomised networks and again found that there was a higher than expected overlap between protein-protein interactions and both discovered and validated genetic interactions (Supplemental Fig. S4).’

    __Supplemental Figure S4. Genetic interactions are more enriched in real protein-protein interaction networks than randomised networks.__ Histograms showing the overlap between 100 degree-matched randomisations of the STRING medium-confidence protein-protein interaction network and discovered (a and b) and validated (c and d) genetic interactions. The observed overlap with the real STRING protein-protein interaction network is highlighted with orange lines.
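    To illustrate what a comparison against degree-matched randomised networks can look like, the sketch below shuffles a toy network with double-edge swaps, which keep every node's degree fixed, and then compares the observed overlap with the overlaps in 100 shuffled networks. Double-edge swapping is one common way to build such a null model and is used here as an assumption; the randomisation procedure actually applied to STRING is not specified in this reply. Gene names, function names and data are hypothetical.

```python
import random

def degree_preserving_shuffle(edges, n_swaps, seed=0):
    """Randomise an undirected simple graph by double-edge swaps.

    Each successful swap replaces edges (a, b) and (c, d) with (a, d) and
    (c, b), so every node keeps its original degree.
    """
    rng = random.Random(seed)
    edge_list = sorted({tuple(sorted(e)) for e in edges})
    edge_set = set(edge_list)
    done, attempts = 0, 0
    while done < n_swaps and attempts < 100 * n_swaps:
        attempts += 1
        i, j = rng.sample(range(len(edge_list)), 2)
        a, b = edge_list[i]
        c, d = edge_list[j]
        if rng.random() < 0.5:
            c, d = d, c  # consider both orientations of the second edge
        new1, new2 = tuple(sorted((a, d))), tuple(sorted((c, b)))
        # Reject swaps that would create self-loops or duplicate edges.
        if a == d or c == b or new1 in edge_set or new2 in edge_set:
            continue
        edge_set -= {edge_list[i], edge_list[j]}
        edge_set |= {new1, new2}
        edge_list[i], edge_list[j] = new1, new2
        done += 1
    return edge_list

def overlap(ppi_edges, gi_pairs):
    """Count genetic-interaction pairs that are also edges in the PPI network."""
    ppi_set = {tuple(sorted(e)) for e in ppi_edges}
    return sum(tuple(sorted(p)) in ppi_set for p in gi_pairs)

# Toy data (hypothetical gene names, not taken from the manuscript).
ppi = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E"),
       ("E", "F"), ("A", "F"), ("B", "E")]
genetic_interactions = [("A", "B"), ("C", "D"), ("B", "E")]

observed = overlap(ppi, genetic_interactions)
null = [overlap(degree_preserving_shuffle(ppi, n_swaps=20, seed=s),
                genetic_interactions)
        for s in range(100)]
# Empirical one-sided p-value with the usual +1 correction.
p_value = (1 + sum(n >= observed for n in null)) / (1 + len(null))
print(observed, p_value)
```

    Because the shuffled networks keep each gene's number of partners, an excess overlap in the real network cannot be explained simply by some genes having many reported interactions.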

    2. What's the expected number of robust genetic interactions involving passenger gene alterations? Is it surprising to identify 11 interactions? This question could be addressed with some sort of randomization test: When selecting (multiple times) 47,781 non-interacting random pairs between the 2,972 passenger genes and 2,149 selectively lethal genes, how many of those pairs form robust genetic interactions?

    Our response: We have now addressed this as follows:

    “At an FDR of 20% we found 11 robust genetic interactions involving passenger gene alterations (Supplemental Table S6). To assess whether this is more than would be expected by chance we randomly sampled 47,781 gene pairs from the same search space 100 times. The median number of robust genetic interactions identified amongst these randomly sampled gene pairs was 1 (mean 1.27, min 0, max 6), suggesting that the 11 robust genetic interactions observed among protein-protein interacting pairs were more than would be expected by chance.”
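    The logic of this sampling test can be sketched on toy data as follows. This is an illustration under stated assumptions, not the analysis code: the search space is far smaller than the real one, "robust" status is a precomputed lookup rather than the outcome of a discovery and validation pipeline, and all names and sizes are hypothetical. The toy PPI-supported set is constructed to contain 11 robust pairs so that the comparison mirrors the one described above.

```python
import random
import statistics

def sampled_hit_counts(search_space, robust_pairs, sample_size, n_rounds, seed=0):
    """Count robust interactions in repeated random samples of gene pairs."""
    rng = random.Random(seed)
    counts = []
    for _ in range(n_rounds):
        sample = rng.sample(search_space, sample_size)
        counts.append(sum(pair in robust_pairs for pair in sample))
    return counts

# Toy search space (hypothetical): 60 passenger genes x 50 target genes.
passengers = [f"P{i}" for i in range(60)]
targets = [f"T{i}" for i in range(50)]
search_space = [(p, t) for p in passengers for t in targets]  # 3,000 pairs

# Pretend 1% of all pairs are robust genetic interactions.
rng = random.Random(1)
robust_pairs = set(rng.sample(search_space, 30))
non_robust = [pair for pair in search_space if pair not in robust_pairs]

# Toy "PPI-supported" set of 300 pairs, built to contain 11 robust pairs.
ppi_pairs = sorted(robust_pairs)[:11] + non_robust[:289]
observed = sum(pair in robust_pairs for pair in ppi_pairs)

# Null distribution: 100 random samples of the same size as the PPI set.
counts = sampled_hit_counts(search_space, robust_pairs,
                            sample_size=len(ppi_pairs), n_rounds=100)
p_value = (1 + sum(c >= observed for c in counts)) / (1 + len(counts))

print("observed:", observed)
print("null median:", statistics.median(counts), "null max:", max(counts))
print("empirical p-value:", p_value)
```

    With 100 rounds the smallest attainable empirical p-value is 1/101, so more rounds would be needed to resolve smaller values.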

    **Minor comments:**

    Two additional analyses would add in my opinion value to the manuscript:

    -The authors state that reasons for irreproducibility of genetic interactions are of technical or biological nature. Is it possible to disentangle the contribution of the two factors given the available data? E.g. how many genetic interactions are reproducible in two different screening platforms using the same cell line vs how similar are results of screens from two different cell lines in the same study?

    __Our response:__ We are also very interested in this question, but with the available data, we are not confident that we could draw solid conclusions.

    -The authors state that "some of the robust genetic dependencies could be readily interpreted using known pathway structures" and argue that they recover for example MAPK or Rb pathway relationships. Is this a general trend? Do genes forming robust genetic interactions have a higher tendency to be in the same pathway as opposed to different pathways?

    __Our response:__ We have now systematically tested the robust genetic interactions for each driver gene for enrichment in specific pathways. Relevant text is as follows:

    ‘To test if this enrichment of pathway members among the robust dependencies associated with specific driver genes was a common phenomenon, for each driver gene with at least three dependencies we asked if these dependencies were enriched in specific signalling pathways (see Methods). Of the twelve driver genes tested, we found that the dependencies of five were enriched in specific pathways, and in all five cases the driver gene itself was also annotated as a member of the most enriched pathway (Table SX). As expected, *RB1* (most enriched pathway ‘G1 Phase’) and *BRAF* (most enriched pathway ‘Negative feedback regulation of MAPK pathway’) were among the five driver genes, alongside *PTEN* (‘PI3K/AKT activation’), *CDKN2A* (‘Cell cycle’), and *NRAS* (‘Ras signaling pathway’).’

    Details in the methods are as follows:

    ‘Pathway enrichment was assessed using gProfiler (Raudvere et al., 2019) with KEGG (Kanehisa et al., 2017) and Reactome (Jassal et al., 2020) as annotation databases and the selectively lethal genes as the background list.’
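    Over-representation of this kind is commonly scored with a one-sided hypergeometric test against the chosen background. The sketch below shows that calculation in plain Python to make the role of the background list concrete. It is a generic illustration and does not call gProfiler; the pathway size, query size and overlap in the example are hypothetical, and only the background size of 2,421 selectively lethal genes comes from this reply.

```python
from math import comb

def hypergeom_enrichment_p(background_size, pathway_size, query_size, overlap):
    """P(X >= overlap) when query_size genes are drawn without replacement
    from a background in which pathway_size genes belong to the pathway."""
    upper = min(pathway_size, query_size)
    total = comb(background_size, query_size)
    tail = sum(
        comb(pathway_size, k)
        * comb(background_size - pathway_size, query_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total

# Hypothetical example: a driver gene with 10 robust dependencies, 4 of which
# fall in a pathway that has 40 members among 2,421 background genes.
p = hypergeom_enrichment_p(background_size=2_421, pathway_size=40,
                           query_size=10, overlap=4)
print(f"enrichment p-value: {p:.2e}")
```

    Restricting the background to the selectively lethal genes matters because a pathway that is over-represented among all selectively lethal genes would otherwise appear enriched for every driver gene.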

    I think the pathway topic could be in general better exploited: e.g. does pathway (relative) position play a role?

    __Our response:__ We agree that pathway position, especially distance from the driver gene in an ordered pathway, would be very interesting to tease out, but we don’t think that current pathway annotations are reliable enough, nor the set of robust genetic interactions large enough, to analyse this properly.

    Reviewer #1 (Significance (Required)):

    Personalized cancer medicine aims at the identification of patient-specific vulnerabilities that allow cancer cells to be targeted in the context of a specific genotype. Many oncogenic mutations cannot be targeted with drugs directly. The identification of genetic interactions is therefore of crucial importance. Unfortunately, genetic interactions show little reproducibility across studies. The authors make an important contribution to understanding which factors contribute to this reproducibility, thereby providing a means to identify more reliable genetic interactions with high potential for clinical exploitation or involving passenger gene alterations (which are otherwise harder to detect for statistical reasons).

    REFEREES CROSS COMMENTING

    Reviewer 2 raises a few valid points, which if addressed would certainly increase the clarity of the paper. In particular addressing the first point (the self interactions of tumor suppressors) seems important to me. From what I can see all of reviewer 2's comments can be addressed easily.

    End of Reviewer 1 comments

    Reviewer #2 (Evidence, reproducibility and clarity (Required)):

    In this manuscript, Lord et al. describe the analysis of loss-of-function (LOF) screens in cancer cell lines to identify robust (i.e., technically reproducible and shared across cell lines) genetic dependencies. The authors integrate data from 4 large-scale LOF studies (DRIVE, AVANA, DEPMAP and SCORE) to estimate the reproducibility of their individual findings and examine their agreement with other types of functional information, such as physical binding. The main conclusions from the analyses are that: a) oncogene-driven cancer cell lines are more sensitive to the inhibition of the oncogene itself than any other gene in the genome; b) robust genetic interactions (i.e., those observed in multiple datasets and cell lines driven by the same oncogene/tumour suppressor) are enriched for gene pairs encoding physically interacting proteins.

    **Main comments:**

    I think this study is well designed, rigorously conducted and clearly explained. The conclusions are consistent with the results and I don't have any major suggestions for improving their support. I do, however, have a few suggestions for clarifying the message.

    Our response: We thank the reviewer for this positive assessment of our manuscript and have addressed the requests for clarity below.

    -Could the authors provide some intuitive explanation (or speculation) about the 2 observed cases of tumour suppressor "addiction" (TP53 and CDKN2A)? While the oncogene addiction cases are relatively easy to interpret, the same effects on tumour suppressors are less clear. Is it basically an epistatic effect, which looks like a relative disadvantage? For example, if we measure fitness: TP53-wt = 1, TP53-wt + CRISPR-TP53 = 1.5, TP53-mut = 1.5, TP53-mut + CRISPR-TP53 = 1.5. That is, inhibiting TP53 in TP53 mutant cells appears to be disadvantageous (relative to WT) only because inhibiting TP53 in wild-type cells is advantageous?

    __Our response:__ The reviewer is correct – the TP53 / TP53 dependency is similar to an epistatic effect. In a TP53 mutant background, targeting TP53 with shRNA or CRISPR has a neutral effect, while in a TP53 wild type background, targeting TP53 with shRNA or CRISPR often causes an increase in cell growth. We have clarified this in the text below (new text in bold).

    ‘We also identified two (2/9) examples of ‘self vs. self’ dependencies involving tumour suppressors: *TP53* (aka p53) and CDKN2A (aka p16/p14arf) (Supplemental Fig. S2c). This type of relationship has previously been reported for TP53: *TP53* inhibition appears to offer a growth advantage to TP53 wild type cells but not to *TP53* mutant cells (Giacomelli et al., 2018). __Inhibiting TP53 in TP53 mutant cells has a largely neutral effect, while on average inhibiting TP53 in TP53 wild type cells actually increases growth.__ Consequently, we observed an association between TP53 status and sensitivity to TP53 inhibition. A similar effect was observed for CDKN2A, although the growth increase resulting from inhibiting CDKN2A in wild-type cells is much lower than that seen for TP53 (Supplemental Fig. S2c).’
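    The reviewer's numerical example can be worked through directly. The short sketch below uses only the four illustrative fitness values given in the reviewer's comment; the dictionary keys and function name are hypothetical labels chosen for readability.

```python
# Fitness values from the reviewer's illustrative example (relative units).
fitness = {
    ("wt", "control"): 1.0,
    ("wt", "TP53_knockout"): 1.5,
    ("mut", "control"): 1.5,
    ("mut", "TP53_knockout"): 1.5,
}

def knockout_effect(status):
    """Fold change in fitness caused by targeting TP53 in a given background."""
    return fitness[(status, "TP53_knockout")] / fitness[(status, "control")]

effect_wt = knockout_effect("wt")    # knockout increases growth
effect_mut = knockout_effect("mut")  # knockout is neutral
relative = effect_mut / effect_wt    # below 1: mutant lines look more sensitive

print(effect_wt, effect_mut, round(relative, 3))
```

    The knockout is neutral in the mutant background, yet the mutant lines appear relatively disadvantaged because the comparison is made against wild type lines in which the knockout increases growth.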

    -In the analysis of overlap between genetic and physical interactions, the result should be presented more precisely. Currently, the text reads "when considering the set of all gene pairs tested, gene pairs whose protein products physically interact were more likely to be identified as significant genetic interactors". However, the referenced figure (Fig. 5a) shows an orthogonal perspective: relative to all gene pairs tested, those that have a significant genetic interaction are more likely to have a physical interaction as well. In other words, in the text, we are comparing the relative abundance of genetic interactions in 2 sets: tested and physically interacting. However, in the figure, we are comparing the relative abundance of protein interactions in 2 sets -- tested and genetically interacting. The odds ratio and the p-values stay the same but the result would be clearer if the figure matched the description in the text.

    __Our response:__ Because genetic interactions are rare (~1% of all gene pairs tested have a discovered genetic interaction, ~0.1% have a validated genetic interaction), it is hard to convey the enrichment effectively. This is demonstrated in the figure below – it’s clear that there are more discovered / validated genetic interaction pairs among the protein-protein interaction pairs, but the scale is hard to appreciate:

    Focusing only on the discovered/validated genetic interactions makes the picture a little clearer but does not effectively show that the discovered pairs themselves are enriched among protein-protein interaction pairs.

    As we feel the original figures convey the main message most effectively, we have altered the text rather than the images as follows:

    “We found that, when considering the set of all gene pairs tested, gene pairs identified as significant genetic interactors in at least one dataset are more likely to encode proteins that physically interact (Fig. 5a)”
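    The reviewer's remark that the odds ratio is the same from either perspective follows from the symmetry of a 2x2 contingency table: transposing the table swaps the two off-diagonal cells but leaves their product unchanged. The sketch below demonstrates this with hypothetical counts, chosen only to be roughly in line with the ~1% discovery rate mentioned above.

```python
def odds_ratio(table):
    """Odds ratio of a 2x2 table [[a, b], [c, d]]."""
    (a, b), (c, d) = table
    return (a * d) / (b * c)

# Hypothetical counts (not from the manuscript).
# Columns: PPI, no PPI
table = [
    [60, 1_440],       # genetic interaction discovered
    [1_940, 139_000],  # no genetic interaction
]
transposed = [list(col) for col in zip(*table)]

# Perspective of the figure: share of PPI pairs among GI vs non-GI pairs.
ppi_given_gi = table[0][0] / sum(table[0])
ppi_given_no_gi = table[1][0] / sum(table[1])

# Perspective of the original text: share of GI pairs among PPI vs non-PPI pairs.
gi_given_ppi = transposed[0][0] / sum(transposed[0])
gi_given_no_ppi = transposed[1][0] / sum(transposed[1])

print(f"P(PPI | GI) = {ppi_given_gi:.3f} vs P(PPI | no GI) = {ppi_given_no_gi:.3f}")
print(f"P(GI | PPI) = {gi_given_ppi:.3f} vs P(GI | no PPI) = {gi_given_no_ppi:.3f}")
print(odds_ratio(table), odds_ratio(transposed))
```

    The two pairs of proportions differ in scale, which is why one view can be easier to read in a figure, while the odds ratio and any test based on the table are identical.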

    **Minor comments:**

    There are a few places where a more explicit explanation would improve the readability of the manuscript.

    -Page 5: The multiple regression model used to identify genetic interactions is briefly mentioned in the text (and described more extensively in the methods). I think it would be better to explicitly describe the dependent and independent variables of the model in the text, so that the reader can intuitively understand what is being estimated.

    Our response: We have added additional information to the main text as follows:

    ‘This model included tissue type, microsatellite instability and driver gene status as independent variables and gene sensitivity score as the dependent variable (Methods). Microsatellite instability was included as a covariate as it has previously been shown to be associated with non-driver gene specific dependencies (Behan et al., 2019), while tissue type was included to avoid confounding by tissue type.’
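    The structure of such a model can be sketched on toy data as below. This is a minimal illustration, not the model fitted in the manuscript: the cell lines and scores are invented and noise-free, tissue type is reduced to a single indicator, lower scores are assumed to mean greater sensitivity, and the least-squares solver is a small hand-written routine used only to keep the example self-contained.

```python
def ols(X, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y,
    solved with Gauss-Jordan elimination. Suitable for toy examples only."""
    n, p = len(X), len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(X[r][i] * X[r][j] for r in range(n)) for j in range(p)]
         + [sum(X[r][i] * y[r] for r in range(n))]
         for i in range(p)]
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        A[col] = [v / A[col][col] for v in A[col]]
        for r in range(p):
            if r != col:
                factor = A[r][col]
                A[r] = [vr - factor * vc for vr, vc in zip(A[r], A[col])]
    return [A[i][p] for i in range(p)]

# Hypothetical cell lines: (tissue, MSI status, driver mutated, sensitivity score).
cell_lines = [
    ("lung", 0, 0, -0.20), ("lung", 0, 1, -1.00),
    ("lung", 1, 0, -0.50), ("lung", 1, 1, -1.30),
    ("colon", 0, 0, -0.10), ("colon", 0, 1, -0.90),
    ("colon", 1, 0, -0.40), ("colon", 1, 1, -1.20),
]

# Independent variables: intercept, tissue indicator, MSI, driver gene status.
X = [[1.0, float(tissue == "colon"), float(msi), float(driver)]
     for tissue, msi, driver, _ in cell_lines]
# Dependent variable: gene sensitivity score.
y = [score for *_, score in cell_lines]

intercept, b_tissue, b_msi, b_driver = ols(X, y)
print(f"driver gene status coefficient: {b_driver:.2f}")
```

    The coefficient on driver gene status is the quantity of interest: it estimates the association between the driver alteration and sensitivity after accounting for tissue type and microsatellite instability. A statistics package would additionally provide standard errors and p-values, which this sketch omits.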

    -Page 5: "Using this approach, we tested 142,477 potential genetic dependencies..." -- could the authors provide a better explanation of where that number is coming from? E.g., 142,477 = ... driver genes x 2470 selectively lethal genes?

    Our response: Because not every selectively lethal gene is tested in every dataset (e.g. DRIVE only screened ~8,000 genes instead of the whole genome), the 142,477 number does not correspond to a simple multiplication of the number of driver genes by the number of selectively lethal genes. However, we have added additional information in bold as follows:

    ‘Using this approach, we tested 142,477 potential genetic dependencies between 61 driver genes and 2,421 selectively lethal genes. We identified 1,530 dependencies that were significant in at least one discovery screen (Fig. 2a, Supplemental Fig. S1). All 61 driver genes had at least one dependency that was significant in at least one discovery screen while less than half of the selectively lethal genes (1,141 / 2,421) had a significant association with a driver gene. Of the 1,530 dependencies that were significant in at least one discovery screen, only 229 could be validated in a second screen (Supplemental Table S3, Fig. 2a). For example, in the AVANA dataset TP53 mutation was associated with resistance to inhibition of both MDM4 and CENPF, but only the association with MDM4 could be validated in a second dataset (Fig. 2b, 2c). Similarly, in the DEPMAP dataset *NRAS* mutation was associated with increased sensitivity to the inhibition of both NRAS itself and ERP44, but only the sensitivity to inhibition of NRAS could be validated in a second dataset (Fig. 2b, 2c).

    The 229 reproducible dependencies involved 31 driver genes and 204 selectively lethal genes.’
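    The figures quoted in this passage can be cross-checked with simple arithmetic. The sketch below uses only numbers stated in the passage; reading the difference between the full grid and the tested count as pairs that could not be tested is an interpretation based on the explanation given in the response above.

```python
drivers, lethal_genes = 61, 2_421
tested, discovered, validated = 142_477, 1_530, 229
lethal_with_hit = 1_141

# Full grid: every driver gene paired with every selectively lethal gene.
upper_bound = drivers * lethal_genes
not_tested = upper_bound - tested

print(f"full grid: {upper_bound:,} pairs; not tested: {not_tested:,}")
print(f"selectively lethal genes with a significant association: "
      f"{lethal_with_hit / lethal_genes:.1%}")
print(f"discovered dependencies that validated: {validated / discovered:.1%}")
print(f"tested pairs with a discovered dependency: {discovered / tested:.2%}")
```

    The full grid is slightly larger than the number of pairs tested, which is consistent with some genes being absent from some screens, and the discovery rate of roughly 1% matches the figure quoted elsewhere in this reply.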

    -Page 5: Repeating the number of findings of each type would help understanding the landscape of the genetic dependencies (suggested numbers in brackets): "Of the (229?) reproducible genetic dependencies nine were 'self vs self' associations". "The majority (7/9?) of these ... were oncogene addiction effects". "We also identified 2 (2/9?) examples of 'self vs self' dependencies involving tumour suppressors".

    __Our response:__ We have taken the reviewer’s advice and added these figures to the main text for clarity.

    -Page 12: "Three of these interactions involve genes frequently deleted with the tumour suppressor CDKN2A (CDKN2B and MTAP) and mirror known associations with CDKN2A". It is not clear what "mirror" means -- do they recapitulate known interactions?

    Our response: Yes, we meant to indicate that they recapitulate known CDKN2A interactions and have now replaced ‘mirror’ with ‘recapitulate’.

    -Page 15: "Although we have not tested them here, other features predictive of between-species conservation may also be predictive of robustness to genetic heterogeneity" -- could the authors explicitly list the features?

    __Our response:__ We have now explicitly listed these features as follows:

    “Previous work has also shown that genetic interactions between gene pairs involved in the same biological process, as indicated by annotation to the same gene ontology term, are more highly conserved across species (Ryan et al., 2012; Srivas et al., 2016). Similarly, genetic interactions that are stable across experimental conditions (e.g. that can be observed in the presence and absence of different DNA damaging agents) are more likely to be conserved across species (Srivas et al., 2016). Although we have not tested them here, these additional features predictive of between-species conservation may also be predictive of robustness to genetic heterogeneity.”

    Reviewer #2 (Significance (Required)):

    The identification of a significant overlap between genetic and physical interactions in cancer cell lines is an interesting and promising observation that will help in understanding known genetic dependencies and predicting new ones. However, similar observations have been made in other organisms and biological systems. These past studies should be referenced to provide a historical perspective and help define further analyses in the cancer context. In particular, studies in yeast S. cerevisiae have shown that not only is there a general overlap between genetic interactions (both positive and negative) and physical interactions, but at least 2 additional features are informative about the relationship: a) the relative strength of genetic interactions and b) the relative density of physical interactions (i.e., isolated interaction vs protein complexes). Here's a sample of relevant studies: 1) von Mering et al., Nature, 2002; 2) Kelley & Ideker, Nat Biotechnol, 2005; 3) Bandyopadhyay et al., PLOS Comput Biol, 2008; 4) Ulitsky et al., Mol Syst Biol, 2008; 5) Baryshnikova et al., Nat Methods, 2010; 6) Costanzo et al., Science, 2010; 7) Costanzo et al., Science, 2016.

    Similar observations have also been made in mammalian systems: e.g., in mouse fibroblasts (Roguev et al., Nat Methods, 2013) and K562 leukemia cells (Han et al., Nat Biotech, 2017). I don't think that past observations negate the novelty of this manuscript. The analysis presented here is more focused and more comprehensive as it is based on a large integrated dataset and is driven by a series of specific hypotheses. However, a reference to previous publications should be made.

    As a frame of reference: my expertise is in high-throughput genetics of model organisms, including mapping and analyzing genetic interactions.

    __Our response:__ We thank the reviewer for highlighting this point.

    We have attempted to provide better context for our work in the discussion as follows:

    ‘In budding and fission yeast, multiple studies have shown that genetic interactions are enriched among protein-protein interaction pairs and vice-versa (Costanzo et al., 2010; Kelley and Ideker, 2005; Michaut et al., 2011; Roguev et al., 2008). Pairwise genetic interaction screens in individual mammalian cell lines have also revealed an enrichment of genetic interactions among protein-protein interaction pairs (Han et al., 2017; Roguev et al., 2013). Our observation that discovered genetic interactions are enriched in protein-protein interaction pairs is consistent with these studies. However, these studies have not revealed what factors influence the conservation of genetic interactions across distinct genetic backgrounds, i.e. what predicts the robustness of a genetic interaction. In yeast, the genetic interaction mapping approach relies on mating gene deletion mutants and consequently the vast majority of reported genetic interactions are observed in a single genetic background (Tong et al., 2001). In mammalian cells, pairwise genetic interaction screens across multiple cell lines have revealed differences across cell lines but not identified what factors influence the conservation of genetic interactions across cell lines (Shen et al., 2017). While variation of genetic interactions across different strains or different genetic backgrounds has been poorly studied, previous work has analysed the conservation of genetic interactions across species and shown that genetic interactions between gene pairs whose protein products physically interact are more highly conserved (Roguev et al., 2008; Ryan et al., 2012; Srivas et al., 2016). Our analysis here suggests that the same principles may be used to identify genetic interactions conserved across genetically heterogeneous tumour cell lines.’

  2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

    Referee #2

    Evidence, reproducibility and clarity

    In this manuscript, Lord et al. describe the analysis of loss-of-function (LOF) screens in cancer cell lines to identify robust (i.e., technically reproducible and shared across cell lines) genetic dependencies. The authors integrate data from 4 large-scale LOF studies (DRIVE, AVANA, DEPMAP and SCORE) to estimate the reproducibility of their individual findings and examine their agreement with other types of functional information, such as physical binding. The main conclusions from the analyses are that: a) oncogene-driven cancer cell lines are more sensitive to the inhibition of the oncogene itself than any other gene in the genome; b) robust genetic interactions (i.e., those observed in multiple datasets and cell lines driven by the same oncogene/tumour suppressor) are enriched for gene pairs encoding physically interacting proteins.

    Main comments:

    I think this study is well designed, rigorously conducted and clearly explained. The conclusions are consistent with the results and I don't have any major suggestions for improving their support. I do, however, have a few suggestions for clarifying the message.

    -Could the authors provide some intuitive explanation (or speculation) about the 2 observed cases of tumour suppressor "addiction" (TP53 and CDKN2A)? While the oncogene addiction cases are relatively easy to interpret, the same effects on tumour suppressors are less clear. Is it basically an epistatic effect, which looks like a relative disadvantage? For example, if we measure fitness: TP53-wt = 1, TP53-wt + CRISPR-TP53 = 1.5, TP53-mut = 1.5, TP53-mut + CRISPR-TP53 = 1.5. That is, inhibiting TP53 in TP53 mutant cells appears to be disadvantageous (relative to WT) only because inhibiting TP53 in wild-type cells is advantageous?

    -In the analysis of overlap between genetic and physical interactions, the result should be presented more precisely. Currently, the text reads "when considering the set of all gene pairs tested, gene pairs whose protein products physically interact were more likely to be identified as significant genetic interactors". However, the referenced figure (Fig. 5a) shows an orthogonal perspective: relative to all gene pairs tested, those that have a significant genetic interaction are more likely to have a physical interaction as well. In other words, in the text, we are comparing the relative abundance of genetic interactions in 2 sets: tested and physically interacting. However, in the figure, we are comparing the relative abundance of protein interactions in 2 sets -- tested and genetically interacting. The odds ratio and the p-values stay the same but the result would be clearer if the figure matched the description in the text.

    Minor comments:

    There are a few places where a more explicit explanation would improve the readability of the manuscript.

    -Page 5: The multiple regression model used to identify genetic interactions is briefly mentioned in the text (and described more extensively in the methods). I think it would be better to explicitly describe the dependent and independent variables of the model in the text, so that the reader can intuitively understand what is being estimated.

    -Page 5: "Using this approach, we tested 142,477 potential genetic dependencies..." -- could the authors provide a better explanation of where that number is coming from? E.g., 142,477 = ... driver genes x 2470 selectively lethal genes?

    -Page 5: Repeating the number of findings of each type would help understanding the landscape of the genetic dependencies (suggested numbers in brackets): "Of the (229?) reproducible genetic dependencies nine were 'self vs self' associations". "The majority (7/9?) of these ... were oncogene addiction effects". "We also identified 2 (2/9?) examples of 'self vs self' dependencies involving tumour suppressors".

    -Page 12: "Three of these interactions involve genes frequently deleted with the tumour suppressor CDKN2A (CDKN2B and MTAP) and mirror known associations with CDKN2A". It is not clear what "mirror" means -- do they recapitulate known interactions?

    -Page 15: "Although we have not tested them here, other features predictive of between-species conservation may also be predictive of robustness to genetic heterogeneity" -- could the authors explicitly list the features?

    Significance

    The identification of a significant overlap between genetic and physical interactions in cancer cell lines is an interesting and promising observation that will help in understanding known genetic dependencies and predicting new ones. However, similar observations have been made in other organisms and biological systems. These past studies should be referenced to provide a historical perspective and help define further analyses in the cancer context. In particular, studies in yeast S. cerevisiae have shown that not only is there a general overlap between genetic interactions (both positive and negative) and physical interactions, but at least 2 additional features are informative about the relationship: a) the relative strength of genetic interactions and b) the relative density of physical interactions (i.e., isolated interaction vs protein complexes). Here's a sample of relevant studies: 1) von Mering et al., Nature, 2002; 2) Kelley & Ideker, Nat Biotechnol, 2005; 3) Bandyopadhyay et al., PLOS Comput Biol, 2008; 4) Ulitsky et al., Mol Syst Biol, 2008; 5) Baryshnikova et al., Nat Methods, 2010; 6) Costanzo et al., Science, 2010; 7) Costanzo et al., Science, 2016.

    Similar observations have also been made in mammalian systems: e.g., in mouse fibroblasts (Roguev et al., Nat Methods, 2013) and K562 leukemia cells (Han et al., Nat Biotech, 2017). I don't think that past observations negate the novelty of this manuscript. The analysis presented here is more focused and more comprehensive as it is based on a large integrated dataset and is driven by a series of specific hypotheses. However, a reference to previous publications should be made.

    As a frame of reference: my expertise is in high-throughput genetics of model organisms, including mapping and analyzing genetic interactions.

    REFEREES CROSS COMMENTING

    I agree with the questions raised by reviewer #1. And I think the authors should be able to address them (either through analyses or reasoning) within 1-3 months.

  3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


    Referee #1

    Evidence, reproducibility and clarity

    Summary:

    Reproducibility of genetic interactions across studies is low. The authors identify reproducible genetic interactions and ask the question of what are properties of robust genetic interactions. They find that 1. oncogene addiction tends to be more robust than synthetic lethality and 2. genetic interactions among physically interacting proteins tend to be more robust. They then use protein-protein interactions (PPIs) to guide the detection of genetic interactions involving passenger gene alterations.

    Major comments:

    The claims of the manuscript are clear and well supported by computational analyses. My only concern is the influence of (study) bias on the observed enrichment of physical protein interactions among genetic interactions. 1. Because of its higher statistical power, the approach described here favors genetic interactions involving frequently altered cancer genes (as acknowledged by the authors). 2. Some of the libraries in the genetic screens might also be biased towards better characterized genes. 3. PPI networks are highly biased towards well studied proteins (in which well studied proteins, in particular cancer-related proteins, are more likely to interact). The following tests would help to clarify whether and to what extent these biases contribute to the described observations:
    1. The authors should demonstrate that the PPI enrichment in reproducible vs non-reproducible genetic interactions is not solely due to the biased nature of PPI networks. One simple way of doing so would be to repeat the analysis with a PPI network derived from a single screen (e.g., PMID: 25416956). I assume that, because of the much lower coverage, the effect will be largely reduced, but it would be reassuring to see a similar trend in addition to the networks on which the authors are already testing. Another way would be to use a randomized network (with the same degree distribution as the networks the authors are using, and then picking degree-matched random nodes), in which the observed effect should vanish.

    2. What's the expected number of robust genetic interactions involving passenger gene alterations? Is it surprising to identify 11 interactions? This question could be addressed with some sort of randomization test: When selecting (multiple times) 47,781 non-interacting random pairs between the 2,972 passenger genes and 2,149 selectively lethal genes, how many of those pairs form robust genetic interactions?
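
The two randomization controls requested above could be sketched roughly as follows. This is a minimal illustration only, not the authors' pipeline: the function names (`rewire_preserving_degree`, `sample_pairs`, `empirical_p_value`, `degree_counts`) and the toy inputs are hypothetical, and the step that decides whether a sampled pair forms a robust genetic interaction is assumed to be supplied by the existing analysis. Degree-preserving randomization is implemented here with double-edge swaps, which is one common choice among several.

```python
import random
from collections import Counter


def degree_counts(edges):
    """Return the number of edges touching each node."""
    counts = Counter()
    for u, v in edges:
        counts[u] += 1
        counts[v] += 1
    return counts


def rewire_preserving_degree(edges, n_swaps, seed=0):
    """Randomize an undirected network with double-edge swaps.

    Each accepted swap turns edges (a, b), (c, d) into (a, d), (c, b),
    so every node keeps its original degree. Assumes no self-loops.
    """
    rng = random.Random(seed)
    edge_list = sorted({tuple(sorted(e)) for e in edges})  # drop duplicates
    edge_set = set(edge_list)
    done, attempts = 0, 0
    while done < n_swaps and attempts < 100 * n_swaps:
        attempts += 1
        i, j = rng.sample(range(len(edge_list)), 2)
        a, b = edge_list[i]
        c, d = edge_list[j]
        if rng.random() < 0.5:  # pick one of the two possible pairings
            c, d = d, c
        new1, new2 = tuple(sorted((a, d))), tuple(sorted((c, b)))
        # Reject swaps that would create self-loops or duplicate edges.
        if a == d or c == b or new1 in edge_set or new2 in edge_set:
            continue
        edge_set -= {edge_list[i], edge_list[j]}
        edge_set |= {new1, new2}
        edge_list[i], edge_list[j] = new1, new2
        done += 1
    return edge_list


def sample_pairs(left, right, excluded, n_pairs, rng):
    """Draw n_pairs distinct (left, right) pairs not present in `excluded`.

    `excluded` is a set of frozensets, e.g. the known PPI pairs.
    Assumes n_pairs is well below the number of admissible pairs.
    """
    chosen = set()
    while len(chosen) < n_pairs:
        pair = (rng.choice(left), rng.choice(right))
        if pair[0] != pair[1] and frozenset(pair) not in excluded:
            chosen.add(pair)
    return chosen


def empirical_p_value(observed, null_counts):
    """One-sided empirical p-value with the usual +1 correction."""
    exceed = sum(1 for c in null_counts if c >= observed)
    return (exceed + 1) / (len(null_counts) + 1)


# Toy network, for illustration only.
toy_ppi = [("A", "B"), ("C", "D"), ("E", "F"), ("A", "C"), ("B", "D"), ("E", "A")]
randomized = rewire_preserving_degree(toy_ppi, n_swaps=20, seed=1)
```

For the first control, the enrichment analysis would be repeated on many outputs of `rewire_preserving_degree`. For the second, each set returned by `sample_pairs` would be passed through the existing robustness test, and the resulting counts compared with the observed 11 interactions using `empirical_p_value`.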

    Minor comments:

    Two additional analyses would add in my opinion value to the manuscript:

    -The authors state that the reasons for irreproducibility of genetic interactions are of a technical or biological nature. Is it possible to disentangle the contribution of the two factors given the available data? E.g., how many genetic interactions are reproducible in two different screening platforms using the same cell line vs how similar are the results of screens from two different cell lines in the same study?

    -The authors state that "some of the robust genetic dependencies could be readily interpreted using known pathway structures" and argue that they recover, for example, MAPK or Rb pathway relationships. Is this a general trend? Do genes forming robust genetic interactions have a higher tendency to be in the same pathway as opposed to different pathways? I think the pathway topic could in general be better exploited: e.g., does (relative) pathway position play a role?

    Significance

    Personalized cancer medicine aims at the identification of patient-specific vulnerabilities that allow cancer cells to be targeted in the context of a specific genotype. Many oncogenic mutations cannot be targeted with drugs directly. The identification of genetic interactions is therefore of crucial importance. Unfortunately, genetic interactions show little reproducibility across studies. The authors make an important contribution to understanding which factors contribute to this reproducibility and thereby provide a means of identifying more reliable genetic interactions with high potential for clinical exploitation, including those involving passenger gene alterations (which are otherwise harder to detect for statistical reasons).

    REFEREES CROSS COMMENTING

    Reviewer 2 raises a few valid points, which if addressed would certainly increase the clarity of the paper. In particular addressing the first point (the self interactions of tumor suppressors) seems important to me. From what I can see all of reviewer 2's comments can be addressed easily.