Expression-Driven Genetic Dependency Reveals Targets for Precision Medicine
This article has been Reviewed by the following groups
- Evaluated articles (GigaScience)
Abstract
Cancer cells are heterogeneous: each harbors distinct molecular aberrations and depends on different genes for its survival and proliferation. While successful targeted therapies have been developed based on driver DNA mutations, many patient tumors lack druggable mutations and have limited treatment options. Here, we hypothesize that new precision oncology targets may be identified through “expression-driven dependency”, whereby cancer cells with high expression of a targeted gene are more vulnerable to the knockout of that gene. We introduce a Bayesian approach, BEACON, to identify such targets by jointly analyzing global transcriptomic and proteomic profiles with genetic dependency data of cancer cell lines across 17 tissue lineages. BEACON identifies known druggable genes, e.g., BCL2, ERBB2, EGFR, ESR1, MYC, while revealing new targets confirmed by both mRNA- and protein-expression driven dependency. Notably, the identified genes show an overall 3.8-fold enrichment for approved drug targets and enrich for druggable oncology targets by 7 to 10-fold. We experimentally validate that the depletion of GRHL2, TP63, and PAX5 effectively reduces tumor cell growth and survival in their dependent cells. Overall, we present the catalog of expression-driven dependency targets as a resource for identifying novel therapeutic targets in precision oncology.
Article activity feed
This work has been peer reviewed in GigaScience (see https://doi.org/10.1093/gigascience/giag011), which carries out open, named peer-review. These reviews are published under a CC-BY 4.0 license and were as follows:
Reviewer 4
Reproducibility report for: Expression-Driven Genetic Dependency Reveals Targets for Precision Oncology
Journal: GigaScience
ID number/DOI: GIGA-D-25-00147
Reviewer(s): Laura Caquelin, Department of Clinical Neuroscience, Karolinska Institutet, Sweden
- Summary of the Study: The authors developed a Bayesian method called BEACON to integrate multi-omics data. The method was tested on cancer cell lines across 17 tissue types to identify expression-driven dependencies. The method recovered known drug targets and identified novel candidates. The study concludes that this method provides a systematic approach to identifying precision oncology targets.
- Scope of reproducibility: According to our assessment, the primary objective is to identify expression-driven dependencies across cancer cell lines from multiple lineages, enabling the discovery of genes whose expression levels correlate with cancer cell dependency scores.
- Outcome: Identification of genes with significant expression-driven dependencies across pan-lineage cancer cell lines.
- Analysis method outcome: "BEACON calculated the Bayesian correlation between the gene's expressions and CERES cancer dependency scores [25] across the pan-lineage cell lines. BEACON modeled expression levels and dependency scores as the bivariate Gaussians and used Markov Chain Monte Carlo (MCMC) sampling to estimate the correlation coefficient rho between them. Given the null hypothesis that the uncorrelated expression and dependency of a gene has the 0 rho coefficient, we statistically tested each gene's rho estimate obtained from the MCMC simulation as follows. Assume that the MCMC sampling is carried out for a null gene's expression and dependency, then we expect that the distribution of the rho estimate accumulated over the MCMC iterations will be centered at zero. Based on this rationale, we computed the z-score of i-th gene as the deviation of the MCMC estimate of rho from the expected (null) value (i.e., zero) in terms of the standard deviation observed in the simulated distribution, i.e., z(i) = rho_MCMC(i) / SD_MCMC(i). Since the z-values, by nature, follow a normal distribution with zero-mean and unit-variance, then we computed the p-value for each gene's rho estimate as the probability of observing a value as extreme as the computed z-value for that gene. We multi-testing corrected the resulting p-values using the BH procedure for FDR." (page 19 - Methods section / mRNA expression-driven dependency (GED))
- Main result: "We first analyzed the pan-lineage GED by using mRNA levels and the corresponding dependency scores from 854 cell lines with available data across 17 lineages and identified 244 genes showing significant association (correlation coefficient, rho < -0.25, FDR < 0.05)" (page 7 - Results section / Cancer vulnerability targets showing gene expression-driven dependency (GED))
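The testing procedure quoted above can be sketched in a few lines of R. This is an illustration only, not the authors' script: the data frame `post`, its column names, and the numbers in it are hypothetical, and the two-sided p-value is an assumption about what "as extreme as" means in the quoted Methods text.

```r
# Hypothetical posterior summaries for five genes (illustrative numbers only,
# not taken from Supplementary Table S2).
post <- data.frame(
  gene     = c("geneA", "geneB", "geneC", "geneD", "geneE"),
  rho_mean = c(-0.45, -0.30, -0.10, 0.02, -0.27),
  rho_sd   = c(0.05, 0.06, 0.07, 0.07, 0.15),
  stringsAsFactors = FALSE
)

# z-score: deviation of the rho estimate from the null value (zero),
# in units of the standard deviation of the MCMC samples.
post$z <- post$rho_mean / post$rho_sd

# Two-sided p-value under a standard normal null (assumed to be two-sided).
post$p <- 2 * pnorm(-abs(post$z))

# Benjamini-Hochberg correction across genes.
post$fdr <- p.adjust(post$p, method = "BH")

# Thresholds quoted in the main result: rho < -0.25 and FDR < 0.05.
hits <- post[post$rho_mean < -0.25 & post$fdr < 0.05, ]
print(hits)
```

Because the rho estimate itself enters the filter, exporting `rho_mean` next to the z-value and p-values would let a reader check both thresholds directly.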
- Availability of Materials
a. Data
- Data availability: Open
- Data completeness: Complete, all data necessary to reproduce main results are available.
- Access Method: Repository
- Repository: https://doi.org/10.6084/m9.figshare.19700056.v2
- Data quality: Structured
b. Code
- Code availability: Open
- Programming Language(s): R
- Repository link: https://github.com/Huang-lab/BEACON
- License: MIT license
- Repository status: Public
- Documentation: Readme file
- Computational environment of reproduction analysis
- Operating system for reproduction: MacOS 15.5
- Programming Language(s): R
- Code implementation approach: Using shared code
- Version environment for reproduction: R version 4.5.0/RStudio 2025.05.1
- Results
5.1 Original study results
- Results 1: Supplementary table S2
5.2 Steps for reproduction
-> Run the code PanLineageMCMC.R
- Issue 1: File import paths and incorrect file name -- Resolved: The original code used hardcoded file paths that only worked on one specific computer, so the script failed on other machines. To fix this, I recommended using relative paths built from the current working directory. This way, the code can be run on any computer without changing the paths each time.
------------------ Start of script ------------------
sam.dep = read.csv(file.path(getwd(), "DepMap_data", "sample_info.csv"))
------------------- End of script -------------------
Issue 2: Missing function "intsect" at line 162 -- Resolved: The script called a function intsect that was not defined, leading to an error. Upon request, the authors provided the missing function and added it to the main script (PanLineageMCMC.R).
Issue 3: Output directory not created. -- Resolved: The script attempted to write output files to a directory that was not created beforehand. This caused errors during the loop execution when trying to save results. A directory check and automatic creation script was added. If the output folder does not exist, it is now created automatically before the loop runs.
------------------ Start of script ------------------
dir_path <- paste0('../out/jags.nadapt',n.adapt,'.update',n.update,'.mcmc ',n.iter,'.simulation_SD_22Q2')
if (!dir.exists(dir_path)) {
  dir.create(dir_path, recursive = TRUE)
}
------------------- End of script -------------------
5.3 Statistical comparison Original vs Reproduced results
- Results: Table.mRNA.dependency.Bayesian.pancancer file attached
- Comments: The Bayesian PanCancer analysis was re-run, but only on the 244 significant genes listed in Supplementary Table S2, not on the full set of 17,285 genes. This choice was made due to limited computational resources, as running the full model would have required an estimated 100 hours.
- Errors detected: -
- Statistical Consistency: The reproduced analysis confirmed the statistical significance of all 244 genes originally reported. However, the exact numerical values (mean, standard deviation, z-value, p-value and adjusted p-value) differed slightly. These discrepancies are expected given the stochastic nature of MCMC-based Bayesian inference, the absence of a random seed, and the relatively low number of MCMC iterations used (n.iter = 500). These settings may not be sufficient to ensure full convergence or reproducibility of posterior estimates, so the estimates should be interpreted with caution. We were unable to compare the rho values because they were neither available in the provided Supplementary Table S2 nor extracted in the R code for inclusion in the resulting output files.
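The run-to-run variation described above can be illustrated with a small stand-in sampler. The sketch below is not the JAGS model used by BEACON: it is a plain random-walk Metropolis sampler for rho, assuming standardized data, a flat prior on (-1, 1), and simulated inputs. The names `loglik_rho` and `sample_rho` and all settings other than n.iter = 500 are hypothetical.

```r
# Log-likelihood of rho for standardized x, y under a zero-mean,
# unit-variance bivariate Gaussian.
loglik_rho <- function(rho, x, y) {
  n <- length(x)
  -n / 2 * log(1 - rho^2) -
    (sum(x^2) - 2 * rho * sum(x * y) + sum(y^2)) / (2 * (1 - rho^2))
}

# Random-walk Metropolis sampler for rho with a flat prior on (-1, 1).
# Returns the mean and SD of the retained samples.
sample_rho <- function(x, y, n_iter = 500, n_burn = 100, step = 0.1) {
  x <- as.numeric(scale(x))
  y <- as.numeric(scale(y))
  draws <- numeric(n_iter)
  cur <- 0
  cur_ll <- loglik_rho(cur, x, y)
  for (i in seq_len(n_iter)) {
    prop <- cur + rnorm(1, mean = 0, sd = step)
    if (abs(prop) < 1) {                      # proposals outside (-1, 1) are rejected
      prop_ll <- loglik_rho(prop, x, y)
      if (log(runif(1)) < prop_ll - cur_ll) {
        cur <- prop
        cur_ll <- prop_ll
      }
    }
    draws[i] <- cur
  }
  kept <- draws[-seq_len(n_burn)]             # discard burn-in
  c(mean = mean(kept), sd = sd(kept))
}

# Simulated expression/dependency pair with a true correlation of -0.5.
set.seed(1)
x <- rnorm(100)
y <- -0.5 * x + sqrt(1 - 0.5^2) * rnorm(100)

set.seed(101); run1 <- sample_rho(x, y)   # first run
set.seed(202); run2 <- sample_rho(x, y)   # different seed: slightly different summaries
set.seed(101); run3 <- sample_rho(x, y)   # same seed as run1: identical summaries

print(rbind(run1, run2, run3))
```

With only 500 iterations, the mean and SD move a little between seeds, and the z-value and p-value derived from them move with them; fixing the seed removes that source of difference.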
- Conclusion
Summary of the computational reproducibility review
The results in Supplementary Table S2 of the original study were partially reproduced. We were able to confirm the statistical significance of the 244 genes reported in Supplementary Table S2 using the Bayesian PanCancer model in the provided code. However, the numerical results were not always identical. This is expected because Bayesian methods involve random sampling, the original code did not set a fixed random seed, and the number of iterations used was relatively low. Furthermore, the rho values were not available for comparison, limiting a full reproducibility assessment. Several technical issues were also fixed during the reproduction process, namely hardcoded file paths, a missing function, and a missing output directory, so that the code could run correctly on a different system. Due to computational limitations, the full model was not run on all 17,285 genes.
Recommendations for authors
While the original analysis code was successfully used to confirm the statistical significance of the 244 genes, we recommend several improvements to enhance reproducibility:
-- Code annotation: Adding more detailed comments within the scripts would help users understand the logic behind each step and the purpose of specific commands or operations.
-- Set a random seed: Include set.seed() in all scripts to improve reproducibility across different runs.
-- Specify R and package versions: Provide the R version and exact package versions needed to run the code, via a requirements file for example.
-- Use relative file paths: Ensure that all necessary folders and functions are created or included by default to avoid path issues.
-- Increase MCMC robustness: Use a higher number of iterations and appropriate parameter settings to ensure better convergence and stability of posterior estimates.
-- Inform users about computation time: Clearly indicate in the README or publication the expected runtime of the code, especially if it requires several hours or days to complete.
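Two of these recommendations can be sketched briefly. The output directory name in the reproduced script suggests that sampling is done with JAGS; assuming it is called through the rjags package, one explicit way to pin the sampler is to pass a generator name and seed per chain in the initial values. The helper `make_jags_inits`, the seed value, and the file names below are hypothetical, and the rjags call is left commented out because the model file and data are placeholders.

```r
# Build one inits list per chain with an explicit JAGS RNG name and seed.
make_jags_inits <- function(n_chains, seed) {
  lapply(seq_len(n_chains), function(i) {
    list(.RNG.name = "base::Mersenne-Twister", .RNG.seed = seed + i)
  })
}

set.seed(2025)  # seeds R's own generator for any R-side randomness
inits <- make_jags_inits(n_chains = 3, seed = 2025)

# Hypothetical use with rjags (not run here; "model.bug" and `dat` are placeholders):
# model <- rjags::jags.model("model.bug", data = dat, inits = inits,
#                            n.chains = 3, n.adapt = 1000)

# Record the R version and loaded package versions next to the results.
info_file <- file.path(tempdir(), "sessionInfo.txt")
writeLines(capture.output(sessionInfo()), info_file)
```

Shipping the saved session information, or a lockfile from a tool such as renv, with the repository would cover the version recommendation.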
Reviewer 3
The authors develop a method for correlating gene and protein expression with cellular dependencies using the resources of DepMap. The innovation appears to be a Bayesian approach to the correlation analysis. They use this approach to identify potential therapeutic targets and evaluate some top candidates using in vitro experiments. The paper is fairly straightforward to follow.
Major comments:
Benchmarking - given the non-linear relationships shown in Fig 2, is a comparison with the Pearson method the most appropriate? Would a Spearman's be better?
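As a minimal sketch of the comparison raised here, both coefficients can be computed with base R on the same data. The simulated expression and dependency values below are hypothetical and only mimic a monotone but non-linear relationship; they are not DepMap data.

```r
set.seed(42)

# Hypothetical expression values and a monotone, non-linear dependency score.
expr <- runif(200, min = 0, max = 6)
dep  <- -exp(expr) / 50 + rnorm(200, sd = 0.2)

# Pearson measures linear association; Spearman measures monotone (rank) association.
pearson  <- cor(expr, dep, method = "pearson")
spearman <- cor(expr, dep, method = "spearman")

print(c(pearson = pearson, spearman = spearman))
```

A large gap between the two values on real gene-level data would indicate that the benchmark result depends on the linearity assumption.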
The analysis identifies dependencies that are proposed as therapeutic targets; however, while the proteins can be druggable, what about normal tissue effects? Some of these are likely lineage-defining proteins that could be highly expressed in normal tissues. It is notable that in Fig 5B, C the existing drug targets have a lower association strength than other GEDs identified. Does this suggest that the strongest correlations might be lineage-crucial genes that are too important for normal tissue function to make good drug targets? This needs further consideration in the discussion. Are there any pathway differences between these groups (known drug targets vs others)? For example, you might expect more tissue lineage TFs in the "other" category, while the approved drug targets are perhaps more often cell surface receptors.
The cell assays performed should effectively be replicating the results of the dependencies on which BEACON is based (DepMap), so why do you get different results? Is it because of the different methods used, i.e., shRNA (not seeing the correlation between expression and dependency) vs CRISPR (replicating the correlation)? If you look at older DepMap scores from when they used knockdown rather than CRISPR, can you replicate your results?
Although mycoplasma testing was done, were the cell lines re-authenticated by STR profiling at any point?
QPCR is mentioned in the methods but not provided in the results that I can find. Did this validate gene knockdown by shRNA? Is there any correlation between % KD and the proliferation/colony forming effect?
In the discussion it should be acknowledged that cancer subtypes exist within lineages that are molecularly and clinically distinct, and so the method might be missing targets specific for these, e.g., ER+ and ER- breast cancer.
Minor comments:
Results para 1 "especially in multiple lineages where the number of cell lines." Missing something in this sentence?
Needs some grammar review
Please italicise all gene names (when referring to the gene, not the protein), e.g., CCNE1 amplification.
Fig S5A - legend or axis labels for N and T needed.
Fig S5C, D - these are proliferation not colony forming assays as stated in the text.
Please include number of replicates and type of error bars in figure legends for cell assays
Reviewer 2
* The authors introduce BEACON, a Bayesian correlation approach designed to identify expression-driven dependency in cancer. Their hypothesis suggests that cancer cells with elevated expression of specific genes demonstrate increased vulnerability to the knockout of those same genes, thereby unveiling a promising new category of targets in precision oncology, particularly valuable for targeting cancer cells lacking druggable mutations.
* BEACON models expression levels and dependency scores as bivariate Gaussians and employs Markov Chain Monte Carlo (MCMC) sampling to estimate the correlation coefficient between them. They then compute p-values followed by rigorous multiple testing correction (BH based FDR correction).
* A notable strength of their approach lies in the integration of mass spectrometry proteomics data alongside transcriptomic and perturbation screening data, enhancing the robustness of their findings.
* Their work highlights some key insights:
- Gene expression-driven dependency (GED) candidates identified across lineages demonstrate enrichment for "DNA-binding transcription activator activity" and "DNA-binding transcription activator activity, RNA polymerase II-specific" pathways.
- The analysis successfully identifies compelling candidates with robust signals in both GED and PED (FERMT2, GRHL2, KLF5, CDK6, and CCND1), which are well-supported by existing drug evidence or established literature
- Clustering analyses reveal that cancer cells from pancreas and biliary tract tissues, as well as kidney and urinary tract tissue lineages, exhibit remarkably similar expression-driven dependency profiles. Additionally, lineage-specific genes such as transcription factors, cluster together in a manner consistent with existing literature
- Through Fisher's exact test, the authors demonstrate significant enrichments of druggable gene lists from DrugBank with expression-driven dependency patterns at both proteomic and transcriptomic levels
- Experimental validation shows that PAX5 is essential for PAX5-high B cell lymphoma cell growth, while TP63 and GRHL2 are essential for LSCC cell growth.
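The Fisher's exact test enrichment mentioned in the list above can be sketched as a 2x2 test in base R. The counts below are hypothetical and chosen only to show the layout of the table; they are not the DrugBank overlaps reported in the manuscript.

```r
# Hypothetical counts (illustration only, not taken from the manuscript):
# rows = expression-driven dependency hit or not, columns = druggable or not.
tab <- matrix(
  c(40,  160,
    400, 9400),
  nrow = 2, byrow = TRUE,
  dimnames = list(dependency = c("hit", "not_hit"),
                  druggable  = c("yes", "no"))
)

# One-sided test for over-representation of druggable genes among hits.
res <- fisher.test(tab, alternative = "greater")

print(res$estimate)  # conditional maximum-likelihood estimate of the odds ratio
print(res$p.value)
```

The choice of background (all genes screened versus all genes with expression data) changes the bottom row of the table and therefore the fold enrichment, so it is worth stating explicitly.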
However, I have several principal concerns about the study that should be addressed to demonstrate the robust and superior performance of this proposed approach.
Major Comments:
1. Quantitative benchmarking: While the authors present a valuable contribution, the concept of correlating gene dependency scores to expression has been explored previously through approaches like Project DRIVE (E. Robert McDonald, III et al.) and APSIC (Montazeri et al.). BEACON demonstrates strong correlations across multiple lineages, representing broader scope compared to existing methods that appear more lineage-restricted. However, establishing BEACON's comparative advantages requires more rigorous evaluation. Notably, Project DRIVE, a foundational paper in this field, already identified several BEACON candidates in their "Expression Correlation Analysis Identifies Oncogenes and Lineage-Specific Transcription Factors" section, while APSIC characterized many lineage-specific discoveries as tumor effector genes. BEACON's strength lies in integrating proteomic data with transcriptomic and perturbation screens, enabling identification of additional candidates like PAX5 for hematopoietic and lymphoid tissue. To demonstrate the method's impact, I recommend systematic quantitative benchmarking against existing approaches.
Importantly, BEACON utilizes richer/complementary datasets than previous studies. Disentangling contributions of data richness versus methodological innovation would provide valuable insights into whether enhanced performance stems from improved data availability or genuine method improvements.
Overall for benchmarking, the authors are strongly encouraged to utilize any comprehensive datasets that best demonstrate their method's competitive advantage and are not limited to the specific comparisons recommended above.
2. Correlation method comparisons: Figure S2 shows that BEACON exhibits higher MSE at extremes, and the claimed advantage over Pearson for small sample sizes is difficult to quantify from the current visualization. While the theoretical expectation that BEACON should outperform Pearson in small samples is reasonable, the practical significance remains unclear from these simulations. I recommend demonstrating BEACON's advantage using real data by creating a curated list of established GEDs/PEDs and comparing performance between the two methods. This is particularly important since several of BEACON's hits were previously reported by Project DRIVE using simple Pearson correlations. Alternatively, if BEACON's advantage is indeed significant, please elaborate on the simulation results to better justify this claim with clearer quantitative metrics.
3. Validation experiments: I'm seeking clarification on the validation experiments for TP63 and GRHL2. These candidates were not sensitive to predicted dependency and the authors say that "pan-lineage targets may represent universal vulnerability and their inhibition may lead to undesired off-target effects on other cells". Are the authors positioning them as weaker candidates to illustrate the superiority of lineage-specific predictions like PAX5? Additionally, why were different experimental approaches used (CRISPR for PAX5 versus shRNA for TP63 and GRHL2)? For a method aimed at identifying druggable targets, would drug-based experiments be more relevant than knockdown approaches to better demonstrate clinical applicability?
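For the small-sample comparison raised in comment 2, a clearer quantitative metric could be a table of mean squared error by sample size. The sketch below computes only the Pearson baseline on simulated bivariate Gaussian data; it does not implement BEACON, and the true correlation, sample sizes, and number of replicates are hypothetical choices.

```r
set.seed(7)
true_rho <- -0.5

# Mean squared error of the Pearson estimate of rho at a given sample size,
# estimated over n_rep simulated bivariate Gaussian data sets.
mse_pearson <- function(n, n_rep = 2000) {
  est <- replicate(n_rep, {
    x <- rnorm(n)
    y <- true_rho * x + sqrt(1 - true_rho^2) * rnorm(n)
    cor(x, y)
  })
  mean((est - true_rho)^2)
}

sizes <- c(5, 20, 100)
mse <- sapply(sizes, mse_pearson)
names(mse) <- sizes
print(mse)
```

Running a Bayesian estimator on the same simulated data sets and adding its MSE as a second row would give a direct per-sample-size comparison.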
Minor comments
- In Figure 4A, the caption refers to the plot as a heatmap, but the visualization appears to be a scatterplot. Please clarify whether the heatmap is missing or modify the caption appropriately. Additionally, I recommend using a different shade of green, as the current color choice makes some gene names difficult to read.
- In Fig S5A, please add a legend for tumor and normal
- For the TP63 and GRHL2 validation experiments, please include results for all four cell lines. The current manuscript is missing HCC15-shTP63, HCC15-shGRHL2, and HARA-shGRHL2 plots.
- How many replicates were the experiments performed on? Is it N= 3 for all experiments?
- Missing some text here - "BEACON offers the unique advantage of utilizing prior distributions that are less susceptible to outliers, especially in multiple lineages where the number of cell lines."
Reviewer 1
The authors present BEACON, a method for identifying associations between the expression of a gene and sensitivity to the CRISPR knockout of that gene across a panel of cancer cell lines. These 'oncogene like' dependencies represent potential therapeutic targets that might be exploited for the development of new precision medicines in cancer. The issue that BEACON aims to address is the limited sample size (cell line count) in some specific cancer lineages and experimental noise that might result in spurious correlations between expression and CRISPR sensitivity. The authors demonstrate, using a modelling approach, that BEACON is more reliable for estimating correlation than simple Pearson's correlation when there is high-noise in the measurements. The majority of the manuscript focuses on analyses of dependencies systematically identified using the BEACON approach and their enrichment in drug targets and biological pathways. There is some experimental testing of three potential expression driven dependencies presented. The rationale for the overall approach and analyses are clear.
Major comments
Previous efforts have systematically associated gene/protein expression with CRISPR sensitivity across the same or related datasets (e.g. Pacini et al, Cancer Cell 2024 and Rohde et al, Molecular Systems Biology 2025 using CRISPR; McDonald et al, Cell 2017 and Tsherniak et al, Cell 2017 using RNAi) and so the primary contribution of this paper can be considered the development of the BEACON method. It is thus somewhat surprising that there is no real assessment of the improvements offered by BEACON when compared to simpler methods (Pearson correlation, Spearman correlation) or more complex recent approaches (Rohde et al's BACON approach). The modelling approach suggests some improvements in specific circumstances (especially high noise) but it is not clear that this leads to improved dependency identification in the real data. Does BEACON identify known oncogene addictions better than these methods? Are the associations identified more reproducible (e.g. across alternative CRISPR screens or RNAi screens)?
The experimental validation and the conclusions drawn from it are somewhat confusing. The authors assess three potential expression associated dependencies - two pan-cancer dependencies (GRHL2 and TP63) and one lineage specific dependency (PAX5 in myeloid cells). Only the lineage-specific dependency validated in the way that might be expected, with higher expression associated with increased dependency, leading the authors to conclude that lineage-specific dependencies may be more suitable targets than pan-cancer ones. Given the numbers analysed (3 genes) this suggestion is not well supported. Moreover, the perturbation was performed using distinct approaches - CRISPR for PAX5 and shRNA for the other two genes - and only the knockdown of PAX5 was validated by Western blot. It is very hard to know which phenotypes might be false positives from off-target shRNA effects or false negatives from variable shRNA knockdown of the target. The results in S5C suggest that the two shRNAs for each gene cause somewhat discordant phenotypes, suggesting there may be some issues with knockdown efficiency. This could potentially be addressed by adding additional shRNAs for GRHL2 / TP63 or testing them using CRISPR perturbation as was done for PAX5. Validation of the knockdown of the intended target could also shed some light here. The manuscript also mentions experiments in an additional cell line (HCC15) but I cannot see these results presented in the main figures or supplement. It would be useful if all results for these two genes were presented in a single figure, with high and low expressing cell lines clearly marked.
Minor:
Previous work has established that in some cases lower expression of a gene can make cells more vulnerable to its perturbation (CYCLOPS genes, Nijhawan et al, Cell 2012). While these are not the focus of this manuscript, it would be useful for the authors to comment on the utility of BEACON for their identification.
p14 "Moreover, GED/PED targets were depleted of genes that were Essential In Culture" - it's not clear what this means or where the data comes from. By definition, the genes analysed are at least somewhat essential in culture.
