Extending differential gene expression testing to handle genome aneuploidy in cancer
Abstract
Genome aneuploidy, characterized by copy number variations (CNVs), profoundly alters gene expression in cancer through direct gene dosage effects and indirect compensatory regulatory mechanisms. However, existing differential gene expression (DGE) testing methods do not differentiate between these mechanisms. They conflate all expression changes, which limits biological interpretability and obscures key genes involved in tumor progression.
To address this, we developed DeConveil, a computational framework that extends traditional DGE analysis by integrating CNV data. Using a generalized linear model with a negative binomial distribution, DeConveil models RNA-seq expression counts while accounting for copy number gene dosage effects. We propose a more fine-grained decomposition of genes into dosage-sensitive genes (DSGs), dosage-insensitive genes (DIGs), and dosage-compensated genes (DCGs), which explicitly decouples changes due to CNVs from bona fide changes in transcriptional regulation. Analysis of TCGA datasets from aneuploid solid cancers resulted in notable reclassification of genes, refining and expanding upon the results from conventional methods. Functional enrichment analysis identified distinct biological roles for DSGs, DIGs, and DCGs in tumor progression, immune regulation, and cell adhesion. In a breast cancer case study, DeConveil’s CN-aware analysis facilitated the identification of both known and novel prognostic biomarkers, including long non-coding RNAs, linking gene expression signatures to survival outcomes. Using these biomarkers for each gene group significantly improved patient risk stratification, yielding more accurate predictions than conventional methods.
These results highlight DeConveil’s ability to disentangle CNV-driven from regulatory transcriptional changes, enhancing gene classification and biomarker discovery. By improving transcriptomic analysis, DeConveil provides a powerful tool for cancer research and precision oncology, with potential applications in therapeutic target identification.
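To make the idea of a CN-aware test concrete, the following is a minimal sketch and not DeConveil's implementation. It assumes, purely for illustration, that copy number enters a negative binomial log-link model as a per-sample offset log(CN/2), that normal samples are diploid, that library sizes are already comparable, that the dispersion `alpha` is known, and that every tumor copy number is greater than zero. The function names are hypothetical. With a two-group design, each group's log mean can be fitted separately by Fisher scoring.

```python
import math

def fit_group_log_mean(counts, offsets, alpha=0.1, n_iter=50, tol=1e-10):
    """Fit eta in mu_i = offsets[i] * exp(eta) for one sample group.

    Uses Fisher scoring on the negative binomial likelihood with a fixed,
    assumed dispersion alpha (variance = mu + alpha * mu**2).
    """
    # Poisson maximum-likelihood estimate as the starting value.
    eta = math.log(sum(counts) / sum(offsets))
    for _ in range(n_iter):
        mu = [o * math.exp(eta) for o in offsets]
        # Score and expected information of the NB log-likelihood in eta.
        score = sum((y - m) / (1 + alpha * m) for y, m in zip(counts, mu))
        info = sum(m / (1 + alpha * m) for m in mu)
        step = score / info
        eta += step
        if abs(step) < tol:
            break
    return eta

def log2_fold_change(normal, tumor, cn_tumor=None, alpha=0.1):
    """Tumor-vs-normal log2 fold change for one gene.

    If cn_tumor is given, each tumor sample gets the offset CN/2
    (an assumed dosage model); otherwise the estimate is CN-naive.
    """
    normal_offsets = [1.0] * len(normal)  # assumption: diploid normals
    if cn_tumor is None:
        tumor_offsets = [1.0] * len(tumor)
    else:
        tumor_offsets = [cn / 2.0 for cn in cn_tumor]
    eta_normal = fit_group_log_mean(normal, normal_offsets, alpha)
    eta_tumor = fit_group_log_mean(tumor, tumor_offsets, alpha)
    return (eta_tumor - eta_normal) / math.log(2)

# Toy gene: expression doubles in tumors that carry four copies.
normal = [90, 100, 110]
tumor = [180, 200, 220]
cn_tumor = [4, 4, 4]

naive_lfc = log2_fold_change(normal, tumor)
aware_lfc = log2_fold_change(normal, tumor, cn_tumor)
print(f"CN-naive log2FC: {naive_lfc:.3f}, CN-aware log2FC: {aware_lfc:.3f}")
```

In this toy gene the CN-naive estimate reports a two-fold increase, while the CN-aware estimate is close to zero, which is the pattern one would expect when the change is explained by gene dosage alone. How DeConveil actually parameterizes dosage, estimates dispersion, and assigns genes to DSGs, DIGs, and DCGs is not specified in this abstract.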
Author Summary
Identifying genes whose expression changes in cancer is fundamental to understanding disease aetiology and proposing therapeutic targets. However, alterations to the copy number of genes, due to amplification or deletion events, can be a significant confounder in differential expression quantification. Here we propose a simple model to correct for this confounder, yielding a finer characterization of coordinated changes in gene expression and copy number. We show on several data sets that this new characterization has prognostic value and sheds light on gene regulation in cancer.