Tumor cell specific total mRNA expression informed neural networks predicts cancer progression
Abstract
Inferring tumor molecular phenotypes from high-dimensional multi-omic data is a fundamental challenge in computational biology. Current methods for estimating tumor cell–specific total mRNA expression (TmS) require matched DNA and RNA sequencing data and rely on computationally intensive deconvolution pipelines. We present TmSNet, a deep learning framework that predicts TmS using mRNA, DNA methylation, miRNA, and immune cell proportions as input features. TmSNet integrates structured feature selection (gradient boosting, LASSO, elastic net) with specialized neural architectures to predict continuous TmS. Across 12 TCGA cancer types, TmSNet achieved cross-validated performance of up to 0.93 in concordance correlation coefficient (CCC) and 0.88 in R², and generalized to external cohorts with correlations of 0.54 (SCAN-B) and 0.43 (FUSCC). Predicted TmS values effectively stratify patients by risk and preserve known transcriptional profiles across tumor subtypes. These results demonstrate that TmSNet can infer biologically meaningful phenotypes from multi-omic data and provide a scalable framework for modeling tumor transcriptional activity in heterogeneous cohorts.
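Because the abstract reports agreement as a concordance correlation coefficient, a minimal sketch of that metric may help readers interpret the numbers. The abstract does not say how CCC was computed; the code below assumes Lin's standard definition with population (1/n) moments, and the function name `concordance_correlation` and the toy values are hypothetical, not taken from the study.

```python
from typing import Sequence


def concordance_correlation(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Lin's concordance correlation coefficient (assumed definition, 1/n moments).

    CCC = 2 * cov(y_true, y_pred) / (var(y_true) + var(y_pred) + (mean_true - mean_pred)^2)
    """
    n = len(y_true)
    if n != len(y_pred) or n < 2:
        raise ValueError("inputs must have the same length (at least 2)")

    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n

    # Population variances and covariance (divide by n, not n - 1).
    var_t = sum((t - mean_t) ** 2 for t in y_true) / n
    var_p = sum((p - mean_p) ** 2 for p in y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred)) / n

    denom = var_t + var_p + (mean_t - mean_p) ** 2
    if denom == 0:
        raise ValueError("CCC is undefined when both inputs are the same constant")
    return 2 * cov / denom


# Toy values for illustration only (not data from the study).
observed = [1.0, 2.0, 3.0]
shifted = [2.0, 3.0, 4.0]  # perfectly correlated with observed, but offset by +1

ccc_identical = concordance_correlation(observed, observed)
ccc_shifted = concordance_correlation(observed, shifted)
```

Unlike a plain correlation, CCC penalizes systematic offsets: the shifted predictions above are perfectly linearly related to the observed values, yet their CCC falls to 4/7 because the means differ. This is why a CCC and a correlation reported on the same predictions need not match.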