Interpretable Deep Learning Reveals Biologically Relevant Spatial Gene Expression Patterns in Lung Tumors and their Microenvironment
Abstract
Lung adenocarcinoma (LUAD), the most common subtype of non–small cell lung cancer (NSCLC), exhibits profound histological and molecular heterogeneity, which hinders accurate prognosis and effective treatment. Current approaches to assessing this heterogeneity, such as histopathology, molecular profiling, and spatial transcriptomics, are constrained by high costs, long turnaround times, and limited tissue availability, making them difficult to deploy widely for prognosis. To address this gap, we developed XpressO-Lung, an explainable deep learning model that predicts the spatial heterogeneity of gene expression in the tumor and its microenvironment on hematoxylin and eosin (H&E)-stained diagnostic (Dx) whole-slide images (WSIs) by learning associations between tissue morphology and the corresponding bulk-transcriptomic data. Using 200 LUAD cases from The Cancer Genome Atlas (TCGA), XpressO-Lung predicted spatial expression patterns of the genes (biomarkers) NAPSA, SLC47A1, TP53I3, KLRB1, FAM189A1, TICAM1, CD8A, CXCL13, TTF, CDH3, KRT7, and CDKN2A on the respective Dx-WSIs, with AUCs of up to 0.92. More importantly, the predicted spatial gene expression patterns aligned with known morphologic interactions between the tumor and its microenvironment, capturing biological events directly on Dx-WSIs. By coupling predictive performance with spatial interpretability of gene expression on Dx-WSIs, XpressO-Lung bridges histopathology and bulk transcriptomics, enabling explainable morpho-genomic analyses that can advance biomarker discovery and offer prognostic insights to inform precision oncology in LUAD, especially in low-resource settings.