Genomic Characterization of Lung Cancer in Never-Smokers Using Deep Learning

Abstract

Despite promising results in using deep learning to infer genetic features from histological whole-slide images (WSIs), no prior studies have specifically applied these methods to lung adenocarcinomas from subjects who have never smoked tobacco (NS-LUAD), a molecularly and histologically distinct subset of lung cancer. Existing models have focused on LUAD from predominantly smoker populations, with limited molecular scope and variable performance. Here, we propose a customized deep convolutional neural network based on the ResNet50 architecture, optimized for multilabel classification of NS-LUAD, enabling simultaneous prediction of 16 molecular alterations from a single H&E-stained WSI. Key architectural modifications included a simplified two-layer residual block without bottleneck layers, selective shortcut connections, and a sigmoid-based classification head for independent prediction of each alteration, designed to reduce computational complexity while maintaining predictive accuracy. The model was trained and evaluated on 495 WSIs from the Sherlock-Lung study (70% for training, with a 10% internal test set for 10-fold cross-validation, and a 30% held-out validation set for final evaluation). On the held-out validation data, our model achieved high area under the receiver operating characteristic curve (AUROC) values of 0.84-0.93 for detecting 11 features: EGFR, KRAS, TP53, and RBM10 mutations, MDM2 amplification, kataegis, CDKN2A deletion, ALK fusion, whole-genome doubling, and EGFR hotspot mutations (p.L858R and p.E746_A750del). Performance was low to moderate for tumor mutational burden (AUROC=0.67), APOBEC mutational signature (AUROC=0.57), and KRAS hotspot mutations (p.G12C: AUROC=0.74, p.G12V: AUROC=0.55, p.G12D: AUROC=0.43). Compared to results from established architectures such as Inception-v3 on the same WSIs, our model demonstrated significantly improved performance for most features. With further optimization, our model could support triaging for molecular testing and inform precision treatment strategies for NS-LUAD patients.
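
As an illustration of what a sigmoid-based multilabel head and per-feature AUROC evaluation involve, the following is a minimal sketch in plain Python. It is not the authors' implementation: the function names, the two example labels, and all logits and ground-truth values are hypothetical toy inputs chosen only to show the mechanics. Each alteration gets its own sigmoid output, so the probabilities are independent and need not sum to one, and AUROC is computed separately per alteration.

```python
import math


def sigmoid(z):
    """Logistic function mapping a real-valued logit to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def multilabel_probs(logits):
    """One independent probability per alteration (no softmax coupling)."""
    return [sigmoid(z) for z in logits]


def auroc(labels, scores):
    """Rank-based AUROC: fraction of (positive, negative) pairs in which
    the positive sample scores higher; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def per_label_auroc(y_true, y_prob, names):
    """Compute AUROC column by column, one value per alteration."""
    return {
        name: auroc([row[j] for row in y_true], [row[j] for row in y_prob])
        for j, name in enumerate(names)
    }


# Hypothetical toy data: 4 slides, 2 alterations (not values from the study).
names = ["EGFR", "KRAS"]
toy_logits = [[2.0, -1.0], [0.5, 0.3], [-1.5, 1.2], [-0.2, 0.8]]
toy_truth = [[1, 0], [1, 1], [0, 1], [0, 0]]

toy_probs = [multilabel_probs(row) for row in toy_logits]
results = per_label_auroc(toy_truth, toy_probs, names)
```

On these toy inputs the EGFR column is perfectly ranked while the KRAS column has one misordered pair, which mirrors how a single model can perform well on some alterations and poorly on others.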
