Multimodal Gene Expression Deep Learning for Predicting Sentinel Lymph Node Macro-metastasis in Early Breast Cancer: Development and Validation in the SCAN-B Cohort
Abstract
Purpose: This study evaluates deep learning (DL) models that use gene expression (GEX) and preoperatively available clinical data (PreopClinic) to predict sentinel lymph node macro-metastasis (SLNM), and explores their potential for guiding axillary surgery de-escalation and supporting prognostic assessment.

Experimental Design: We retrospectively included 6,836 patients from the Swedish SCAN-B cohort with clinically node-negative (cN0), T1-T2 invasive breast cancer who underwent primary surgery. Three DL models (a multilayer perceptron, a pathway-informed sparse neural network, and a transformer) were developed using the development set (n=4,625) and evaluated against XGBoost in the independent test set (n=2,211).

Results: The Transformer outperformed the other methods for GEX modeling and minimized the need for prior gene selection. In the independent test set, the combined PreopClinic+GEX model significantly improved SLNM prediction (ROC AUC 0.693, P<0.001) compared with the PreopClinic model alone, and better identified low-risk patients who might avoid unnecessary sentinel lymph node biopsy (SLNB) (reduction rate 27.2% at a sensitivity of 92.1%, P=0.02). Notably, across-subtype training outperformed within-subtype training, improving nodal prediction, especially in triple-negative breast cancer (TNBC; ROC AUC 0.734; 95% CI: 0.644-0.837), and achieving an SLNB reduction rate of 51.5% (95% CI: 43.2-59.9%). Importantly, the derived SLNM predictor showed prognostic significance (P=0.039) and complemented the established prognostic factors in the ER+HER2- patients recommended for SLNB under the 2025 ASCO guidelines.

Conclusions: These findings highlight the Transformer's robustness against noise and its effectiveness in capturing informative GEX features across scales. They suggest that integrating GEX data with PreopClinic variables could enable further axillary surgical de-escalation, including for patients with tumor characteristics not reflected in current ASCO recommendations.
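To make the "reduction rate at a given sensitivity" metric concrete, the following is a minimal sketch of one common way to compute it. It rests on an assumption not stated in the abstract: that the reduction rate is the fraction of all patients scored below the highest risk threshold that still detects the target share of true SLNM cases. The function name `slnb_reduction_at_sensitivity` and the toy data are hypothetical illustrations, not the study's code or data.

```python
import numpy as np


def slnb_reduction_at_sensitivity(y_true, y_score, target_sensitivity=0.90):
    """Return (threshold, achieved_sensitivity, reduction_rate).

    Assumption (not taken from the study): patients with a predicted risk
    below the threshold are treated as low-risk and counted as avoided SLNBs.
    """
    y_true = np.asarray(y_true, dtype=bool)
    y_score = np.asarray(y_score, dtype=float)
    if not y_true.any():
        raise ValueError("at least one positive (SLNM) case is required")

    # Scores of the true positives, sorted from lowest to highest risk.
    pos_scores = np.sort(y_score[y_true])
    n_pos = pos_scores.size

    # Largest number of positives that may be missed while still meeting
    # the target sensitivity (small epsilon guards against float rounding).
    max_missed = int(np.floor((1.0 - target_sensitivity) * n_pos + 1e-12))

    # Highest threshold that keeps sensitivity >= target.
    threshold = pos_scores[max_missed]
    high_risk = y_score >= threshold

    sensitivity = (high_risk & y_true).sum() / n_pos
    reduction_rate = (~high_risk).mean()
    return float(threshold), float(sensitivity), float(reduction_rate)


# Toy example with made-up labels and scores (10 patients, 4 with SLNM).
toy_labels = [0, 0, 0, 1, 0, 0, 1, 0, 1, 1]
toy_scores = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.60, 0.40, 0.80, 0.90]
thr, sens, red = slnb_reduction_at_sensitivity(toy_labels, toy_scores, 0.75)
print(f"threshold={thr:.2f} sensitivity={sens:.2f} reduction={red:.2f}")
```

Under this definition, raising the required sensitivity lowers the threshold and therefore shrinks the share of patients who could skip SLNB, which is why the two figures are reported together.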