Multiple instance learning with spatial transcriptomics for interpretable patient-level predictions: application in glioblastoma

Abstract

Accurate prediction of patient outcomes remains a major challenge in oncology. While recent machine learning (ML) approaches often rely on bulk omics lacking spatial resolution or histology-based multiple instance learning (MIL), spatial transcriptomics (SpT) provides a unique opportunity to capture both molecular content and tissue architecture. However, no generalizable ML framework has yet been established to exploit SpT for patient-level outcomes. We present SpaMIL, a flexible and interpretable MIL framework designed for SpT, with a distillation strategy that enables deployment on hematoxylin and eosin (H&E) slides alone. We evaluate the framework by predicting survival in glioblastoma (GBM) patients, a clinically compelling setting given the disease's aggressiveness, with a median survival of only 15 months, and the lack of prognostic clinical variables. We analyzed 76 GBM cases from the MOSAIC dataset: 43 with matched SpT, H&E, single-nucleus RNA-seq (snRNA-seq), bulk RNA-seq, and clinical variables, and 33 with H&E for external validation. We developed two main architectures: abMIL, tailored to SpT’s spatial molecular structure, and MabMIL, which distills SpT-derived representations into H&E. Model interpretability was achieved through a Shapley-based framework linking prognostic predictions to cell-type compositions via SpT deconvolution. In benchmarking across the five GBM MOSAIC modalities, SpT-based abMIL achieved unprecedented prognostic accuracy (median C-index: 0.72, standard deviation: 0.04), outperforming all other modalities, including established clinical predictors. PCA and deconvolution-based SpT representations surpassed recent foundation models, suggesting the need for further research on SpT foundation models. Our interpretability analysis highlighted malignant and non-malignant cell subpopulations associated with favorable or poor prognosis, consistent with recent reports. Finally, MabMIL maintained strong performance while enabling H&E-only deployment, with an improved concordance index over H&E-only baselines in both internal (0.59 vs. 0.57) and external (0.62 vs. 0.55) cohorts.
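The results above are reported as concordance index (C-index) values. As a reading aid, the following is a minimal sketch of Harrell's C-index for right-censored survival data. It is not the authors' evaluation code: the function name `concordance_index`, its arguments, and the toy numbers are illustrative assumptions, and pairs with tied follow-up times are simply skipped for brevity.

```python
from itertools import combinations


def concordance_index(times, events, risks):
    """Harrell's C-index (illustrative sketch, not the paper's implementation).

    times  : observed follow-up times
    events : 1 if death was observed, 0 if the patient was censored
    risks  : predicted risk scores (higher = shorter expected survival)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` is the patient with the shorter follow-up.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only if the times differ and the
        # shorter-followed patient actually had an event.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0  # model ranks the earlier death as higher risk
        elif risks[a] == risks[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical): four patients, the third one censored at t=15.
toy_times = [5, 10, 15, 20]
toy_events = [1, 1, 0, 1]
toy_risks = [0.9, 0.4, 0.6, 0.1]
print(concordance_index(toy_times, toy_events, toy_risks))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which gives a scale for reading the reported figures such as 0.72 for SpT-based abMIL or 0.62 vs. 0.55 on the external cohort.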
