Multiple instance learning with spatial transcriptomics for interpretable patient-level predictions: application in glioblastoma

Simon Grouard
Christian Esposito
Jean El Khoury
Valérie Ducret
Céline Thiriez
Loïc Herpin
Anaïs Chossegros
Caroline Hoffmann
Quentin Bayard
Genevieve Robin
Nicole Tay
Esther Baena
MOSAIC Consortium
Eric Durand
Almudena Espin Perez
Lucas Fidon

Abstract

Accurate prediction of patient outcomes remains a major challenge in oncology. Recent machine learning (ML) approaches often rely either on bulk omics, which lack spatial resolution, or on histology-based multiple instance learning (MIL), whereas spatial transcriptomics (SpT) provides a unique opportunity to capture both molecular content and tissue architecture. However, no generalizable ML framework has yet been established to exploit SpT for patient-level outcomes. We present SpaMIL, a flexible and interpretable MIL framework designed for SpT, with a distillation strategy that enables deployment on hematoxylin and eosin (H&E) slides alone. We evaluated the framework by predicting survival in glioblastoma (GBM) patients, a clinically compelling setting given the disease's aggressiveness (median survival of only 15 months) and the lack of prognostic clinical variables. We analyzed 76 GBM cases from the MOSAIC dataset: 43 with matched SpT, H&E, single-nucleus RNA-seq (scRNA-seq), bulk RNA-seq, and clinical variables, and 33 with H&E only, used for external validation. We developed two main architectures: abMIL, tailored to SpT's spatial molecular structure, and MabMIL, which distills SpT-derived representations into H&E. Model interpretability was achieved through a Shapley-based framework linking prognostic predictions to cell-type compositions via SpT deconvolution. In benchmarking across the five GBM MOSAIC modalities, SpT-based abMIL achieved unprecedented prognostic accuracy (median C-index: 0.72, standard deviation: 0.04), outperforming all other modalities, including established clinical predictors. PCA and deconvolution-based SpT representations surpassed recent foundation models, suggesting the need for further research on SpT foundation models. Our interpretability analysis highlighted malignant and non-malignant cell subpopulations associated with favorable or poor prognosis, consistent with recent reports. Finally, MabMIL maintained strong performance while enabling H&E-only deployment, with an improved concordance index over H&E-only baselines in both internal (0.59 vs. 0.57) and external (0.62 vs. 0.55) cohorts.
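
To make the two central quantities in the abstract concrete, the following is a minimal sketch of generic attention-based MIL pooling (spot-level features aggregated into one patient-level embedding) and of Harrell's concordance index used to score survival predictions. It is not the SpaMIL implementation: the function names, the parameter shapes, the single-layer tanh attention, and the toy data are all assumptions made for illustration.

```python
import numpy as np


def attention_mil_pool(spots, V, w):
    """Generic attention-based MIL pooling (illustrative, not the authors' code).

    spots: (n_spots, d) per-spot features, e.g. PCA of SpT expression (assumed).
    V:     (d, h) hypothetical attention projection matrix.
    w:     (h,)   hypothetical attention scoring vector.
    Returns the (d,) patient-level embedding and the (n_spots,) attention weights.
    """
    scores = np.tanh(spots @ V) @ w          # one scalar score per spot
    scores = scores - scores.max()           # stabilise the softmax
    weights = np.exp(scores) / np.exp(scores).sum()
    return weights @ spots, weights          # attention-weighted average of spots


def concordance_index(times, events, risks):
    """Harrell's C-index.

    A pair (i, j) is comparable when patient i had an observed event and a
    strictly shorter survival time than patient j. The pair is concordant when
    the model assigns i the higher risk; ties in risk count as 0.5.
    """
    concordant, comparable = 0.0, 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy example with made-up numbers: 4 patients, one of them censored.
times = [5, 10, 15, 20]       # follow-up time (months)
events = [1, 1, 0, 1]         # 1 = death observed, 0 = censored
risks = [0.9, 0.3, 0.2, 0.4]  # predicted risk scores
print(concordance_index(times, events, risks))  # 0.8 (4 of 5 comparable pairs)

# Pool 6 random "spots" with 4 features into one patient embedding.
rng = np.random.default_rng(0)
spots = rng.normal(size=(6, 4))
embedding, attn = attention_mil_pool(
    spots, rng.normal(size=(4, 3)), rng.normal(size=3)
)
```

A C-index of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the reported values (0.72 for SpT-based abMIL, 0.62 vs. 0.55 on the external cohort) should be read. The attention weights are the kind of per-spot quantity that an interpretability analysis could inspect, although the abstract's Shapley-based framework is a different technique and is not shown here.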

Version published to bioRxiv (DOI: 10.1101/2025.10.13.682206) on Oct 15, 2025

Related articles

Inferring Clinically Relevant Molecular Subtypes of Pancreatic Cancer from Routine Histopathology Using Deep Learning

This article has 13 authors:
1. Abdul Rehman Akbar
2. Alejandro Levya
3. Ashwini Esnakula
4. Elshad Hasanov
5. Anne Noonan
6. Upender Manne
7. Vaibhav Sahai
8. Lingbin Meng
9. Susan Tsai
10. Anil Parwani
11. Wei Chen
12. Ashish Manne
13. Muhammad Khalid Khan Niazi
This article has no evaluations. Latest version: Jan 16, 2026

Predicting gene expression from whole slide images in prostate cancer using deep learning

This article has 14 authors:
1. Anxuan Han
2. Bo Li
3. Chui Yan Mah
4. Jessica Logan
5. Yanan Wang
6. Ning Liu
7. Feargal Ryan
8. David Lynn
9. Darren Foreman
10. John O’Leary
11. Douglas Brooks
12. Jose Polo
13. Lisa Butler
14. Fuyi Li
This article has no evaluations. Latest version: Feb 4, 2026

Deep Learning Paradigm for Precision Lung Cancer Therapy with AI-Driven Genotype-Phenotype Mining and Patient-Derived Organoid Validation

This article has 19 authors:
1. Zhongze Gu
2. Mingyue Li
3. Xiaoming Shi
4. Tianmu Hu
5. Juan Zhang
6. Ziliang Ye
7. Yuhan Cai
8. Qiwei Li
9. Linchong Liu
10. Wenlong Yu
11. Jiajia Jing
12. Qiuyin Zhang
13. Juanjuan Li
14. Xin Zhou
15. Nan Qiao
16. Jun Bao
17. Zaozao Chen
18. Lili Xu
19. Tao Wang
This article has no evaluations. Latest version: Dec 23, 2025
