End-to-end prediction of clinical outcomes in head and neck squamous cell carcinoma with foundation model-based multiple instance learning
Abstract
Background
Foundation models (FMs) show promise in medical AI by learning flexible features from large datasets, potentially surpassing handcrafted radiomics. Outcome prediction of head and neck squamous cell carcinomas (HNSCC) with FMs using routine imaging remains unexplored.
Purpose
To evaluate end-to-end FM-based multiple instance learning (MIL) for 2-year overall survival (OS), locoregional control (LRC), and freedom from distant metastasis (FFDM) prediction and risk group stratification using pretreatment CT scans in HNSCC.
Materials and Methods
We analyzed data from 2485 patients in three retrospective HNSCC cohorts (RADCURE, HN1, HN-PET-CT), treated between 2004 and 2017, with available pretreatment CTs and primary gross tumor volume (GTVp) segmentations. The RADCURE cohort was split into training (n=1464) and test (n=606) sets, with HN1 (n=131) and HN-PET-CT (n=284) as additional test cohorts. FM-based MIL models (2D, multiview and 3D) for 2-year endpoint prediction and risk stratification were evaluated using the area under the receiver operating characteristic curve (AUROC) and Kaplan-Meier (KM) analysis with hazard ratios (HR), compared with radiomics, and assessed for multimodal enhancement of clinical baselines.
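To make the MIL idea concrete, the following is a minimal sketch of how per-slice FM embeddings could be pooled into a single patient-level risk score. It assumes attention-based pooling, which is one common MIL aggregator; the abstract does not state which aggregator, foundation model, or embedding size was used. All names (`attention_mil_pool`, `predict_risk`, `V`, `w`, `beta`) and the random inputs are hypothetical placeholders, not the authors' implementation.

```python
import numpy as np


def attention_mil_pool(instances, V, w):
    """Pool instance embeddings (n_instances, d) into one bag embedding (d,).

    Assumption: tanh attention scoring followed by a softmax over instances.
    """
    scores = np.tanh(instances @ V) @ w          # one score per instance
    scores = scores - scores.max()               # stabilise the softmax
    weights = np.exp(scores) / np.exp(scores).sum()
    return weights @ instances, weights


def predict_risk(bag_embedding, beta, bias=0.0):
    """Logistic head mapping the bag embedding to a 2-year event probability."""
    return 1.0 / (1.0 + np.exp(-(bag_embedding @ beta + bias)))


# Hypothetical bag: 12 CT slices through the GTVp, each a 16-dim embedding.
rng = np.random.default_rng(0)
n_slices, d, h = 12, 16, 8
instances = rng.normal(size=(n_slices, d))       # stand-in for FM features
V = rng.normal(size=(d, h))                      # untrained attention weights
w = rng.normal(size=h)
beta = rng.normal(size=d)                        # untrained classifier weights

bag, weights = attention_mil_pool(instances, V, w)
risk = predict_risk(bag, beta)
```

In an end-to-end setup these parameters would be learned jointly from the outcome labels; here they are random, so `risk` carries no clinical meaning. The attention `weights` indicate how much each slice contributes to the patient-level prediction.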
Results
2D MIL models achieved 2-year test AUROCs of 0.75-0.84 (OS), 0.66-0.75 (LRC) and 0.71-0.78 (FFDM), outperforming multiview and 3D MIL (AUROCs: 0.50-0.77, p≥0.15), and were comparable or superior to radiomics (AUROCs: 0.64-0.74, p≥0.012). Significant risk stratification was observed (HRs: 2.14-4.77, p≤0.039). Multimodal enhancement of 2-year OS/FFDM prediction (AUROCs: 0.82-0.87, p≤0.018) was observed for patients without human papillomavirus-positive (HPV+) tumors.
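For readers less familiar with the metrics reported above, here is a small sketch of how an AUROC can be computed from predicted risks and how patients might be split into risk groups. The AUROC is calculated as the probability that a randomly chosen patient with the event receives a higher score than one without (ties count half). The median-based cutoff in `stratify` is an assumption for illustration; the abstract does not say how the stratification threshold was chosen. The labels and scores are toy values, not study data.

```python
import numpy as np


def auroc(labels, scores):
    """AUROC via pairwise comparison of event vs. non-event scores."""
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    pos, neg = scores[labels], scores[~labels]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(pos) * len(neg))


def stratify(test_scores, train_scores):
    """Flag high-risk patients using the training-set median as cutoff.

    Assumption: the cutoff is fixed on training data and applied unchanged
    to test cohorts.
    """
    cutoff = np.median(train_scores)
    return np.asarray(test_scores) >= cutoff


# Toy example: 1 = event within 2 years, 0 = no event.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = auroc(labels, scores)                      # 3 of 4 pairs ranked correctly

high_risk = stratify(scores, train_scores=[0.2, 0.3, 0.5, 0.6])
```

The resulting `high_risk` mask would then define the two groups compared with Kaplan-Meier curves and a hazard ratio, which requires time-to-event data not modelled in this sketch.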
Conclusion
FM-based MIL demonstrates promise in HNSCC risk prediction, showing similar or superior performance to radiomics and enhancing clinical baselines in non-HPV+ patients.
Key Results
- First end-to-end study using both foundation models and multiple instance learning for outcome prediction in head and neck squamous cell carcinoma.
- Multiple instance learning approaches predict clinically relevant 2-year endpoints and stratify patients across external cohorts with similar or better performance than handcrafted radiomics.
- Multimodal inclusion of clinical and multiple instance learning information improves clinical baseline models in patients without human papillomavirus-positive tumors.