A multimodal cross-attention pathotranscriptome integration for enhanced survival prediction of oral squamous cell carcinoma
Abstract
Oral squamous cell carcinoma (OSCC) accounts for a substantial share of cancer mortality, with survival outcomes highly dependent on early diagnosis. While many approaches have been proposed for OSCC survival prediction, they often rely on unimodal data, which may miss complementary prognostic signals. In this study, we introduce a unified cross-attention-based deep learning framework that integrates whole-slide histopathology images (WSIs) and transcriptomic data from OSCC patients for survival prediction. The framework employs an autoencoder for transcriptomic feature extraction and a state-of-the-art pathology foundation model—evaluated across five alternatives—to derive WSI embeddings. These embeddings are then fused via cross-attention and concatenation and fed to a Cox proportional hazards model. The multimodal approach outperformed nearly all unimodal counterparts, achieving a maximum concordance index of 0.780±0.059 with cross-attention and 0.766±0.050 with concatenation. The results indicate that pathotranscriptomic integration can improve survival prediction for OSCC patients. The implementation is available on GitHub at https://github.com/kountaydwivedi/multimodal fusion.git.
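To make the fusion-and-survival pipeline concrete, the sketch below shows the two core computations the abstract names: a cross-attention step in which one modality's embedding attends to the other's, and a Cox proportional hazards objective (negative partial log-likelihood) applied to the fused risk score. This is a minimal numpy illustration, not the authors' implementation: it uses single-head attention without learned query/key/value projections, a Breslow-style handling of risk sets, and hypothetical function names; the actual framework learns these components end-to-end with a pathology foundation model and an autoencoder.

```python
import numpy as np

def cross_attention(queries, keys_values):
    """Simplified cross-attention: each query row attends over the
    key/value rows of the other modality (no learned projections)."""
    d = queries.shape[1]
    scores = queries @ keys_values.T / np.sqrt(d)        # (n_q, n_kv)
    scores -= scores.max(axis=1, keepdims=True)           # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)         # softmax over keys
    return weights @ keys_values                          # attended features

def cox_neg_partial_log_likelihood(risk, time, event):
    """Breslow-style negative Cox partial log-likelihood.
    risk: predicted log-hazard per patient; event: 1 = death observed."""
    order = np.argsort(-time)                             # descending time
    risk, event = risk[order], event[order]
    log_cumsum = np.log(np.cumsum(np.exp(risk)))          # log of risk-set sums
    return -np.sum((risk - log_cumsum) * event) / max(event.sum(), 1)

# Toy fusion: a transcriptomic embedding attends over WSI patch embeddings,
# then the attended and original features are concatenated into a risk score.
rng = np.random.default_rng(0)
wsi_patches = rng.normal(size=(16, 8))    # 16 patch embeddings, dim 8
rna_embed = rng.normal(size=(1, 8))       # pooled transcriptomic embedding
attended = cross_attention(rna_embed, wsi_patches)
fused = np.concatenate([rna_embed, attended], axis=1)     # (1, 16)

w = rng.normal(size=(16,)) * 0.1          # stand-in for a learned linear head
risk_score = fused @ w                    # scalar log-hazard for this patient
```

In the full framework, `w` and the attention projections would be trained jointly by minimizing `cox_neg_partial_log_likelihood` over mini-batches of patients, and model quality would be reported as the concordance index, as in the abstract.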
This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible.