MRAN: A Reconstructive Attention Network for Handling Modality Sparsity in Multimodal Cancer Survival Analysis

Abstract

Survival prediction in clear cell renal cell carcinoma (ccRCC) is challenging due to severe class imbalance and pervasive radiological data sparsity in real-world cohorts. Standard multimodal fusion techniques often degrade when modalities are missing, typically overfitting to the majority survival class (∼88%) and failing to reliably flag high-risk patients. We introduce the Multimodal Reconstructive Attention Network (MRAN), which integrates whole-slide images (WSI), CT, and clinical/genomic variables via attention-based fusion coupled with an auxiliary feature-level reconstruction branch. This branch learns to generate CT embeddings from histopathology, enabling the model to exploit radiological signals even for patients without acquired scans. Ablation studies confirm that a tri-modal configuration (WSI + CT + clinical), which explicitly excludes sparse MRI data, provides the optimal signal-to-noise ratio. Evaluated on the MMISTccRCC dataset with 5-fold cross-validation and an independent holdout test set, MRAN achieves a C-index of 0.835 and a Balanced Accuracy of 83.5%. At a clinically calibrated operating point, the model attains a Sensitivity of 81.3% (identifying mortality) and a Specificity of 85.7% (identifying survivors), thereby overcoming the accuracy paradox and demonstrating robust utility for 12-month risk stratification in ccRCC.
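The abstract's core architectural idea is attention-based fusion of the three modalities plus an auxiliary branch that reconstructs CT embeddings from histopathology. The sketch below is a rough, hedged illustration of that idea in PyTorch: it projects WSI, CT, and clinical/genomic features into a shared space, attends over the three modality tokens, and trains a small head to regress the CT embedding from the WSI embedding so a surrogate token can stand in when no scan was acquired. All feature sizes, layer choices, module names, and the loss pairing are assumptions for illustration, not the authors' published implementation.

```python
# Minimal sketch of attention-based tri-modal fusion with an auxiliary CT-embedding
# reconstruction branch, loosely following the abstract. Dimensions and layer choices
# are illustrative assumptions, not the published MRAN model.
import torch
import torch.nn as nn


class ReconstructiveAttentionFusion(nn.Module):
    def __init__(self, wsi_dim=1024, ct_dim=512, clin_dim=64, dim=256, n_heads=4):
        super().__init__()
        # Project each modality into a shared embedding space.
        self.proj_wsi = nn.Linear(wsi_dim, dim)
        self.proj_ct = nn.Linear(ct_dim, dim)
        self.proj_clin = nn.Linear(clin_dim, dim)
        # Self-attention over the three modality tokens.
        self.attn = nn.MultiheadAttention(dim, n_heads, batch_first=True)
        # Auxiliary branch: predict the CT embedding from histopathology so a
        # surrogate token is available for patients without an acquired scan.
        self.ct_from_wsi = nn.Sequential(
            nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim)
        )
        self.risk_head = nn.Linear(dim, 1)

    def forward(self, wsi, clin, ct=None):
        z_wsi = self.proj_wsi(wsi)
        z_clin = self.proj_clin(clin)
        z_ct_hat = self.ct_from_wsi(z_wsi)  # reconstructed CT embedding
        if ct is not None:
            z_ct = self.proj_ct(ct)
            # Reconstruction target only exists for patients with a real scan.
            recon_loss = nn.functional.mse_loss(z_ct_hat, z_ct.detach())
        else:
            z_ct, recon_loss = z_ct_hat, None  # fall back to the surrogate token
        tokens = torch.stack([z_wsi, z_ct, z_clin], dim=1)  # (batch, 3, dim)
        fused, _ = self.attn(tokens, tokens, tokens)
        risk = self.risk_head(fused.mean(dim=1)).squeeze(-1)
        return risk, recon_loss


if __name__ == "__main__":
    model = ReconstructiveAttentionFusion()
    wsi, clin, ct = torch.randn(2, 1024), torch.randn(2, 64), torch.randn(2, 512)
    risk, recon = model(wsi, clin, ct)            # patient with CT available
    risk_missing, _ = model(wsi, clin, ct=None)   # CT missing: surrogate embedding used
    print(risk.shape, recon.item(), risk_missing.shape)
```

In a setup like this, the reconstruction loss would typically be added to the survival loss with a small weight during training, while at inference the reconstructed embedding simply substitutes for the missing CT modality.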
