Analyzing Information Disparities across Modalities in Mortality Prediction
Abstract
Recent advances in deep learning have enabled the integration of heterogeneous data modalities for clinical prediction, allowing models to exploit complex information embedded within electronic health records (EHRs). Among these modalities, chest radiographs (CXRs) provide a rich source of visual information that can enhance outcome prediction for patients in the intensive care unit (ICU). However, the comparative impact of different CXR representations (raw images versus radiology reports) on predictive performance has not been systematically investigated. Such comparisons are essential for identifying the most informative modality and understanding how it complements other data sources. This study compares the predictive utility of raw CXRs versus radiology reports for 30-day post-discharge mortality prediction in ICU patients. We employed a Vision–Language Model (VLM) that combines each CXR representation with patient discharge notes. On a filtered subset of the MIMIC-IV dataset (n = 1,360), augmenting discharge notes with CXRs achieved the best performance (AUROC = 0.843), surpassing both the discharge-note-only (AUROC = 0.816) and radiology-report-augmented (AUROC = 0.804) models. Across experiments, combining raw CXRs with discharge notes consistently outperformed augmentation with radiology reports. A radiologist’s review further revealed that reports often omitted clinically relevant findings visible in the images, indicating that CXRs convey richer prognostic signals for mortality risk. These findings underscore the critical role of modality selection in clinical AI systems and suggest that textual summaries should be used as surrogates for multimodal data with caution, as they may fail to capture critical predictive information.
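To make the experimental comparison concrete, the sketch below scores each patient under the three input configurations described above (discharge notes only, notes plus radiology report text, notes plus the raw CXR image) and evaluates each with AUROC. This is a minimal illustration, not the authors' pipeline: `vlm_mortality_score` and the cohort field names are hypothetical placeholders for whatever VLM interface and data schema the study actually used; only the AUROC computation via scikit-learn reflects a standard, real API.

```python
from typing import Optional
from sklearn.metrics import roc_auc_score


def vlm_mortality_score(note: str,
                        report: Optional[str] = None,
                        cxr_path: Optional[str] = None) -> float:
    """Hypothetical VLM call returning P(death within 30 days of discharge).

    Placeholder for the actual vision-language model used in the study.
    """
    raise NotImplementedError


def compare_configurations(cohort: list[dict], labels: list[int]) -> None:
    """Score the same cohort under each input configuration and report AUROC."""
    configs = {
        "notes only":          lambda p: vlm_mortality_score(p["note"]),
        "notes + radiology report": lambda p: vlm_mortality_score(p["note"], report=p["report"]),
        "notes + raw CXR":     lambda p: vlm_mortality_score(p["note"], cxr_path=p["cxr"]),
    }
    for name, score_fn in configs.items():
        preds = [score_fn(patient) for patient in cohort]
        print(f"{name}: AUROC = {roc_auc_score(labels, preds):.3f}")
```

Under this framing, the abstract's result corresponds to the "notes + raw CXR" configuration yielding the highest AUROC (0.843), ahead of notes only (0.816) and notes plus radiology report (0.804).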