Multimodal Electronic Health Record Foundation Models with Electrocardiogram for Cardiovascular Disease Prediction
Abstract
Electronic health record (EHR) foundation models (FMs) have improved clinical task performance by learning comprehensive clinical context from sequential medical records called patient trajectories. However, existing multimodal approaches generate representations separately for each modality and integrate them only at the final prediction stage, without capturing temporal relationships across data types. Here, we propose TRIM, text-based representations for integrated multimodal trajectories, which converts unstructured data into text-based representations via their diagnostic reports and integrates them directly into patient trajectories. This enables EHR FMs to simultaneously analyze structured and unstructured longitudinal data while preserving temporal context. To develop and evaluate TRIM, we trained EHR FMs with electrocardiogram data to predict cardiovascular diseases (CVDs). TRIM-integrated EHR FMs improved overall CVD prediction performance. Model interpretation revealed that models prioritized ECG diagnostic information, focusing on clinically established risk factors. Survival analysis validated model decisions, with TRIM integration consistently increasing hazard ratios. TRIM provides a generalizable framework for integrating multimodal medical data into EHR FMs.
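The core idea described above, placing text derived from diagnostic reports into the same time-ordered sequence as structured records, can be illustrated with a minimal sketch. This is not the authors' implementation: the `Event` class, the `build_trajectory` and `serialize` helpers, the bracketed modality tags, and the example records are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List


@dataclass(frozen=True)
class Event:
    """One entry in a patient trajectory (hypothetical schema)."""
    time: datetime
    modality: str  # e.g. "diagnosis", "lab", "ecg_report"
    text: str      # a code or free text taken from a diagnostic report


def build_trajectory(structured: List[Event], reports: List[Event]) -> List[Event]:
    """Merge structured events and report text into one time-ordered sequence."""
    return sorted(structured + reports, key=lambda e: e.time)


def serialize(trajectory: List[Event]) -> str:
    """Flatten a trajectory into a single text sequence with modality tags."""
    return " ".join(f"[{e.modality.upper()}] {e.text}" for e in trajectory)


# Illustrative (made-up) records for a single patient.
structured_events = [
    Event(datetime(2020, 1, 1), "diagnosis", "I10"),
    Event(datetime(2020, 3, 1), "lab", "LDL high"),
]
ecg_report_events = [
    Event(datetime(2020, 2, 1), "ecg_report", "left ventricular hypertrophy"),
]

trajectory = build_trajectory(structured_events, ecg_report_events)
print(serialize(trajectory))
```

Because the ECG report is inserted at its own timestamp rather than appended at the prediction stage, a sequence model reading this text sees it between the earlier diagnosis and the later laboratory result, which is the temporal context the abstract refers to.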