Enhancing Cause of Death Prediction: Development and Validation of ML Models Using Multimodal Data Across Multiple Healthcare Sites
Abstract
Importance
Timely and accurate determination of causes of death (CoD) is essential for public health surveillance, epidemiological research, and healthcare policy development. However, obtaining up-to-date and detailed CoD information is challenging due to delays in official death records and inconsistencies in data reporting across institutions.
Objective
To develop and validate machine learning (ML) models capable of predicting probable CoD by integrating comprehensive features from structured electronic health record (EHR) data, unstructured clinical notes, and publicly available data.
Design, Setting, and Participants
This multi-institutional retrospective cohort study was conducted at Vanderbilt University Medical Center (VUMC) and Massachusetts General Brigham (MGB). Deceased patients were included if they had at least one inpatient or outpatient encounter between October 1, 2015, and January 1, 2021, with corresponding death records from state health departments and the National Death Index. The study included 13,708 deceased patients from VUMC and 34,839 from MGB.
Exposures
Integration of structured EHR data, unstructured clinical notes processed using advanced language models, and publicly available data into machine learning models to predict CoD.
Main Outcomes and Measures
The primary outcome was the underlying CoD, classified into one of the top 15 National Center for Health Statistics (NCHS) rankable CoD categories, with all other causes grouped into an “Other” category. Model performance was evaluated using weighted area under the receiver operating characteristic curve (AUC) and weighted F-measure.
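To make the two evaluation metrics concrete, the following is a minimal sketch of how a support-weighted one-vs-rest AUC and a support-weighted F-measure could be computed for a multiclass CoD classifier. It assumes "weighted" means weighting each class by its share of true labels, which is the usual convention but is not spelled out here. The function names and the data layout (one dictionary of class probabilities per patient) are illustrative and not taken from the study.

```python
from collections import Counter


def binary_auc(labels, scores):
    """AUC for one class via the rank-sum formulation; ties share an average rank."""
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1
    n_pos = sum(1 for _, y in pairs if y == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC is undefined when only one class is present")
    rank_sum_pos = sum(r for r, (_, y) in zip(ranks, pairs) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def weighted_ovr_auc(y_true, proba, classes):
    """One-vs-rest AUC per class, averaged with weights equal to class support.

    proba is a list of dicts mapping class name to predicted probability
    (an assumed layout for this sketch).
    """
    support = Counter(y_true)
    total = len(y_true)
    result = 0.0
    for c in classes:
        if support[c] == 0:
            continue  # a class with no true cases has no defined AUC
        labels = [1 if y == c else 0 for y in y_true]
        scores = [p[c] for p in proba]
        result += support[c] / total * binary_auc(labels, scores)
    return result


def weighted_f1(y_true, y_pred):
    """Per-class F1, averaged with weights equal to class support."""
    support = Counter(y_true)
    total = len(y_true)
    result = 0.0
    for c, n_c in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = n_c - tp
        result += n_c / total * (2 * tp / (2 * tp + fp + fn))
    return result
```

Because the weights follow class frequency, common causes of death dominate both scores, so a model can score well overall while performing poorly on rare categories.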
Results
The XGBoost model using structured EHR data alone achieved weighted AUCs of 0.86 (95% CI, 0.84–0.88) at VUMC and 0.80 (95% CI, 0.79–0.80) at MGB. Adding unstructured notes improved performance, with weighted AUCs of 0.90 (95% CI, 0.88–0.93) at VUMC and 0.92 (95% CI, 0.91–0.92) at MGB. Adding publicly available data did not further improve performance. Cross-institutional validation revealed significant performance degradation.
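The abstract does not say how the 95% confidence intervals were obtained. One common approach for held-out metrics is a percentile bootstrap over patients, sketched below as an assumption rather than a description of the study's procedure. The helper names, the default number of resamples, and the use of plain accuracy as the example metric are all illustrative.

```python
import random


def percentile_bootstrap_ci(metric, y_true, y_pred, n_boot=1000, alpha=0.05, seed=0):
    """Approximate (1 - alpha) CI for metric(y_true, y_pred) by resampling patients."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    for _ in range(n_boot):
        # Draw n patients with replacement and recompute the metric
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(metric([y_true[i] for i in idx], [y_pred[i] for i in idx]))
    stats.sort()
    # Simple order-statistic approximation of the percentile bounds
    lower = stats[int((alpha / 2) * (n_boot - 1))]
    upper = stats[int((1 - alpha / 2) * (n_boot - 1))]
    return lower, upper


def accuracy(y_true, y_pred):
    """Example metric; any function of (y_true, y_pred) can be passed instead."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
```

For example, `percentile_bootstrap_ci(accuracy, y_true, y_pred)` returns a `(lower, upper)` pair. A metric that is undefined when a resample lacks some class, such as a per-class AUC, would need extra handling that this sketch does not include.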
Conclusions and Relevance
ML models integrating EHR structured and unstructured data to predict underlying CoD at the time of the most recent encounter among deceased patients achieved excellent performance within individual institutions. The inclusion of publicly available data did not improve performance, and all versions had poor portability between institutions. Healthcare institutions may benefit from adopting robust processes for locally tailored models, and future research should focus on enhancing model generalizability while addressing unique institutional data environments.