Explainable Machine Learning Predicts Mortality in Critically Ill Patients with Nonvariceal Upper Gastrointestinal Bleeding: A MIMIC-IV Study with External Validation

Abstract

Background: Non-variceal upper gastrointestinal bleeding (NVUGIB) carries substantial mortality in critically ill patients, so accurate early prognostication is needed to guide timely intervention. Recent advances in machine learning have shown potential to improve predictive performance over conventional clinical scores. This study therefore aimed to develop a machine learning model, NVUPreM, to predict 30-day mortality in NVUGIB and to test whether it outperforms traditional scoring systems.

Methods: This retrospective study used data from the Medical Information Mart for Intensive Care IV (MIMIC-IV; n=11,237) for model development and from the eICU Collaborative Research Database (eICU; n=7,742) for external validation. Predictors were selected via least absolute shrinkage and selection operator (LASSO) regression to reduce multicollinearity. Thirty-six machine learning algorithms were evaluated using tenfold cross-validation. The optimal model (NVUPreM) was compared against seven clinical scoring systems (AIMS65, Charlson, GBS, GCS, Admission-Rockall, SAPSII, SOFA) using the area under the receiver operating characteristic curve (AUC), calibration, and decision curve analysis. SHapley Additive exPlanations (SHAP) were used for interpretability.

Results: NVUPreM showed the best predictive performance among all candidate models, with an AUC of 0.876 (95% CI 0.846-0.907) and a sensitivity of 0.86. In internal validation, NVUPreM (AUC=0.876) outperformed every clinical score on AUC (SAPSII: 0.777; GCS: 0.707; AIMS65: 0.693; SOFA: 0.665; Charlson: 0.636; Admission-Rockall: 0.633; GBS: 0.575), as well as on decision curve analysis and calibration. External validation in eICU confirmed the robustness of NVUPreM in discrimination (AUC=0.82, 95% CI 0.803-0.837), calibration, and clinical utility. The SHAP analysis revealed directional feature contributions, identifying predictors with significant positive and negative impacts on the model output.

Conclusion: NVUPreM outperformed existing clinical scores in predicting 30-day mortality in NVUGIB while remaining interpretable, and could help clinicians identify high-risk patients early and personalize intervention.
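The two headline metrics in the abstract, AUC with a 95% confidence interval and decision curve analysis, can be illustrated with a minimal sketch. This is not the authors' code: the abstract does not say how the confidence intervals were computed, so the percentile bootstrap here is an assumption, and the function names (`auc`, `bootstrap_auc_ci`, `net_benefit`) and the eight-patient toy data are hypothetical.

```python
import random

def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the AUC (an assumed method, not stated in the abstract)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain only one class
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

def net_benefit(labels, scores, threshold):
    """Decision-curve net benefit of treating everyone with predicted risk >= threshold."""
    n = len(labels)
    tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1 - threshold)

# Hypothetical toy data: 1 = died within 30 days, risk = predicted probability.
labels = [0, 0, 1, 0, 1, 1, 0, 1]
risk = [0.1, 0.3, 0.35, 0.4, 0.6, 0.7, 0.2, 0.9]

toy_auc = auc(labels, risk)                  # 15 of 16 pairs ranked correctly: 0.9375
toy_ci = bootstrap_auc_ci(labels, risk, n_boot=200)
toy_nb = net_benefit(labels, risk, 0.5)      # 3 true positives, 0 false positives: 0.375
```

A decision curve is obtained by evaluating `net_benefit` over a range of thresholds and comparing the model against the "treat all" and "treat none" strategies. With only eight patients the bootstrap interval is very wide, which is why cohort-scale data such as the eICU validation set yields the much tighter interval reported above.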
