Development and validation of an Explainable Machine Learning Model for Predicting Multiple Organ Failure in Patients with Acute Pancreatitis: a Multicenter Cohort Study

Yi Hao
Peiyi Bai
Yunpeng Zhou
Yi Wang
Qinyang Du
Rongshen Guan
Gaopeng Li

Abstract

1.1. Background
Multiple organ failure (MOF) is a serious, life-threatening complication of acute pancreatitis, and timely, accurate prediction of MOF is essential. Machine learning models have shown potential for predicting MOF in emergency settings, but they are still rarely applied specifically to patients with acute pancreatitis. This study aims to develop and validate a machine learning model for predicting MOF in patients with acute pancreatitis.

1.2. Methods
Two retrospective cohorts were used to develop, validate, and test the model. The derivation and internal validation cohorts were drawn from the MIMIC-IV database, which was randomly split into 70% for model development and 30% for internal validation. A retrospective cohort from the eICU database was used for external validation. After data preprocessing, including interpolation and standardization, we applied four feature selection methods and ultimately identified 15 key features. Seven machine learning algorithms were used to build predictive models, and their performance was assessed with receiver operating characteristic (ROC) curves, decision curve analysis (DCA), precision-recall curves (PRC), calibration curves, and confusion matrices. The final model was interpreted using SHAP, and a web-based risk calculator was built for clinical use.

1.3. Results
The model was developed on data from the MIMIC-IV database (n = 582) and was validated and tested on the MIMIC-IV (n = 250) and eICU (n = 474) datasets. The best feature combination was identified using four selection techniques and seven machine learning algorithms; any variable recognized by all four selection methods was included in model development. The best-performing model, CatBoost, was built on 15 readily available admission features and achieved an AUC of 0.939, an F1 score of 0.795, and an accuracy of 0.856 during validation. Hyperparameter optimization was then performed on the derivation cohort using 5-fold cross-validation and grid search. On the test dataset, the final model achieved an AUC of 0.855, an F1 score of 0.658, and an accuracy of 0.761. SHAP analysis indicated that INR, Cr, and PO2 were the three variables with the greatest influence on the model's predictions. The model has been deployed as an online clinical tool to support its use in healthcare settings.

1.4. Conclusion
The interpretable CatBoost model effectively predicted the risk of MOF in patients with acute pancreatitis, showing strong predictive performance in both the internal and external validation cohorts. It has also been implemented as an online risk assessment tool to improve its practical use in clinical settings.
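To make the reported quantities concrete, the sketch below shows one way a random 70/30 split and the three headline metrics (AUC, F1 score, accuracy) can be computed. It is an illustration under stated assumptions, not the authors' pipeline: the function names, the seed, the 0.5 decision threshold, and the toy labels and scores are all hypothetical, and plain Python stands in for whatever library routines the study actually used. The cohort sizes in the abstract (582 and 250) are consistent with a 70/30 split of 832 patients.

```python
import random


def split_70_30(ids, seed=0):
    """Shuffle ids reproducibly and return (derivation, validation) lists."""
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * 0.7)  # floor of 70% of the cohort
    return shuffled[:cut], shuffled[cut:]


def auc(y_true, y_score):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def f1_and_accuracy(y_true, y_score, threshold=0.5):
    """F1 score and accuracy after thresholding predicted probabilities.

    The 0.5 threshold is an assumption; the abstract does not state one.
    """
    y_pred = [int(s >= threshold) for s in y_score]
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    accuracy = (tp + tn) / len(y_true)
    return f1, accuracy


# Hypothetical patient indices: 832 is the sum of the two MIMIC-IV cohorts.
derivation, validation = split_70_30(range(832))
print(len(derivation), len(validation))  # 582 250

# Toy labels (1 = MOF) and predicted risks, not study data.
toy_labels = [1, 1, 0, 0, 1, 0]
toy_scores = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2]
print(auc(toy_labels, toy_scores))
print(f1_and_accuracy(toy_labels, toy_scores))
```

AUC is threshold-free, while F1 and accuracy depend on the chosen cutoff, which is one reason a model can report a high AUC alongside a noticeably lower F1 score, as in the external test results above.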

Version published to 10.21203/rs.3.rs-8086659/v1 on Research Square
Dec 22, 2025

Related articles

Machine learning models for predicting severe clinical events in hospitalized patients with coronary artery disease

This article has 16 authors:
1. Hao Liu
2. Meijun Liu
3. Xinmiao Guan
4. Feng Cao
5. Changhao Liang
6. Zhongwen Qi
7. Jiaqi Hui
8. Junnan Zhao
9. Jingli Xing
10. Jianguo Zhou
11. Dong Zhang
12. Lei Liu
13. Xiaoliang Hao
14. Minjing Luo
15. Fengqin Xu
16. Yutong Fei
This article has no evaluations. Latest version: Jan 12, 2026

Development and validation of machine learning models for predicting short- and long-term mortality in gastroparesis patients: a retrospective cohort study using the MIMIC-IV database

This article has 5 authors:
1. Lei Zhu
2. Qi Han
3. Bei Pei
4. Jie Zhang
5. Haolong Qi
This article has no evaluations. Latest version: Dec 31, 2025

Machine learning prediction and interpretive analysis of multidrug-resistant microbial infection risk in septicemia patients: A study from the MIMIC-IV database

This article has 5 authors:
1. Qianqian Zhang
2. Nianzhi Zhang
3. Ying Zheng
4. Jing Zhou
5. Ling Liu
This article has no evaluations. Latest version: Dec 30, 2025
