Development and Validation of a Machine Learning-Based Risk Prediction Model for 60-Day Survival Status in Patients with Acute Pancreatitis Complicated with Acute Kidney Injury

Abstract

Background: Acute kidney injury (AKI) is a serious complication of severe acute pancreatitis (SAP) and carries an extremely poor prognosis. This study aimed to develop and validate an interpretable machine learning (ML) model to predict 60-day survival in patients with acute pancreatitis (AP) complicated by AKI, and to identify key prognostic factors to support clinical decision-making.

Methods: This was a retrospective cohort study using data extracted from the MIMIC-IV v3.1 database (released in October 2024). Patients were included if they were aged 18 years or older, had AP complicated by AKI confirmed by ICD diagnostic codes, had a clinical data completeness rate of ≥ 80%, and had either a follow-up duration of ≥ 60 days or a definitive in-hospital death record within 60 days. Propensity score matching (PSM) was applied to address the class imbalance between the survival and death groups. The optimal features were screened from 31 candidate variables using three feature selection algorithms: the Boruta algorithm, random forest (RF), and recursive feature elimination with cross-validation (RFECV). A total of 22 ML models were constructed, and their predictive performance was evaluated using the area under the receiver operating characteristic curve (AUC), accuracy, Kappa coefficient, sensitivity, specificity, and Brier score. The Shapley Additive exPlanations (SHAP) method was used to interpret the optimal model in terms of feature importance, nonlinear relationships, and inter-feature interactions. Finally, an interactive web application was developed to facilitate clinical use of the model.

Results: A total of 11,699 eligible patients were included (5,849 in the survival group and 5,850 in the death group after PSM). Fifteen core features were selected for model construction, including the Sequential Organ Failure Assessment (SOFA) score, liver function, cardiovascular function, renal function, length of hospital stay (LoS), admission age, and comorbid intracerebral hemorrhage (co_ICH). The RF model performed best, with an AUC of 0.9731, accuracy of 0.9068, sensitivity of 0.8755, specificity of 0.9392, and Brier score of 0.0722. SHAP analysis identified admission age, co_ICH, liver function, and cardiovascular function as the most important predictors of 60-day mortality. Nonlinear relationships, including threshold effects and U-shaped associations, were observed between key features (e.g., activated partial thromboplastin time [PTT], red cell distribution width [RDW], and LoS) and survival outcomes. An interactive web application (https://medicalpredictor.shinyapps.io/Online_Prediction_of_60-Day_Survival/) has been deployed for individualized risk assessment.

Conclusions: The RF model developed in this study shows excellent performance and interpretability in predicting 60-day survival in patients with AP complicated by AKI. The key prognostic factors identified by SHAP analysis provide clinical reference points, and the web application enhances the model's practical utility. This tool can assist clinicians in early risk stratification and in formulating personalized intervention strategies, which may improve patient outcomes.
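For readers less familiar with the evaluation metrics named in the abstract (AUC, accuracy, Kappa coefficient, sensitivity, specificity, and Brier score), the following is a minimal sketch of how each can be computed from true labels and predicted probabilities. It is not the authors' code. The function name `binary_metrics`, the 0.5 decision threshold, the coding of death as the positive class (1), and the toy data are all assumptions made for illustration.

```python
from typing import Dict, Sequence


def binary_metrics(
    y_true: Sequence[int],
    y_prob: Sequence[float],
    threshold: float = 0.5,  # assumed cut-off; the abstract does not state one
) -> Dict[str, float]:
    """Compute common binary-classification metrics.

    Assumes y_true uses 1 for the positive class (here, hypothetically,
    death within 60 days) and 0 for the negative class, and that both
    classes are present.
    """
    n = len(y_true)
    y_pred = [1 if p >= threshold else 0 for p in y_prob]

    # Confusion-matrix counts.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / n

    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_chance = ((tp + fp) / n) * ((tp + fn) / n) + ((tn + fn) / n) * ((tn + fp) / n)
    kappa = (accuracy - p_chance) / (1 - p_chance)

    # Brier score: mean squared error of the predicted probabilities.
    brier = sum((p - t) ** 2 for t, p in zip(y_true, y_prob)) / n

    # AUC as the probability that a random positive case is scored higher
    # than a random negative case (ties count as one half).
    pos = [p for t, p in zip(y_true, y_prob) if t == 1]
    neg = [p for t, p in zip(y_true, y_prob) if t == 0]
    wins = sum(
        1.0 if pp > pn else 0.5 if pp == pn else 0.0
        for pp in pos
        for pn in neg
    )
    auc = wins / (len(pos) * len(neg))

    return {
        "auc": auc,
        "accuracy": accuracy,
        "kappa": kappa,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "brier": brier,
    }


# Toy data for illustration only; not drawn from the study.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_probs = [0.9, 0.8, 0.4, 0.6, 0.2, 0.1]
toy_result = binary_metrics(toy_labels, toy_probs)
```

The pairwise AUC loop is quadratic in the sample size, which is fine for a small illustration; for a cohort the size of the one reported here, a rank-based implementation such as the one in scikit-learn would be the usual choice.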
