Development and External Validation of a Machine Learning–Based Model for Early Prediction of Multiple Organ Dysfunction Syndrome in Critically Ill Patients with Sepsis

Abstract

Background: Multiple organ dysfunction syndrome (MODS) is a key determinant of prognosis in sepsis, yet conventional severity scoring systems, which rely on linear assumptions and static variables, fail to capture complex nonlinear physiological disturbances and dynamic inter-organ interactions. Although machine learning has shown promise for outcome prediction in critically ill patients, few studies focus on MODS while also ensuring interpretability and external validation.

Methods: This retrospective cohort study used data from the Medical Information Mart for Intensive Care IV (MIMIC-IV) and the eICU Collaborative Research Database. Adult patients who met Sepsis-3 criteria and were admitted to the ICU for the first time were included. Feature selection was performed using least absolute shrinkage and selection operator (LASSO) regression. Multiple machine learning models were developed, including logistic regression, random forest, gradient boosting machine (GBM), extreme gradient boosting (XGBoost), Light Gradient Boosting Machine (LightGBM), artificial neural networks, and support vector machines. Model performance was evaluated using the area under the receiver operating characteristic curve (AUC), calibration curves, and decision curve analysis. Shapley additive explanations (SHAP) were used for model interpretation, and external validation was conducted in an independent eICU cohort.

Results: Among 23,018 patients with sepsis, 4,931 (21.4%) developed MODS during ICU hospitalisation. All models showed acceptable discrimination, with LightGBM achieving the highest AUC (0.829), followed by GBM (0.824), random forest (0.823), and XGBoost (0.822). The neural network (AUC 0.803), logistic regression, and elastic net (both AUC 0.802) showed moderate discrimination, whereas support vector machines (AUC 0.759) and k-nearest neighbours (AUC 0.727) performed less well. LightGBM demonstrated stable discrimination, good calibration, and greater clinical net benefit in both internal testing and external validation. SHAP analysis identified the Sequential Organ Failure Assessment (SOFA) score, respiratory rate, lactate, coagulation indices including the international normalised ratio, acid-base status, and vasoactive agent use as key predictors with pronounced nonlinear effects.

Conclusion: Among the evaluated models, the gradient-boosting-based LightGBM showed the most robust performance for predicting MODS risk in sepsis, supporting early risk stratification and individualised ICU management. Prospective multicentre studies are warranted to confirm its clinical impact.
