A Comprehensive Machine Learning and SHAP Framework for Predicting ICU Length of Stay Using Non-Therapeutic Clinical Indicators

Zhanzhi Long
Shijun Tong

Abstract

Objective: Accurate prediction of Intensive Care Unit Length of Stay (ICU LOS) is a cornerstone for enhancing operational efficiency, optimizing clinical decision-making, and improving patient outcomes in critical care settings. This study aims to develop a robust and interpretable predictive framework by leveraging a comprehensive suite of machine learning algorithms and the SHapley Additive exPlanations (SHAP) method, utilizing exclusively non-therapeutic clinical indicators available at admission.
Methods: A retrospective analysis was conducted on a curated cohort of 654 adult patients admitted to the ICU of a tertiary-care hospital. A set of 30 non-therapeutic indicators, encompassing demographics, severity scores, comorbidities, and admission details, was meticulously curated. We implemented and rigorously tuned eight distinct ML models: Linear Regression, Lasso Regression, Ridge Regression, Decision Tree, Random Forest, eXtreme Gradient Boosting (XGBoost), Support Vector Regression, and a Multi-Layer Perceptron neural network. Model performance was evaluated using Mean Absolute Error (MAE), Root Mean Squared Error (RMSE), and the Coefficient of Determination (R²). The best-performing model was interpreted using SHAP for global feature importance and local explainability.
Results: The ensemble tree-based models, particularly XGBoost and Random Forest, demonstrated superior predictive performance. XGBoost achieved the best results with an MAE of 2.15 days, RMSE of 3.42 days, and R² of 0.71. SHAP analysis revealed that the Sequential Organ Failure Assessment (SOFA) score, patient age, type of admission, and specific comorbidities (metastatic cancer, congestive heart failure) were the most influential predictors. The Glasgow Coma Scale (GCS) score was also identified as a critical factor, where lower scores significantly increased the predicted LOS.
Conclusion: The integration of advanced ML models with the SHAP framework provides a powerful, accurate, and clinically interpretable tool for early prediction of ICU LOS. By identifying key drivers of prolonged stay from readily available non-therapeutic data, this approach facilitates proactive clinical management and strategic resource planning, ultimately supporting enhanced operational efficiency in the ICU.
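
The three evaluation metrics and the additive attribution idea behind SHAP can be illustrated with a minimal pure-Python sketch. This is not the authors' pipeline and does not use the shap library: the toy model, its coefficients, the example patient, the baseline values, and all function names are hypothetical, chosen only to show how MAE, RMSE, and R² are computed and how exact Shapley values split a prediction among features.

```python
from itertools import combinations
from math import factorial, sqrt


def mae(y_true, y_pred):
    """Mean Absolute Error: average absolute deviation, in days."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root Mean Squared Error: penalizes large errors more than MAE."""
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - residual SS / total SS."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def shapley_values(model, x, baseline):
    """Exact Shapley values for one instance.

    Features outside a coalition are set to their baseline value.
    Cost grows exponentially with the number of features, so this
    brute-force form is only practical for a handful of inputs.
    """
    n = len(x)

    def value(subset):
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return model(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for k in range(n):
            # Shapley weight for coalitions of size k that exclude feature i
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for s in combinations(others, k):
                coalition = set(s)
                total += weight * (value(coalition | {i}) - value(coalition))
        phi.append(total)
    return phi


def toy_los_model(z):
    """Hypothetical LOS model (days) over [SOFA, age, GCS]; made-up coefficients."""
    sofa, age, gcs = z
    gcs_deficit = 15 - gcs
    return 2.0 + 0.5 * sofa + 0.03 * age + 0.2 * gcs_deficit + 0.05 * sofa * gcs_deficit


# Toy observed vs. predicted stays, in days (not data from the study).
y_true = [3, 5, 2, 8]
y_pred = [2, 6, 4, 7]
print("MAE:", mae(y_true, y_pred))
print("RMSE:", rmse(y_true, y_pred))
print("R2:", r2(y_true, y_pred))

# Hypothetical patient and reference (baseline) values for [SOFA, age, GCS].
patient = [10, 70, 8]
baseline = [4, 55, 15]
phi = shapley_values(toy_los_model, patient, baseline)
for name, contribution in zip(["SOFA", "age", "GCS"], phi):
    print(name, "contribution (days):", contribution)
```

The attributions sum to the difference between the prediction for the patient and the prediction at the baseline, which is the additivity property that makes per-patient SHAP explanations readable as "days added or removed" by each indicator. In this toy model a GCS below the baseline yields a positive contribution, mirroring the direction of the GCS effect reported in the abstract.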

Version published to 10.21203/rs.3.rs-8076538/v1 on Research Square
Nov 17, 2025

Related articles

Responsible AI for Sepsis Prediction: Bridging the Gap Between Machine Learning Performance and Clinical Trust

This article has 6 authors:
1. Thiago Q. Oliveira
2. Leandro A. Carvalho
3. Flávio R. C. Sousa
4. João B. F. Filho
5. Khalil F. Oliveira
6. Daniel A. B. Tavares
This article has no evaluations
Latest version Jan 30, 2026

A Multimodal Rolling-Window Framework for ICU Transfer Prediction in Hospitalized Patients with Comorbid Hypertension

This article has 9 authors:
1. Wenting Liao
2. Min Wang
3. Li Li
4. Wei Dong
5. Junqi Liao
6. Wenzhao Liang
7. Mingming Jiang
8. Kunlun He
9. Houqiang Li
This article has no evaluations
Latest version Jan 20, 2026

ICU Mortality and LOS Prediction Models Using Machine Learning Based on Both Real and Simulated Data

This article has 3 authors:
1. Girma Neshir Alemneh
2. Hirut Bekele Ashagrie
3. Lemlem Kassa Tegegne
This article has no evaluations
Latest version Jan 14, 2026
