ICU Mortality and LOS Prediction Models Using Machine Learning Based on Both Real and Simulated Data

Abstract

Health institutions in low-resource settings have limited and skewed clinical data, which complicates mortality prediction and resource utilization. These shortcomings are addressed by introducing machine learning models that are trained on combined data from actual and synthetic patient populations to predict ICU patients’ mortality and Length of Stay (LOS). Using a dataset of 10,810 patient records from five hospitals in Ethiopia, we evaluated three machine learning algorithms, Logistic Regression (LG), Random Forest, and XGBoost, across three data settings: real-only, synthetic-only (taking SMOTE-NC as an example), and mixed configurations of real and synthetic data. Our findings show that hybrid models perform better, with the best-performing hybrid models achieving a mean absolute error (MAE) of approximately 5.5 days for LOS prediction and XGBoost achieving 99.5% accuracy for mortality prediction. ICU patient features such as age, pulse rate, oxygen saturation, and hemoglobin levels are important indicators of ICU outcomes in Ethiopia. A prototype was created to show model performance and offer useful information. This study provides a solid foundation for strategically integrating synthetic data to improve predictive analytics in healthcare settings with limited resources.
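For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how mean absolute error (for LOS, in days) and accuracy (for mortality labels) are conventionally computed. It is not the authors' evaluation code: the function names and the toy values are hypothetical and chosen only for illustration, and they are not results from the study.

```python
from typing import Sequence


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average absolute difference between true and predicted values."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that exactly match the true labels."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy LOS values in days (hypothetical, not taken from the study).
los_true = [3.0, 10.0, 7.0, 21.0]
los_pred = [5.0, 6.0, 7.0, 15.0]
los_mae = mean_absolute_error(los_true, los_pred)

# Toy mortality labels, 1 = died and 0 = survived (hypothetical).
died_true = [0, 1, 0, 0, 1]
died_pred = [0, 1, 0, 1, 1]
mortality_acc = accuracy(died_true, died_pred)

print(f"LOS MAE: {los_mae} days")
print(f"Mortality accuracy: {mortality_acc}")
```

Read this way, an MAE of approximately 5.5 days means the predicted LOS differs from the observed LOS by about 5.5 days on average.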
