A Machine Learning Model for Real-Time Hypoglycemia Risk Prediction in Hospitalized Diabetic Patients: Development and Validation
Abstract
Background Hypoglycemia is the main obstacle to achieving optimal glucose management in diabetic patients. Despite advances in understanding risk factors, current prediction models for hypoglycemia often rely on static variables and are not optimized for real-time risk assessment in hospitalized patients. This study aims to develop and validate a machine learning (ML)-based prediction model for inpatient hypoglycemia that integrates dynamic clinical data to improve accuracy and clinical utility.
Methods and Findings We conducted a retrospective study of 37,966 inpatients with diabetes mellitus at Nanfang Hospital, affiliated with Southern Medical University, from January 2021 to December 2022. After applying the inclusion and exclusion criteria, 2,845 patients were included in the final analysis. Candidate predictors included demographic characteristics, medication use, comorbidities, and laboratory parameters. Using a forward stepwise variable selection method based on XGBoost, we identified 10 optimal predictors. The cohort was randomly split into training and testing sets at an 8:2 ratio. Predictive performance was assessed via the area under the receiver operating characteristic curve (AUC). Ten ML algorithms were evaluated: support vector machine (SVM), CatBoost, XGBoost, random forest, transformer, gradient boosting decision tree (GBDT), TabNet, AdaBoost, light gradient boosting machine (LGBM), and decision tree. The CatBoost algorithm performed best, achieving an AUC of 0.85, a positive predictive value (PPV) of 0.75, and a negative predictive value (NPV) of 0.89. The model's clinical utility and calibration were further assessed through decision curve analysis and calibration curves, which supported its clinical applicability. The key predictors included body mass index (BMI); insulin use; and laboratory markers such as HbA1c, creatinine, and triglycerides.
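For readers unfamiliar with the evaluation metrics named above, the following is a minimal illustrative sketch of an 8:2 random split and of how AUC, PPV, and NPV can be computed from predicted risk scores. It is not the authors' code: the function names, the 0.5 decision threshold, the random seed, and the toy labels and scores are all assumptions made for illustration, and the toy numbers are unrelated to the study's results.

```python
import random


def split_8_to_2(rows, seed=0):
    """Randomly split rows into training and testing sets at an 8:2 ratio.

    The seed is an arbitrary assumption; the abstract does not state one.
    """
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    cut = int(round(0.8 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def auc(y_true, scores):
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as one half)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def ppv_npv(y_true, scores, threshold=0.5):
    """PPV = TP / (TP + FP) and NPV = TN / (TN + FN) at a given threshold.

    The 0.5 threshold is an assumption; the abstract does not report one.
    """
    tp = fp = tn = fn = 0
    for y, s in zip(y_true, scores):
        predicted_positive = s >= threshold
        if predicted_positive and y == 1:
            tp += 1
        elif predicted_positive and y == 0:
            fp += 1
        elif not predicted_positive and y == 0:
            tn += 1
        else:
            fn += 1
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    npv = tn / (tn + fn) if (tn + fn) else float("nan")
    return ppv, npv


# Toy labels (1 = hypoglycemia event) and hypothetical model scores.
toy_labels = [1, 1, 0, 0, 1, 0, 0, 0]
toy_scores = [0.9, 0.6, 0.7, 0.2, 0.4, 0.1, 0.3, 0.05]

toy_auc = auc(toy_labels, toy_scores)
toy_ppv, toy_npv = ppv_npv(toy_labels, toy_scores)
```

In practice these quantities would be computed on the held-out testing set only, with scores produced by a model fitted on the training set.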
Conclusions Our ML-based predictive model for inpatient hypoglycemia demonstrates robust performance and integrates readily available clinical parameters, offering significant potential for early risk identification and preventive intervention. Future research should focus on multicenter validation and real-time integration into clinical decision support systems to increase generalizability and precision. This study highlights the importance of dynamic data in improving hypoglycemia risk prediction and underscores the potential of ML in advancing diabetes care.