Construction and Validation of an Interpretable Machine Learning Model for Predicting Diabetes Risk in COPD Patients

Lingpin Pang
Siyan Xu
Yingxin Wang
Tao Huang
Qian Xian
Wenjia Lin
Haowen Pang
Zhirui Chen
Bozhi Zhong
Hui Miao
Hui Chen
Xishi Sun
Jie Sun

Abstract

Objective: To develop a machine learning (ML)-based prediction model that identifies COPD patients at high risk of diabetes, thereby facilitating early and personalized management of this complication.

Methods: Data from COPD patients in the MIMIC-IV database were split into training (70%) and validation (30%) sets. LASSO regression and logistic regression were used to screen 49 variables, and six ML algorithms were used to construct and internally validate the prediction model. Model performance was evaluated using multiple metrics, followed by external validation. Finally, SHAP (SHapley Additive exPlanations) analysis was performed for interpretability.

Results: All six ML algorithms demonstrated excellent performance in the training, testing, and validation sets, as evidenced by ROC curve analysis, with LightGBM showing the best overall performance. Feature importance analysis revealed that marital status, blood glucose level, and insurance type were the top three factors influencing diabetes development in COPD patients.

Conclusion: This study developed an interpretable ML-based risk prediction model for diabetes in COPD patients. The model provides clinicians with a novel tool for early personalized intervention, with the aim of improving patient prognosis.
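The 70/30 split and ROC-based evaluation mentioned in the Methods can be illustrated with a minimal sketch. This is not the authors' code: the helper names (`split_70_30`, `roc_auc`), the random seed, and the synthetic risk scores are all assumptions made for illustration, and no MIMIC-IV data is used. The AUC is computed as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case.

```python
import random


def split_70_30(records, seed=0):
    """Shuffle records and split them into 70% training and 30% validation."""
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    cut = int(round(0.7 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical data: (risk_score, label) pairs standing in for model output.
rng = random.Random(42)
records = [(rng.gauss(1.0 if i % 3 == 0 else 0.0, 1.0), 1 if i % 3 == 0 else 0)
           for i in range(100)]

train, valid = split_70_30(records)
valid_auc = roc_auc([y for _, y in valid], [s for s, _ in valid])
print(f"train={len(train)} valid={len(valid)} validation AUC={valid_auc:.3f}")
```

In practice a library routine for the split and the AUC would normally replace these helpers; the pairwise version is shown only because it makes the definition of the metric explicit.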

Version published to 10.21203/rs.3.rs-7033945/v1 on Research Square
Aug 19, 2025

Related articles

Machine Learning-Based Risk Prediction Model for Fatigue in Chronic Heart Failure Patients

This article has 9 authors:
1. Min Zhou
2. Jingran Yang
3. Yimei Zhang
4. Yu Wang
5. Ruijie Yanglan
6. Qinlan Li
7. Yangjuan Bai
8. Wei Wei
9. Fang Ma
This article has no evaluations. Latest version: Jan 27, 2026

Development and Validation of a Machine Learning-Based Risk Prediction Model for Ischemic Stroke-Diabetes Comorbidity

This article has 2 authors:
1. Litian Hu
2. Hongyu Sun
This article has no evaluations. Latest version: Dec 23, 2025

Machine Learning Insights for Cardiovascular Risk Prediction in Diabetic Patients: Emphasis on Renal and Cardiac Markers Using Random Forests

This article has 1 author:
1. Julian Borges
This article has no evaluations. Latest version: Jan 21, 2026
