Comparing different types of machine learning models in diagnosing diabetes mellitus utilizing electrocardiography and clinical data
Abstract
Background
Diabetes mellitus (DM) represents one of the most significant global public health challenges of the 21st century. The dramatic increase in its prevalence is attributed to poor early diagnosis of the disease.
Aim
To develop and validate a novel machine learning model for the early diagnosis of diabetes mellitus based on single-lead electrocardiography (ECG) combined with clinical data.
Materials and methods
A single-center prospective study enrolled participants with and without diabetes mellitus. All participants underwent a consultation with a cardiologist, single-lead ECG recording, and random glucose measurement. The statistical analysis was conducted using Python 3.
Results
Based on the single-lead ECG parameters, the Gradient Boosting model showed the highest performance, with an area under the curve (AUC) of 0.8824 (95% CI: 0.8620-0.9029). Using the clinical parameters yielded an AUC of 0.9157 (95% CI: 0.8818-0.9496). Using the single-lead ECG and clinical data in combination, the LightGBM model showed the highest performance, with an AUC of 0.9527 (95% CI: 0.9211-0.9842).
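As an illustration of how an AUC and its 95% confidence interval can be obtained from predicted scores, the following is a minimal sketch in Python. The abstract does not state how the intervals were computed; the reported intervals are symmetric about their point estimates, which would be consistent with a normal approximation, so the sketch assumes the Hanley-McNeil standard error. The function names and the synthetic data are hypothetical and are not taken from the study.

```python
import math
import random


def auc_score(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def hanley_mcneil_ci(auc, n_pos, n_neg, z=1.96):
    """Normal-approximation CI for an AUC (assumed method, not the study's)."""
    q1 = auc / (2.0 - auc)
    q2 = 2.0 * auc ** 2 / (1.0 + auc)
    variance = (
        auc * (1.0 - auc)
        + (n_pos - 1) * (q1 - auc ** 2)
        + (n_neg - 1) * (q2 - auc ** 2)
    ) / (n_pos * n_neg)
    se = math.sqrt(max(variance, 0.0))
    # Clip to [0, 1] because the normal approximation can overshoot.
    return max(0.0, auc - z * se), min(1.0, auc + z * se)


# Synthetic scores for illustration only: cases with the condition (label 1)
# tend to receive higher scores than cases without it (label 0).
rng = random.Random(0)
labels = [1] * 100 + [0] * 100
scores = [rng.gauss(1.5, 1.0) for _ in range(100)] + [
    rng.gauss(0.0, 1.0) for _ in range(100)
]

auc = auc_score(labels, scores)
low, high = hanley_mcneil_ci(auc, n_pos=100, n_neg=100)
print(f"AUC {auc:.4f} (95% CI: {low:.4f}-{high:.4f})")
```

A bootstrap over participants is a common alternative when the normal approximation is doubtful, for example with small samples or AUC values close to 1.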
Conclusion
Although our study demonstrates a highly accurate model for diabetes diagnosis, its limitations mean that it remains a proof of concept. The journey from a promising algorithm to a validated clinical decision support tool requires addressing these limitations through larger, more diverse, longitudinal studies and a steadfast commitment to model interpretability.
What is known about this research topic?
Diabetes mellitus is associated with measurable electrocardiographic changes, and clinical risk factors such as body mass index (BMI) and hypertension are established predictors. Existing models typically use either ECG features or clinical data alone for screening.
What this study adds and its future implications
This study demonstrates that combining single-lead ECG parameters with clinical data significantly improves diabetes diagnosis accuracy using machine learning, achieving an AUC of 0.9527 with LightGBM. These findings support the development of non-invasive, integrated screening tools, though future work requires external validation and explainable AI for clinical adoption.