Evaluating the Performance of Ensemble Learning Methods in Diabetes Disease Classification
Abstract
Diabetes mellitus is a prevalent metabolic disorder characterized by chronic hyperglycemia and associated with severe complications. Accurate early detection is essential for effective management and prevention of disease progression. This study systematically evaluates the performance of three ensemble learning approaches (Bagging, Boosting, and Stacking) on three benchmark diabetes datasets: Pima Indians Diabetes, Frankfurt Hospital Diabetes, and Sylhet Hospital Diabetes (NIDDK). Class imbalance, a common challenge in these datasets, was addressed during preprocessing with the Synthetic Minority Oversampling Technique (SMOTE) to enhance model stability and classification reliability. Experimental results indicate that Boosting-based methods consistently outperform Bagging and Stacking. On the Pima dataset, Gradient Boosting, Extreme Gradient Boosting, and CatBoost achieved a maximum accuracy of 81.82%. On the Frankfurt dataset, Light Gradient Boosting reached 99.25% accuracy, while on the NIDDK dataset, Light Gradient Boosting and CatBoost attained perfect accuracy (100%). These findings highlight the effectiveness of combining SMOTE with Boosting-based ensemble models to mitigate class imbalance and improve diabetes classification. The results underscore the importance of both data preprocessing and algorithm selection in achieving high predictive performance, with significant implications for precision medicine and clinical decision support.
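The pipeline the abstract describes (SMOTE oversampling of the minority class, then a Boosting classifier) can be sketched as follows. This is a minimal illustration, not the authors' code: the toy dataset, hyperparameters, and the hand-rolled `smote` helper (interpolating between minority samples and their nearest minority neighbors, as in the original SMOTE algorithm) are assumptions standing in for the paper's datasets and tuned models.

```python
# Sketch: SMOTE-style oversampling followed by Gradient Boosting.
# The synthetic dataset below is a stand-in for a diabetes dataset.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

def smote(X, y, minority=1, k=5, seed=0):
    """Minimal SMOTE: create synthetic minority samples by interpolating
    between a minority point and one of its k nearest minority neighbors."""
    rng = np.random.default_rng(seed)
    X_min = X[y == minority]
    n_new = (y != minority).sum() - len(X_min)  # samples needed to balance
    # pairwise distances among minority samples; k nearest neighbors (skip self)
    d = np.linalg.norm(X_min[:, None] - X_min[None, :], axis=2)
    nn = np.argsort(d, axis=1)[:, 1:k + 1]
    synth = []
    for _ in range(n_new):
        i = rng.integers(len(X_min))          # random minority sample
        j = nn[i, rng.integers(k)]            # one of its neighbors
        synth.append(X_min[i] + rng.random() * (X_min[j] - X_min[i]))
    return (np.vstack([X, synth]),
            np.concatenate([y, np.full(n_new, minority)]))

# Imbalanced toy data (~70/30 class split) as a placeholder.
X, y = make_classification(n_samples=600, weights=[0.7, 0.3], random_state=42)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=42)

# Oversample only the training split, then fit the Boosting model.
X_bal, y_bal = smote(X_tr, y_tr)
clf = GradientBoostingClassifier(random_state=42).fit(X_bal, y_bal)
acc = accuracy_score(y_te, clf.predict(X_te))
```

Note that SMOTE is applied only after the train/test split; oversampling before splitting would leak synthetic copies of test-adjacent points into training and inflate the reported accuracy.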