Comparative Study of Machine Learning Techniques for Diabetes Forecasting

Abstract

The rising global prevalence of diabetes has intensified the need for accurate and early prediction techniques. This study reviews research that applies machine learning (ML) approaches to clinical data for diabetes prediction. Common pre-processing steps include encoding categorical data, handling missing values, and normalization. To enhance model performance, the reviewed studies employ feature selection and dimensionality reduction techniques such as Principal Component Analysis (PCA). Supervised learning algorithms, including Random Forest, Support Vector Machines (SVM), k-Nearest Neighbors (k-NN), Logistic Regression, and Decision Trees, are evaluated and compared using accuracy, precision, recall, F1-score, and AUC-ROC. Many studies report high accuracy but use small datasets, which limits the generalizability of their results. This study underscores the need for diverse datasets and clinically interpretable models, and it highlights gaps in current interpretability and validation practices.
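
As a rough illustration of the evaluation metrics named in the abstract, the sketch below computes accuracy, precision, recall, F1-score, and AUC-ROC for a binary diabetes classifier using only the Python standard library. The function names, labels, and scores are hypothetical and invented for this example; they are not taken from any of the reviewed studies.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) for binary labels, where 1 means diabetic."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall, and F1 from hard 0/1 predictions."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def auc_roc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC-ROC as the fraction of (positive, negative) pairs in which the
    positive example gets the higher score; ties count as half a win."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC-ROC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 2 diabetic and 8 non-diabetic patients.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]            # hard predictions
scores = [0.9, 0.3, 0.4, 0.2, 0.1, 0.1, 0.2, 0.1, 0.3, 0.2]  # predicted risk

metrics = classification_metrics(y_true, y_pred)
metrics["auc_roc"] = auc_roc(y_true, scores)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

On this toy data the classifier reaches 90% accuracy while detecting only one of the two diabetic patients (recall 0.5), which is one reason accuracy alone can look strong on a small, imbalanced dataset and why the other metrics are usually reported alongside it.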
