Enhancing Prediction of Heart Disease Using Hybrid Machine Learning Methods XGBoost and K-Means Clustering
Abstract
Heart disease is a leading cause of death worldwide. Machine learning techniques show considerable promise for predicting it, but existing approaches raise privacy risks and handle heterogeneous (non-IID) data across institutions poorly. This paper develops and evaluates a hybrid machine learning framework that integrates Extreme Gradient Boosting (XGBoost) with K-Means clustering, combining ensemble learning with unsupervised pattern detection to improve predictive power. The model was trained and tested on benchmark datasets, including the UCI Heart Disease and Cleveland datasets, using a centralized architecture optimized for robustness and interpretability. Performance was compared against baseline models (Logistic Regression, Naive Bayes, and Random Forest) on accuracy, precision, recall, F1-score, AUC-ROC, and computational efficiency. The XGBoost-KMeans model performed best, with 93.8% accuracy, 92.4% precision, 93.1% recall, 92.7% F1-score, and an AUC-ROC of 0.96, outperforming the baseline models by 5–6% in accuracy and improving generalization by grouping patients with comparable risk factors. The K-Means module provided enhanced feature representation and clustering, while XGBoost provided high-accuracy classification. Overall, the XGBoost-KMeans system offers a robust and explainable approach to heart disease prediction, with performance suited to healthcare analytics systems and scalability for deployment in clinical settings that use IoT and wearable data sources.
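The abstract does not spell out how the K-Means output is passed to XGBoost. One plausible reading, sketched below purely as an illustration, is that cluster assignments and distances to cluster centroids are appended to the original features before the classifier is trained. Everything in the sketch is an assumption rather than the authors' implementation: the synthetic data from `make_classification` (standing in for the heart disease datasets), the choice of 4 clusters, the hyperparameters, and the helper names `add_cluster_features` and `run_sketch`. If the `xgboost` package is not installed, the sketch falls back to scikit-learn's `GradientBoostingClassifier`, which is a different gradient boosting implementation.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

try:
    # Preferred: the XGBoost classifier named in the abstract.
    from xgboost import XGBClassifier as Booster
except ImportError:
    # Fallback (not XGBoost): scikit-learn's gradient boosting.
    from sklearn.ensemble import GradientBoostingClassifier as Booster

# Hypothetical hyperparameters; the abstract does not report any.
BOOSTER_KWARGS = {"n_estimators": 200, "max_depth": 3, "learning_rate": 0.1}


def add_cluster_features(scaler, kmeans, X):
    """Append centroid distances and the cluster label to the raw features."""
    X_scaled = scaler.transform(X)
    distances = kmeans.transform(X_scaled)            # (n_samples, n_clusters)
    labels = kmeans.predict(X_scaled).reshape(-1, 1)  # (n_samples, 1)
    return np.hstack([X, distances, labels])


def run_sketch(n_clusters=4, seed=0):
    # Synthetic binary-classification data; NOT the UCI or Cleveland data.
    X, y = make_classification(
        n_samples=600, n_features=13, n_informative=6, random_state=seed
    )
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed
    )

    # Fit the scaler and K-Means on the training split only, to avoid leakage.
    scaler = StandardScaler().fit(X_train)
    kmeans = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    kmeans.fit(scaler.transform(X_train))

    X_train_aug = add_cluster_features(scaler, kmeans, X_train)
    X_test_aug = add_cluster_features(scaler, kmeans, X_test)

    clf = Booster(random_state=seed, **BOOSTER_KWARGS)
    clf.fit(X_train_aug, y_train)

    proba = clf.predict_proba(X_test_aug)[:, 1]
    return {
        "accuracy": accuracy_score(y_test, clf.predict(X_test_aug)),
        "auc_roc": roc_auc_score(y_test, proba),
        "n_features": X_test_aug.shape[1],
    }


if __name__ == "__main__":
    print(run_sketch())
```

Scores printed by this sketch come from synthetic data and say nothing about the figures reported in the abstract. Fitting the scaler and the clustering on the training split alone keeps test-set information out of the engineered features, which matters when comparing against baselines.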