A Multi-Model Evaluation Framework for Accurate and Interpretable Heart Disease Prediction Using Ensemble Machine Learning and Low-Code Deployment Tools
Abstract
Heart disease remains a leading cause of mortality worldwide, which makes effective early diagnostic tools an urgent need. This study presents a comparative machine learning framework for predicting heart disease from a consolidated dataset of 918 patient records and 46 clinically relevant features. Ten established supervised learning algorithms were evaluated: Gradient Boosting, Random Forest, Logistic Regression, Support Vector Machine (SVM), Neural Network, AdaBoost, CN2 Rule Induction, k-Nearest Neighbors (kNN), Naive Bayes, and Decision Tree. Each model was assessed on accuracy, precision, recall, F1-score, area under the curve (AUC), and Matthews correlation coefficient (MCC) to give a comprehensive performance profile. Gradient Boosting achieved the highest accuracy (87.4%) and AUC (0.928) of the ten models. The methodology combines Python-based libraries with the Orange Data Mining tool to support low-code, reproducible workflows for healthcare practitioners and researchers. Beyond classification performance, the study addresses model interpretability, feature relevance, and practical deployment on accessible platforms. These contributions underscore the potential of ensemble-based machine learning to enhance early detection and clinical decision-making in cardiovascular healthcare.
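For readers unfamiliar with the metrics named in the abstract, the following is a minimal sketch of how they can be computed for a binary classifier. It is not the study's code: the function names, the 0.5 decision threshold, and the eight toy labels and scores are all hypothetical, and AUC is assumed here to mean the area under the ROC curve.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels, where 1 = disease present."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall, F1 and MCC from hard 0/1 predictions."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    pr_sum = precision + recall
    f1 = 2 * precision * recall / pr_sum if pr_sum else 0.0
    # MCC uses all four cells of the confusion matrix; it is set to 0
    # when any marginal total is empty (a common convention).
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1, "mcc": mcc}


def roc_auc(y_true, scores):
    """ROC AUC as the fraction of (positive, negative) pairs in which the
    positive case receives the higher score; ties count as half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, NOT taken from the study's 918-record dataset.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = classification_metrics(y_true, y_pred)
metrics["auc"] = roc_auc(y_true, scores)
print(metrics)
```

On this toy data, accuracy, precision, recall and F1 all come out at 0.75, MCC at 0.5 and AUC at 0.875. AUC is computed from the continuous scores and so does not depend on the threshold, while the other five metrics do. That is one reason for reporting the full suite rather than accuracy alone.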