Comparison of Ensemble and Meta-Ensemble Models for Early Risk Prediction of Acute Myocardial Infarction
Discuss this preprint
Start a discussion What are Sciety discussions?Listed in
This article is not in any list yet, why not save it to one of your lists.Abstract
Cardiovascular disease (CVD) is a major cause of mortality around the world. This underscores the critical need to implement effective predictive tools to inform clinical decision-making. This study aimed to compare the predictive performance of ensemble learning algorithms, including Bagging, Random Forest, Extra Trees, Gradient Boosting, and AdaBoost, when applied to a clinical dataset comprising patients with CVD. The methodology entailed data preprocessing and cross-validation to regulate generalization. The performance of the model was evaluated using a variety of metrics, including accuracy, F1 score, precision, recall, Cohen’s Kappa, and area under the curve (AUC). Among the models evaluated, Bagging demonstrated the best overall performance (accuracy ± SD: 93.36% ± 0.22; F1 score: 0.936; AUC: 0.9686). It also reached the lowest average rank (1.0) in Friedman test and was placed, together with Extra Trees (accuracy ± SD: 90.76% ± 0.18; F1 score: 0.916; AUC: 0.9689), in the superior statistical group (group A) according to Nemenyi post hoc test. The two models demonstrated a high degree of agreement with the actual labels (Kappa: 0.87 and 0.83, respectively), thereby substantiating their reliability in authentic clinical contexts. The findings substantiated the preeminence of aggregation-based ensemble methods in terms of accuracy, stability, and concordance. This underscored the prominence of Bagging and Extra Trees as optimal candidates for cardiovascular diagnostic support systems, where reliability and generalization were paramount.