Heart Failure Prediction: An Explainable Cross-Validated Comparison of Several Machine Learning Models
Abstract
Heart disease is the leading cause of death worldwide, contributing to millions of fatalities each year. Early detection and accurate risk prediction are therefore critical for timely intervention and improved patient outcomes. In this study, seven Machine Learning (ML) prediction models were tested on three heart disease datasets. Each dataset was split, with 80% used to train the models and 20% held out for testing. Model hyperparameters were tuned with cross-validation, which scores each candidate configuration on multiple folds of the training data and selects the best one. The tuned models were then evaluated on the unseen test data. Performance was measured with several metrics (Accuracy, Recall, Precision, PR-AUC and ROC-AUC), and the results were further examined using confusion matrices. In addition to these traditional evaluation metrics, SHapley Additive exPlanations (SHAP) analysis was used to interpret the contribution of each feature to the models' predictions. The Multi-Layer Perceptron (MLP) achieved the highest performance on both datasets, demonstrating strong predictive capability while remaining interpretable through the integration of SHAP. These results show that modern ML models can predict heart disease risk accurately while providing explainable predictions. The study also provides a ready-to-use method for predicting heart disease risk, intended to help doctors choose the best tool.
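The evaluation procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes scikit-learn, uses synthetic data from `make_classification` in place of the heart disease datasets, and covers only the MLP. The stratified split, the five folds, the `roc_auc` tuning score, the parameter grid, and the helper name `evaluate_mlp` are all assumptions made for the example. The SHAP step is omitted.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import (accuracy_score, average_precision_score,
                             confusion_matrix, precision_score,
                             recall_score, roc_auc_score)
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def evaluate_mlp(X, y, seed=0):
    """Hypothetical helper: 80/20 split, cross-validated tuning, test metrics."""
    # Hold out 20% of the data; stratification is an assumption, not from the source.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=seed)

    # Scale features, then fit an MLP; the scaler is refit inside each CV fold.
    pipeline = make_pipeline(
        StandardScaler(),
        MLPClassifier(max_iter=1000, random_state=seed))

    # Illustrative grid only; the paper's actual search space is not given.
    param_grid = {
        "mlpclassifier__hidden_layer_sizes": [(16,), (32,)],
        "mlpclassifier__alpha": [1e-4, 1e-2],
    }

    # 5-fold cross-validation on the training portion selects the settings.
    search = GridSearchCV(pipeline, param_grid, cv=5, scoring="roc_auc")
    search.fit(X_train, y_train)

    # Evaluate the refit best model on the unseen test data.
    y_pred = search.predict(X_test)
    y_score = search.predict_proba(X_test)[:, 1]
    metrics = {
        "accuracy": accuracy_score(y_test, y_pred),
        "recall": recall_score(y_test, y_pred),
        "precision": precision_score(y_test, y_pred),
        "pr_auc": average_precision_score(y_test, y_score),
        "roc_auc": roc_auc_score(y_test, y_score),
    }
    return metrics, confusion_matrix(y_test, y_pred), search.best_params_


# Synthetic binary classification data standing in for a real dataset.
X, y = make_classification(n_samples=400, n_features=10, random_state=0)
metrics, cm, best_params = evaluate_mlp(X, y)
print(metrics)
print(cm)
print(best_params)
```

Keeping the scaler inside the pipeline means it is fitted only on the training folds during tuning, so no information from the validation folds or the test set leaks into preprocessing.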