Comparison of different machine learning methods for predicting heart disease
Abstract
Purpose: This study aims to predict the presence of heart disease using machine learning models. It evaluates and compares the performance of five algorithms on a dataset of patients' clinical features: Logistic Regression, Support Vector Machine (SVM), Decision Tree, Random Forest, and Gradient Boosting. The primary research question is which algorithm gives the best predictive performance for heart disease diagnosis.

Methods: The study used a dataset of 270 patients with 13 clinical features. The data were preprocessed, and the target variable was converted to binary values for classification. The dataset was split into training and test sets in a 70-30 ratio. The five models were trained and evaluated using metrics such as accuracy, precision, recall, F1-score, and ROC-AUC. Confusion matrices were analyzed for additional insight into model performance.

Results: Logistic Regression and Random Forest achieved the best performance among the models, with accuracies of 86.4% and 80.2%, respectively. Logistic Regression had an ROC-AUC score of 0.844, while Random Forest scored 0.88. The confusion matrices highlighted the predictive strengths and limitations of each model.

Conclusion: Logistic Regression and Random Forest were identified as the most reliable models for predicting heart disease in this dataset. Future work will explore hyperparameter tuning and ensemble methods to further improve model performance, with the goal of supporting early diagnosis and treatment of cardiovascular diseases.
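The workflow described under Methods can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the abstract does not name a library, so scikit-learn is assumed here, and the patient data are replaced by a synthetic stand-in of the same shape (270 rows, 13 features, binary target). The feature scaling, stratified split, default hyperparameters, and random seeds are likewise assumptions, so the printed scores will not match the reported results.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, confusion_matrix, f1_score,
                             precision_score, recall_score, roc_auc_score)
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# ASSUMPTION: synthetic stand-in for the study's data (270 patients, 13 features).
X, y = make_classification(n_samples=270, n_features=13, n_informative=6,
                           random_state=0)

# 70-30 split as in the abstract; stratification is an assumption.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=0)

# The five model families named in the abstract, with default settings.
# Scaling before Logistic Regression and SVM is an assumption.
models = {
    "Logistic Regression": make_pipeline(StandardScaler(),
                                         LogisticRegression(max_iter=1000)),
    "SVM": make_pipeline(StandardScaler(),
                         SVC(probability=True, random_state=0)),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(random_state=0),
    "Gradient Boosting": GradientBoostingClassifier(random_state=0),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)               # hard labels for threshold metrics
    y_score = model.predict_proba(X_test)[:, 1]  # positive-class scores for ROC-AUC
    results[name] = {
        "accuracy": accuracy_score(y_test, y_pred),
        "precision": precision_score(y_test, y_pred, zero_division=0),
        "recall": recall_score(y_test, y_pred, zero_division=0),
        "f1": f1_score(y_test, y_pred, zero_division=0),
        "roc_auc": roc_auc_score(y_test, y_score),
        "confusion_matrix": confusion_matrix(y_test, y_pred),
    }

for name, r in results.items():
    print(f"{name}: acc={r['accuracy']:.3f} prec={r['precision']:.3f} "
          f"rec={r['recall']:.3f} f1={r['f1']:.3f} auc={r['roc_auc']:.3f}")
    print(r["confusion_matrix"])
```

With 270 patients, a 70-30 split leaves 81 test cases, so each correct or incorrect prediction moves accuracy by about 1.2 percentage points. Note also that accuracy depends on the 0.5 decision threshold while ROC-AUC depends only on the ranking of scores, which is why the two metrics can order models differently.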