Development and Validation of an Interpretable Machine Learning Model for Predicting 5-year Major Adverse Cardiovascular Events in Patients with Coronary Artery Disease

Abstract

Background: Coronary artery disease (CAD) remains a major contributor to global cardiovascular mortality. Accurate prediction of prognosis is critical for guiding clinical decision-making. This study aimed to develop and validate interpretable machine learning (ML) models for predicting 5-year major adverse cardiovascular events (MACE) in hospitalized CAD patients.

Methods: A prospective cohort of 705 CAD patients was included and randomly divided into training (n = 564) and validation (n = 141) sets. Eleven key predictors were selected using Least Absolute Shrinkage and Selection Operator (LASSO) regression. Six ML algorithms were developed, and model performance was assessed using discrimination, calibration, and decision curve analysis. Shapley Additive Explanations (SHAP) were applied to enhance model interpretability.

Results: Key predictors identified by LASSO regression included left ventricular ejection fraction (LVEF), N-terminal pro-B-type natriuretic peptide (NT-proBNP), nitrate use, CAD duration, depressive symptoms, and age. The random forest (RF) model demonstrated superior performance, achieving the highest area under the curve (AUC) in both the training (0.887, 95% CI: 0.859–0.915) and validation (0.753, 95% CI: 0.656–0.849) cohorts, along with the best balance of sensitivity (0.856), F1 score (0.708), and Brier score (0.152). The LASSO method revealed that LVEF, NT-proBNP, and nitrate use were the top 3 predictors of 5-year MACE. Depressive symptoms were also associated with increased MACE risk.

Conclusions: This RF-based model provides accurate and interpretable 5-year MACE prediction in CAD patients. By integrating clinical and psychosocial features, it supports personalized secondary prevention. External validation is warranted to assess real-world applicability.
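The performance measures named in the abstract (AUC for discrimination, Brier score for calibration, sensitivity and F1 score at a decision threshold) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the 0.5 threshold, and the six toy labels and probabilities are assumptions chosen only to show how each metric is computed, and they do not reproduce any result from the study.

```python
from typing import Sequence, Tuple


def auc_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """AUC as the fraction of (event, non-event) pairs ranked correctly; ties count 0.5."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one event and one non-event")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and observed outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def sensitivity_f1(
    y_true: Sequence[int], y_prob: Sequence[float], threshold: float = 0.5
) -> Tuple[float, float]:
    """Sensitivity and F1 after thresholding probabilities (threshold is an assumption)."""
    pred = [1 if p >= threshold else 0 for p in y_prob]
    tp = sum(1 for y, h in zip(y_true, pred) if y == 1 and h == 1)
    fp = sum(1 for y, h in zip(y_true, pred) if y == 0 and h == 1)
    fn = sum(1 for y, h in zip(y_true, pred) if y == 1 and h == 0)
    sens = tp / (tp + fn) if (tp + fn) else 0.0
    prec = tp / (tp + fp) if (tp + fp) else 0.0
    f1 = 2 * prec * sens / (prec + sens) if (prec + sens) else 0.0
    return sens, f1


# Toy data for illustration only (1 = MACE within 5 years, 0 = no MACE).
y_true = [1, 0, 1, 0, 1, 0]
y_prob = [0.9, 0.2, 0.6, 0.7, 0.4, 0.1]

auc = auc_score(y_true, y_prob)
brier = brier_score(y_true, y_prob)
sens, f1 = sensitivity_f1(y_true, y_prob)
print(f"AUC={auc:.3f} Brier={brier:.3f} sensitivity={sens:.3f} F1={f1:.3f}")
```

A higher AUC indicates better discrimination, whereas a lower Brier score indicates predicted probabilities closer to the observed outcomes. Sensitivity and F1 depend on the chosen threshold, which the abstract does not state.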
