Machine Learning Prediction of MACE in Older Chinese Adults Integrating Traditional and Geriatric-Specific Risk Factors: A CHARLS Cohort Analysis
Abstract
Background: Cardiovascular disease (CVD) poses a substantial health burden on China's aging population. Existing cardiovascular risk models often perform poorly in older Chinese adults and rarely integrate geriatric-specific non-traditional factors. This study aimed to develop and validate machine learning-based models incorporating both traditional and non-traditional risk factors for predicting major adverse cardiovascular events (MACE) among older Chinese adults.

Methods: Data from 4,580 participants aged ≥ 60 years without baseline MACE were obtained from the China Health and Retirement Longitudinal Study (CHARLS, 2011–2018). Incident MACE (myocardial infarction or stroke) was self-reported during a median follow-up of approximately 7 years. Candidate predictors included demographics, health behaviors, clinical measures, anthropometric indices, biomarkers, depressive symptoms (CES-D score), and functional limitations (Activities of Daily Living, ADL). Missing data were handled via Multiple Imputation by Chained Equations (MICE, 5 imputations). Logistic Regression (LR), Random Forest (RF), and XGBoost models were trained using stratified 70/30 splits for training and testing sets. Hyperparameter tuning employed a grid search with limited complexity. Model performance was evaluated by discrimination (AUC), calibration (Brier score and calibration plots), and clinical utility (Decision Curve Analysis, DCA). Exploratory non-linear relationships were assessed using generalized additive models (GAMs). Results were benchmarked against the Framingham Risk Score (FRS-CVD), and an LR-based nomogram was developed.

Results: Incident MACE occurred in 28.7% of participants. The LR model demonstrated the highest discrimination (mean AUC = 0.649), closely followed by XGBoost (mean AUC = 0.645); both significantly outperformed RF (mean AUC = 0.632) and the FRS-CVD benchmark (mean AUC = 0.504). LR and XGBoost models showed good calibration and superior net benefit in DCA.
Significant independent predictors in the LR model included hypertension history (OR = 1.85), diabetes (OR = 1.34), age (OR = 1.02/year), systolic blood pressure (OR = 1.006/mmHg), high education level (OR = 1.60), waist circumference (OR = 1.01/cm), depressive symptoms (CES-D score, OR = 1.03/point), and ADL limitations (OR = 1.11/limitation). GAM analysis revealed significant non-linear relationships for age and waist circumference.

Conclusion: Machine learning models integrating traditional and non-traditional factors effectively predict MACE risk in older Chinese adults, outperforming the standard FRS. Central obesity, depressive symptoms, and functional impairments were significant predictors, underscoring the importance of holistic cardiovascular risk assessment in geriatric populations. The developed nomogram offers a practical clinical tool pending external validation.
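
The abstract reports discrimination (AUC), calibration (Brier score), and clinical utility (net benefit from DCA) but does not say how these were computed. As a minimal sketch, assuming the standard textbook definitions of each metric, they could be calculated as below. The function names and the toy labels and probabilities are hypothetical and invented for illustration only; they are not CHARLS data and not the authors' code.

```python
from typing import Sequence


def auc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Rank-based AUC: chance that a random event outranks a random non-event."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one event and one non-event")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5  # tied pairs count as half a win
    return wins / (len(pos) * len(neg))


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted risk and observed outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def net_benefit(y_true: Sequence[int], y_prob: Sequence[float],
                threshold: float) -> float:
    """Net benefit at one risk threshold: TP/n - FP/n * pt / (1 - pt)."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1.0 - threshold)


# Toy data (assumption: invented for this example, not study data).
y_true = [0, 0, 1, 0, 1, 1, 0, 1]
y_prob = [0.10, 0.30, 0.35, 0.40, 0.60, 0.70, 0.20, 0.80]

model_auc = auc(y_true, y_prob)
model_brier = brier_score(y_true, y_prob)
model_nb = net_benefit(y_true, y_prob, threshold=0.3)
# "Treat all" reference strategy: everyone is classified as high risk.
treat_all_nb = net_benefit(y_true, [1.0] * len(y_true), threshold=0.3)

print(model_auc, model_brier, model_nb, treat_all_nb)
```

A decision curve repeats the net benefit calculation over a range of thresholds and compares the model against the "treat all" and "treat none" (net benefit of zero) strategies; a model has clinical utility at a given threshold when its net benefit exceeds both.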