Explainable Machine Learning Models for Predicting Health-Related Quality of Life in High-Risk Cardiovascular Populations: A Comparative Analysis of SF-12 Data and Clinical Risk Stratification

Guoliang Ma
Xin Hong
Lin Zhu
Wenting Li
Zhuanzhuan Fan
Kun Li
Wenyan Wang

Abstract

Background: Cardiovascular diseases (CVDs) continue to pose a substantial burden on global health. Traditional clinical metrics focus on physiological outcomes and often overlook the multidimensional nature of health-related quality of life (HRQoL). Moreover, conventional regression models struggle to capture non-linear patterns in HRQoL data. In this context, machine learning (ML) offers a promising alternative for predictive modelling and individualized risk assessment.

Methods: This cross-sectional study involved 8,857 individuals at high risk of CVD in Nanjing, China. HRQoL was measured using the 12-item Short Form Health Survey Version 2 (SF-12v2), yielding Physical Component Summary (PCS) and Mental Component Summary (MCS) scores. Five ML models (support vector machine (SVM), LightGBM, XGBoost, Random Forest, and Logistic Regression) were developed following hybrid feature selection (Random Forest-Recursive Feature Elimination). Model performance was evaluated using AUC, F1-score, accuracy, sensitivity, and specificity. SHAP (SHapley Additive exPlanations) analysis was used to quantify predictor contributions.

Results: The SVM model achieved the best performance in classifying PCS outcomes (AUC = 0.632), while LightGBM achieved the most balanced classification for MCS (AUC = 0.571) in terms of sensitivity (0.834) and specificity (0.238). Key predictors of HRQoL included physical activity (MET), occupation, and tea consumption. SHAP analysis revealed that individuals with MET ≥ 8,000 min/week were 14.5% more likely to attain high PCS scores, while daily tea consumption reduced psychological distress risk by 19% in MCS.

Conclusion: ML models, particularly SVM and LightGBM, effectively predicted HRQoL in high-risk CVD populations, with MET, occupation, and lifestyle factors emerging as actionable intervention targets. SHAP interpretability strengthens clinical applicability, enabling personalised strategies for at-risk subgroups. These findings support the inclusion of ML-based HRQoL predictions in digital health frameworks for proactive, patient-tailored cardiovascular care.
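
The evaluation metrics named in the Methods (AUC, F1-score, accuracy, sensitivity, specificity) can be illustrated with a minimal, standard-library-only sketch. This is not the authors' code: the function names `confusion_counts`, `classification_metrics`, and `roc_auc` are hypothetical, labels are assumed to be binary with 1 as the positive class (for example, a high PCS score), and AUC is computed here by pairwise ranking rather than by whatever routine the study used.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn) for binary labels, with 1 as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute sensitivity, specificity, F1-score, and accuracy from hard predictions."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # true positive rate
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # true negative rate
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    accuracy = (tp + tn) / len(y_true)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
        "accuracy": accuracy,
    }


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data for illustration only; these are not values from the study.
toy_true = [1, 1, 1, 0, 0, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.6, 0.5]
toy_pred = [1 if s >= 0.5 else 0 for s in toy_scores]  # 0.5 threshold is an assumption

toy_metrics = classification_metrics(toy_true, toy_pred)
toy_auc = roc_auc(toy_true, toy_scores)
```

On the toy data, three of the four positives are recovered (sensitivity 0.75) and two of the four negatives are correctly rejected (specificity 0.5). Reporting sensitivity and specificity side by side, as the abstract does, makes visible how a classifier trades off the two classes at a given threshold, which a single AUC value does not show.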

Version published to 10.21203/rs.3.rs-7413340/v1 on Research Square, Sep 30, 2025

Related articles

Development and validation of an interpretable machine learning model for predicting the risk of 8-year all-cause mortality in Cardiovascular-Kidney-Metabolic Syndrome among older adults: A multicenter and cohort study

This article has 5 authors:
1. Zhiren Zhu
2. Jie Zhang
3. Peirao Wu
4. Huiping Xue
5. Dongmei Gu
This article has no evaluations. Latest version: Jan 25, 2026

Machine Learning-Based Risk Prediction Model for Fatigue in Chronic Heart Failure Patients

This article has 9 authors:
1. Min Zhou
2. Jingran Yang
3. Yimei Zhang
4. Yu Wang
5. Ruijie Yanglan
6. Qinlan Li
7. Yangjuan Bai
8. Wei Wei
9. Fang Ma
This article has no evaluations. Latest version: Jan 27, 2026

Multidimensional Health Phenotyping and Metabolic Syndrome Prediction in Chinese Community-Dwelling Elderly: An Integrated Data-Driven Approach

This article has 9 authors:
1. Qin Liu
2. Bin Huang
3. Huan Du
4. Xiaojun Fei
5. Wenting Zhao
6. Liping Yu
7. Hongchao Lou
8. Airong Wang
9. Ying Xiao
This article has no evaluations. Latest version: Dec 22, 2025
