Evaluation of Machine Learning and Traditional Statistical Models to Assess the Value of Stroke Genetic Liability for Prediction of Risk of Stroke within the UK Biobank

Abstract

Background and objective: Stroke is one of the leading causes of mortality and long-term disability in adults over 18 years of age globally, and its increasing incidence has become a global public health concern. Accurate stroke prediction is highly valuable for early intervention and treatment. Previous studies have used statistical and machine learning techniques to develop stroke prediction models, but only a few have included genome-wide stroke genetic liability and evaluated its predictive value. This study aimed to assess the added predictive value of genetic liability in predicting the risk of stroke.
Materials and methods: The study included 243,339 UK Biobank participants of European ancestry. Stroke genetic liability was constructed from genetic variants previously identified as associated with stroke by the MEGASTROKE project through genome-wide association studies (GWAS). We built four predictive models in the training set, each with and without stroke genetic liability, to estimate time-to-event risk of stroke: Cox proportional hazards (Coxph), gradient boosting model (GBM), decision tree (DT), and random forest (RF). We then assessed their performance in the testing set.
Results: Each one-standard-deviation increase in genetic liability was associated with a 7% higher risk of incident stroke (HR = 1.07, 95% CI = 1.02, 1.12, P-value = 0.0030). Participants in the high genetic liability group had a 14% higher risk of stroke (HR = 1.14, 95% CI = 1.02, 1.27, P-value = 0.02) than those in the low genetic liability group. The Coxph model including genetic liability was the best-performing model for stroke prediction, achieving an AUC of 69.54% (95% CI = 67.40, 71.68). Compared with the Cox model without genetic liability, it achieved an NRI of 0.202 (95% CI = 0.12, 0.28; P-value < 0.001) and an IDI of 1.0×10⁻⁴ (95% CI = 0.000, 3.0×10⁻⁴; P-value = 0.13).
Conclusion: Incorporating genetic factors in the model may provide a slight incremental value for stroke prediction beyond conventional risk factors.
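
The NRI and IDI reported above compare the predicted risks of a model with genetic liability against those of a model without it. The abstract does not say which NRI variant was used or how censoring was handled, so the following is only a minimal sketch: it assumes the category-free (continuous) NRI, treats the outcome as a simple binary event indicator, and ignores censoring. The function names and the toy risk values are hypothetical and are not taken from the study.

```python
from statistics import mean


def continuous_nri(p_old, p_new, event):
    """Category-free NRI (assumed variant; ignores censoring).

    Sums the net share of events whose predicted risk rises under the
    new model and the net share of non-events whose predicted risk falls.
    """
    up_e = down_e = n_e = 0
    up_n = down_n = n_n = 0
    for old, new, y in zip(p_old, p_new, event):
        if y:
            n_e += 1
            up_e += new > old
            down_e += new < old
        else:
            n_n += 1
            up_n += new > old
            down_n += new < old
    return (up_e - down_e) / n_e + (down_n - up_n) / n_n


def idi(p_old, p_new, event):
    """IDI: change in the gap between the mean predicted risk of events
    and of non-events when moving from the old model to the new one."""
    def gap(p):
        ev = mean(r for r, y in zip(p, event) if y)
        ne = mean(r for r, y in zip(p, event) if not y)
        return ev - ne
    return gap(p_new) - gap(p_old)


# Hypothetical toy data: 3 events, 4 non-events.
event = [1, 1, 1, 0, 0, 0, 0]
risk_without_prs = [0.30, 0.20, 0.40, 0.10, 0.20, 0.30, 0.10]
risk_with_prs = [0.35, 0.25, 0.35, 0.05, 0.20, 0.25, 0.15]

nri_value = continuous_nri(risk_without_prs, risk_with_prs, event)
idi_value = idi(risk_without_prs, risk_with_prs, event)
print(f"NRI = {nri_value:.3f}, IDI = {idi_value:.4f}")
```

The NRI counts only the direction of each change in predicted risk, while the IDI averages the size of those changes. A model can therefore show a sizeable NRI alongside a very small IDI when it shifts many predictions in the right direction by small amounts.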
