Plasma Proteomics and Diabetes Duration Predict Aneurysm Incidence and Rupture Using Machine Learning
Abstract
Background
Early identification of individuals at high risk for aneurysms, particularly ruptured aneurysms, is critical for timely intervention. However, existing imaging-free prediction models have significant limitations. This study aims to develop a robust model for predicting aneurysm incidence and rupture by leveraging multi-omics data, including circulating proteomics, and to identify specific biomarkers.
Methods
UK Biobank participants (n = 502,389; mean age: 58.0 years; 54.4% female) without a history of aneurysms were divided into a training set, comprising a derivation set (n = 473,630) and a validation set (n = 8,628), and a testing set (n = 20,131) for model evaluation. Cox proportional hazards (CPH) models were used to estimate the risk of aneurysm events, covering total, unruptured, and ruptured aneurysms for each of three types: aortic aneurysm (AA), abdominal aortic aneurysm (AAA), and intracranial aneurysm (IA). We developed base models that incorporated plasma proteomics (Proteins), metabolomics (Metabolites), polygenic risk scores (PRS), and clinical risk factors (RF) to predict these nine aneurysm-related outcomes, using LASSO regression with 10-fold cross-validation. Additionally, we investigated the relationship between diabetes duration and aneurysm events and developed a classification model, the Diabetes Duration Score (DDscore), to enhance model performance.
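As a rough illustration of what a LASSO-penalized Cox model optimizes, the sketch below evaluates an L1-penalized negative partial log-likelihood on toy data. It is not the authors' code: the function name `lasso_cox_objective`, the toy arrays, and the Breslow-style risk sets are assumptions made for this example, and in practice the penalty strength `lam` would be chosen by cross-validation with a dedicated survival library.

```python
import math

def lasso_cox_objective(beta, X, time, event, lam):
    """L1-penalized negative Cox partial log-likelihood, averaged over n.

    Hypothetical helper for illustration; uses Breslow-style risk sets.
    """
    n = len(time)
    # Linear predictor (log relative hazard) for each participant.
    eta = [sum(b * x for b, x in zip(beta, row)) for row in X]
    loglik = 0.0
    for i in range(n):
        if not event[i]:
            continue  # censored participants contribute only through risk sets
        # Risk set: everyone still under observation at participant i's event time.
        risk_sum = sum(math.exp(eta[j]) for j in range(n) if time[j] >= time[i])
        loglik += eta[i] - math.log(risk_sum)
    # The L1 penalty shrinks coefficients and sets uninformative ones to zero.
    penalty = lam * sum(abs(b) for b in beta)
    return -loglik / n + penalty

# Toy data (assumed): two features, four participants.
X = [[1.0, 0.2], [0.0, 0.1], [1.0, 0.4], [0.0, 0.3]]
time = [2.0, 5.0, 3.0, 8.0]   # follow-up time in years
event = [1, 0, 1, 1]          # 1 = aneurysm event observed, 0 = censored

print(lasso_cox_objective([0.0, 0.0], X, time, event, lam=0.1))
```

Minimizing this objective over `beta` for a grid of `lam` values, and keeping the value that performs best on held-out folds, is the general idea behind cross-validated LASSO feature selection.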
Results
During the 14.8-year follow-up, there were 4,292 AA events, 2,730 AAA events, and 3,644 IA events. The Proteins Model demonstrated superior or comparable discriminative performance for most AA and AAA endpoints, with C-indexes exceeding 0.9 for rupture events. However, no predictive advantage was observed for IA endpoints. Across time windows, the Proteins Model achieved the highest AUC for most endpoints within 5 years. Time-dependent analysis revealed an opposing relationship between diabetes duration and aneurysm risk: shorter diabetes duration was associated with higher risk, whereas longer duration was associated with lower risk. Adding DDscore significantly improved predictions for AA and AAA, particularly for ruptured AAA (C-index [95% CI]: Proteins + DDscore 0.93 [0.88-0.99] and Proteins + RF + PRS + Metabolites + DDscore 0.94 [0.91-1.00]). For clinical utility, the Proteins or Proteins + DDscore models provided greater net benefit at low decision thresholds (0%-2% for ruptured AA and 0%-1% for ruptured AAA). Additionally, 30 rupture-specific plasma proteins with high weight were identified for all types of aneurysms.
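For readers unfamiliar with the two metrics reported above, the following minimal sketch shows how a concordance index and a decision-curve net benefit can be computed. It is an illustration under stated assumptions, not the study's evaluation code: the function names and toy values are hypothetical, the C-index follows Harrell's pairwise definition, and the net-benefit function treats the outcome as binary at a fixed horizon, ignoring censoring.

```python
def harrell_c_index(risk, time, event):
    """Share of comparable pairs in which the higher-risk participant fails first."""
    concordant, comparable = 0.0, 0
    n = len(time)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when i has an observed event before j's time.
            if event[i] and time[i] < time[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5  # tied predictions count as half
    return concordant / comparable

def net_benefit(pred_prob, outcome, threshold):
    """Net benefit of intervening on everyone with predicted risk >= threshold."""
    n = len(outcome)
    tp = sum(1 for p, y in zip(pred_prob, outcome) if p >= threshold and y)
    fp = sum(1 for p, y in zip(pred_prob, outcome) if p >= threshold and not y)
    # False positives are weighted by the odds of the decision threshold.
    return tp / n - (fp / n) * threshold / (1.0 - threshold)

# Toy data (assumed values, not from the study).
risk = [0.9, 0.2, 0.7, 0.1]
time = [2.0, 5.0, 3.0, 8.0]
event = [1, 0, 1, 1]
outcome = [1, 0, 1, 0]  # binary outcome at a fixed horizon

print(harrell_c_index(risk, time, event))
print(net_benefit(risk, outcome, threshold=0.15))
```

A decision curve is obtained by evaluating `net_benefit` over a range of thresholds and comparing the model against the "treat all" and "treat none" strategies; low thresholds such as those quoted above correspond to settings where missing a rupture is considered far costlier than an unnecessary work-up.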
Conclusions
Plasma proteomics and diabetes duration demonstrated strong predictive performance for aneurysm events, particularly rupture of AA and AAA. The machine-learning model developed in this study achieved accurate predictions up to 10 years before diagnosis, with potential implications for high-risk screening and early intervention.