Refined selection of individuals for preventive cardiovascular disease treatment with a Transformer-based risk model

Abstract

Objective

To develop and validate the Transformer-based Risk assessment survival model (TRisk), a novel deep learning model, for prediction of 10-year risk of cardiovascular disease (CVD) in both the general population and individuals with diabetes.

Design

Prospective open cohort study design.

Setting

Primary and secondary care in England, as provided by Clinical Practice Research Datalink (CPRD) Gold.

Participants

An open cohort of 3 million adults aged 25 to 84 years was identified using linked primary and secondary care electronic health records from general practices in England. Records from 291 practices were used for model development and records from 98 practices for validation (the general population cohort). Additionally, a second cohort of patients with diabetes was extracted. At study entry, patients in both cohorts were free of CVD and not prescribed statins.

Methods

TRisk utilised all diagnosis, medication, procedure, and clinical test data recorded up to study entry in linked longitudinal primary and secondary care electronic health records to predict 10-year risk of CVD. Discrimination, calibration, and decision curve analyses were conducted to investigate predictive performance. The proposed model was also compared against QRISK3 and a deep learning model derived from QRISK3 (DeepSurv). Additional analyses compared discriminatory performance in other age groups, by sex, and across categories of socioeconomic status.
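
For readers unfamiliar with the discrimination metric, the following is a minimal sketch of Harrell's concordance index for right-censored data. It is an illustration only: the abstract does not state which C-index variant the authors used, the function name `harrell_c_index` is hypothetical, and the toy inputs are invented rather than taken from the study.

```python
def harrell_c_index(times, events, risk):
    """Harrell's C: share of comparable pairs ranked correctly by predicted risk.

    times  -- follow-up time per patient
    events -- 1 if the outcome was observed, 0 if censored
    risk   -- predicted risk score (higher means earlier expected event)

    A pair (i, j) is comparable when patient i had an observed event and
    a strictly shorter follow-up time than patient j.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0   # higher risk, earlier event: correct
                elif risk[i] == risk[j]:
                    concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Invented toy data (not from the study): four patients, one censored.
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]
toy_risk = [0.9, 0.6, 0.7, 0.1]
toy_c = harrell_c_index(toy_times, toy_events, toy_risk)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking. The quadratic loop is adequate for illustration but would be far too slow for a cohort of millions.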

Main outcome measures

Incident cardiovascular disease recorded in either linked general practice or hospital admission datasets provided by CPRD Gold.

Results

TRisk demonstrated superior discrimination (C-index in the general population: 0.910; 95% confidence interval [CI]: 0.906 to 0.913). TRisk’s performance was less sensitive to population age range than that of the benchmark models, and TRisk also outperformed the other models in analyses stratified by age, sex, or socioeconomic status. All models were well calibrated overall. In decision curve analyses, TRisk demonstrated greater net benefit than the benchmark models across the range of relevant thresholds. At both the recommended 10% risk threshold and the 15% risk threshold, TRisk reduced both the total number of patients classified as high risk (by 22% and 35%, respectively) and the number of false negatives compared with currently recommended strategies. TRisk similarly outperformed other models in patients with diabetes. Compared with the widely recommended treat-all policy for patients with diabetes, TRisk at a 10% risk threshold would lead to deselection of 24% of individuals, with a small fraction of false negatives (0.2% of the cohort).
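
The net benefit quantity used in decision curve analysis can be sketched as follows. This is a simplified illustration that assumes fully observed binary outcomes at 10 years and ignores censoring, which the study's own analysis may handle differently. The function names and the toy data are hypothetical and are not taken from the study.

```python
def net_benefit(outcomes, risks, threshold):
    """Net benefit of treating every patient whose predicted risk >= threshold.

    Standard decision-curve formula:
        NB = TP/N - (FP/N) * threshold / (1 - threshold)
    outcomes -- 1 if the patient developed the outcome, else 0
    risks    -- predicted probabilities in [0, 1]
    """
    n = len(outcomes)
    tp = sum(1 for y, r in zip(outcomes, risks) if r >= threshold and y == 1)
    fp = sum(1 for y, r in zip(outcomes, risks) if r >= threshold and y == 0)
    odds = threshold / (1.0 - threshold)  # harm-to-benefit weight at this threshold
    return tp / n - (fp / n) * odds


def net_benefit_treat_all(outcomes, threshold):
    """Net benefit of the treat-all policy at the same threshold."""
    n = len(outcomes)
    prevalence = sum(outcomes) / n
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Invented toy data (not from the study): ten patients, three events.
toy_outcomes = [1, 0, 0, 1, 0, 0, 0, 0, 1, 0]
toy_risks = [0.30, 0.05, 0.12, 0.20, 0.02, 0.08, 0.15, 0.01, 0.09, 0.03]

nb_model = net_benefit(toy_outcomes, toy_risks, threshold=0.10)
nb_all = net_benefit_treat_all(toy_outcomes, threshold=0.10)
```

A decision curve plots these values over a range of thresholds. A model is preferred at a given threshold when its net benefit exceeds that of both the treat-all and treat-none (net benefit of zero) policies.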

Conclusion

TRisk enabled a more targeted selection of individuals at risk of CVD compared to benchmark statistical and deep learning models, in both the general population and patients with diabetes. Incorporation of TRisk into routine clinical care would allow a reduction in the number of treatment-eligible patients by approximately one-third while preventing at least as many events as with currently adopted approaches.
