Deep Longitudinal Clusters of Type 2 Diabetes Pathophysiology and their Risk of Cardiovascular Disease Events and All-Cause Mortality

Abstract

Objective

Despite the complex and non-linear progression of diabetes, its shared pathways with atherosclerotic cardiovascular disease (ASCVD) are conventionally described using models based on single time points. We identified longitudinal diabetes clusters before diagnosis using deep learning and studied their association with ASCVD events and mortality.

Methods

We analyzed 157,670 visits from 15,871 adults (aged 25-65 years) without diabetes from four pooled U.S. cohorts (median follow-up: 22 years [IQR: 9-30]). A gated recurrent unit model with decay (GRU-D) was used to predict 1-year risk of diabetes or censoring within 10 years, by learning longitudinal embeddings across 25 clinical characteristics and biomarkers. Parallel Factor Analysis-2 (PARAFAC-2) and Gaussian mixture models (GMM) were used to group longitudinal participant representations into clusters. Landmark-time Cox proportional hazards regressions, with the landmark set at the last observation in the training window, were used to study covariate-adjusted associations of clusters with ASCVD and mortality. Prognostic utility of clusters beyond the PREVENT risk score was assessed using Harrell’s C-index. Findings were replicated in a fifth cohort.
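
As an illustration of the last evaluation step, Harrell's C-index can be computed by comparing every usable pair of participants and checking whether the one with the higher predicted risk had the earlier event. The following is a minimal sketch under stated assumptions, not the authors' code: the function name `harrell_c_index`, the toy follow-up data, and the decision to skip pairs with tied follow-up times are all hypothetical choices made for brevity.

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Fraction of comparable pairs in which the participant with the
    shorter observed time also has the higher predicted risk.

    times       : observed follow-up time per participant
    events      : 1 if the event occurred, 0 if censored
    risk_scores : model output, higher means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            # Simplifying assumption: pairs with tied times are skipped.
            continue
        # Let `a` be the participant with the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            # The shorter time is censored, so the ordering of the two
            # true event times is unknown and the pair is not comparable.
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical): four participants, the third is censored.
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]
toy_risk = [0.9, 0.2, 0.5, 0.1]
toy_c = harrell_c_index(toy_times, toy_events, toy_risk)
```

In the toy data, five pairs are comparable and four of them are ranked correctly, giving a C-index of 0.8. Comparing the C-index of a model with and without cluster membership, on the same participants, is one way to quantify added prognostic utility.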

Results

The analytic sample had a mean age of 49 years (SD: 11), was 58% female and 68% white; 1,202 participants (8%) developed diabetes within the first 10 years. We identified five clusters (Cluster A to E) that differed in their clinical characteristics over time. Cluster E (46%) had the highest cumulative incidence of diabetes in the study period, followed by Cluster C (40%) and Cluster A (38%). Cluster C, which was defined by older age, high blood pressure, and suboptimal renal function at the first visit, had higher rates of ASCVD (HR: 1.09, 95% CI: 0.98-1.21) and mortality (HR: 1.08, 95% CI: 1.00-1.16), relative to Cluster A despite being similar in age and BMI at the first visit. Relative to Cluster A, all other clusters had similar or lower rates of ASCVD and mortality. We observed substantial cohort effects for three clusters (Clusters C to E), which were based on only two cohorts. The two clusters (Clusters A and B) that included participants from all four cohorts were reproduced in the fifth cohort and showed similar rates of outcomes. Adding clusters did not improve ASCVD prognosis relative to a model that included only the PREVENT risk score.

Conclusions

Longitudinal clusters reveal substantial heterogeneity in the period before diabetes diagnosis and in the associated risk of ASCVD and mortality. However, the clusters discovered may, in part, be explained by cohort effects arising from variations in recruitment and in visit patterns after recruitment.

RESEARCH IN CONTEXT

Why did we undertake this study?

Pathophysiological heterogeneity before diabetes diagnosis is known to be dynamic over time. Previous studies that captured pre-diagnosis heterogeneity were limited to single time-point biomarkers.

What is the specific question we wanted to answer?

Can we identify longitudinal clusters in the period before diabetes diagnosis in a pooled cohort of U.S. adults (25-65 years) without diabetes using a deep learning-enabled workflow, and does cluster membership predict risk of future ASCVD events and mortality?

What did we find?

Individuals without diabetes belonged to five longitudinal clusters with heterogeneous biomarker trajectories. The clusters showed clinically meaningful differences in relative incidence of ASCVD events and mortality.

What are the implications of our findings?

Risk stratification based on longitudinal trajectories of biomarkers can inform precision prevention of diabetes, ASCVD events, and premature mortality.
