Predicting the timing of first sustained cognitive worsening in Alzheimer’s disease using real-world clinical data and machine learning

Shruthi Venkatesh
Sinian Zhang
Wen Zhu
Michele Morris
Rocco Mercurio
Sarah B Berman
Hansruedi Mathys
Abby L Olsen
C. Elizabeth Shaaban
Shyam Visweswaran
Oscar L Lopez
Tianxi Cai
Jue Hou
Zongqi Xia

Abstract

Background

Cognitive assessments are sparsely documented in electronic health records (EHRs), limiting scalable detection of cognitive worsening in real-world clinical settings.

Methods

We applied a deep neural network optimized for identifying clinical event timing from sparsely labeled gold-standard data (label-efficient incident phenotyping from longitudinal EHR, LATTE) to predict time-to-first sustained cognitive worsening in patients with Alzheimer’s disease (AD) from a large healthcare system (2011–2022), a subset of whom were linked to an AD Research Center registry. Sustained cognitive worsening was defined as cognitive decline persisting over ≥2 consecutive visits within 3 years. Separate LATTE models were trained with worsening labels from Clinical Dementia Rating (CDR), Mini-Mental State Examination (MMSE), and Montreal Cognitive Assessment (MoCA) scores; semi-supervised learning scaled predictions to larger imputation cohorts lacking sufficient longitudinal scores. We evaluated model performance using average time-specific area under the receiver operating characteristic curve (AUC), area between curves (ABC), and Brier scores. To demonstrate clinical utility, we examined whether predicted time-to-worsening differentiated clinically meaningful patient subgroups using competing-risk Cox proportional hazards models accounting for death.
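As an illustration of the outcome definition above, the following minimal sketch labels the first sustained worsening from one patient's longitudinal scores. It is not the authors' labeling code. The abstract does not state the decline threshold or the reference score, so `min_change`, the use of the first visit as baseline, and the reading of "within 3 years" as the gap between the onset visit and the confirming visit are all assumptions, and the function name is hypothetical.

```python
from datetime import date
from typing import List, Optional, Tuple

Visit = Tuple[date, float]  # (visit date, assessment score)


def first_sustained_worsening(
    visits: List[Visit],
    min_change: float,       # assumed threshold; not given in the abstract
    higher_is_worse: bool,   # True for CDR, False for MMSE and MoCA
    window_days: int = 3 * 365,
) -> Optional[date]:
    """Return the onset date of the first sustained worsening, or None.

    Assumed rule: a visit is 'worsened' when its score differs from the
    baseline (first visit) by at least min_change in the worsening
    direction. Worsening is 'sustained' when two consecutive visits are
    both worsened and lie no more than window_days apart.
    """
    ordered = sorted(visits)
    if len(ordered) < 3:
        return None  # need a baseline plus two follow-up visits
    baseline = ordered[0][1]
    sign = 1.0 if higher_is_worse else -1.0

    def worsened(score: float) -> bool:
        return sign * (score - baseline) >= min_change

    # Walk over consecutive follow-up pairs (visit i, visit i + 1).
    for (d1, s1), (d2, s2) in zip(ordered[1:], ordered[2:]):
        if worsened(s1) and worsened(s2) and (d2 - d1).days <= window_days:
            return d1
    return None
```

Requiring a confirming visit is what filters out fluctuating scores: a single low score followed by a rebound does not count as an event under this rule.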

Findings

The cohort comprised 27,614 AD patients (65% women, 91% non-Hispanic White, mean [SD] age at start of follow-up 78.76 [9.53] years). In gold-standard cohorts (n: CDR=632, MMSE=710, MoCA=752; remaining patients formed imputation cohorts), LATTE demonstrated robust predictive performance (average time-AUC: CDR 0.816, MMSE 0.694, MoCA 0.710; ABC: CDR 0.067, MMSE 0.293, MoCA 0.078; Brier score: CDR 0.252, MMSE 0.437, MoCA 0.295). APOE-ε4 carriers had shorter predicted time-to-worsening than non-carriers across all assessments in the imputation cohorts (HRs 1.241–1.376, all p < 0.025), and k-means-derived patient clusters showed differential time-to-worsening in the overall and imputation cohorts (HRs 0.777–0.908, all p < 0.001).
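To make two of the reported metrics concrete, here is a simplified sketch of a time-specific AUC, its average over a time grid, and a Brier score at a fixed time. It assumes every patient's event time is observed (no censoring), which real EHR data will not satisfy; the abstract does not specify the estimators used in the study, so the function names and this simplification are assumptions. ABC is omitted because the abstract does not define it.

```python
from typing import Sequence


def time_specific_auc(
    event_times: Sequence[float], risk_scores: Sequence[float], t: float
) -> float:
    """AUC at time t: cases worsened by t, controls are event-free at t.

    Assumes no censoring; ties in risk score count as half a win.
    """
    cases = [r for e, r in zip(event_times, risk_scores) if e <= t]
    controls = [r for e, r in zip(event_times, risk_scores) if e > t]
    if not cases or not controls:
        raise ValueError("need at least one case and one control at time t")
    wins = sum(
        1.0 if c > k else 0.5 if c == k else 0.0
        for c in cases
        for k in controls
    )
    return wins / (len(cases) * len(controls))


def average_time_auc(
    event_times: Sequence[float],
    risk_scores: Sequence[float],
    grid: Sequence[float],
) -> float:
    """Unweighted mean of the time-specific AUC over a grid of time points."""
    values = [time_specific_auc(event_times, risk_scores, t) for t in grid]
    return sum(values) / len(values)


def brier_score(
    event_times: Sequence[float], predicted_probs: Sequence[float], t: float
) -> float:
    """Mean squared error between predicted P(event by t) and the outcome."""
    errors = [
        (p - (1.0 if e <= t else 0.0)) ** 2
        for e, p in zip(event_times, predicted_probs)
    ]
    return sum(errors) / len(errors)
```

Higher AUC indicates better ranking of earlier versus later worsening, while a lower Brier score indicates better-calibrated probabilities. A censoring-aware evaluation would typically add inverse-probability-of-censoring weights to both quantities.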

Interpretation

LATTE enables scalable prediction of sustained cognitive worsening timing, differentiating clinically meaningful patient subgroups. This approach could improve AD clinical monitoring and decision-making in routine care and support targeted clinical trial enrichment.

RESEARCH IN CONTEXT

Evidence before this study

The growing burden of Alzheimer’s disease (AD) creates an urgent unmet need for pragmatic tools to monitor cognitive decline at the point of care and identify target patient populations for clinical trial recruitment. However, cognitive assessments are sparsely documented in electronic health records (EHRs), and fluctuating scores can obscure true worsening, whereas specialized fluid and neuroimaging biomarkers are rarely available outside research settings, limiting scalable real-world utility.

Added value of this study

We applied a deep neural network algorithm optimized for identifying clinical event timing from sparsely labeled longitudinal EHR data (LATTE) to predict time-to-first sustained cognitive worsening across three complementary cognitive assessments in AD patients from a large healthcare system: the Clinical Dementia Rating (CDR), Mini-Mental State Examination (MMSE), and Montreal Cognitive Assessment (MoCA). We defined sustained cognitive worsening as clinically meaningful decline (without improvement) persisting over ≥2 consecutive visits within 3 years. We trained LATTE on gold-standard cohorts with sparse outcome labels derived from longitudinal cognitive assessments, then leveraged a semi-supervised framework to scale predictions to larger imputation cohorts lacking gold-standard cognition outcome labels. The algorithm robustly predicted the timing of first sustained cognitive worsening, with CDR outperforming MMSE and MoCA. For orthogonal validation, predicted time-to-worsening differentiated clinically meaningful subgroups defined by APOE-ε4 carrier status and knowledge graph-guided patient clustering.

Implications of all the available evidence

Scalable prediction of cognitive worsening from sparsely labeled EHR data could identify patients at higher risk of sustained cognitive worsening. In routine clinical practice, this could inform timely disease-modifying therapy initiation and care planning. Targeted enrichment of clinical trial populations with higher-risk patients could increase statistical power and reduce sample size requirements.
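The sample-size argument can be sketched with Schoenfeld's standard approximation for a two-arm trial with 1:1 allocation and a time-to-event endpoint: the number of events needed depends only on the target hazard ratio, so a cohort with a higher event probability reaches that number with fewer enrolled patients. The hazard ratio and event probabilities a reader would plug in are hypothetical inputs, not values from this study.

```python
from math import ceil, log
from statistics import NormalDist


def required_events(
    hazard_ratio: float, alpha: float = 0.05, power: float = 0.8
) -> float:
    """Schoenfeld approximation: events needed for a two-sided log-rank test
    with equal allocation between arms."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(power)
    return 4 * (z_alpha + z_beta) ** 2 / log(hazard_ratio) ** 2


def required_sample_size(
    hazard_ratio: float,
    event_probability: float,  # assumed P(worsening during trial follow-up)
    alpha: float = 0.05,
    power: float = 0.8,
) -> int:
    """Total patients to enroll so that the expected events meet the target."""
    events = required_events(hazard_ratio, alpha, power)
    return ceil(events / event_probability)
```

For example, calling `required_sample_size` with the same hazard ratio but a larger `event_probability` returns a smaller enrollment target, which is the sense in which enrichment with higher-risk patients reduces sample size requirements.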

Version published to 10.64898/2026.06.02.26354764 on medRxiv
Jun 4, 2026

Predicting 24-Month MCI-to-Alzheimer’s Conversion Using Routine Clinical Assessments Without Neuroimaging or Genetic Testing

This article has 1 author:
1. Sophie Choe
This article has no evaluations. Latest version: Jun 24, 2026

Explainable Longitudinal Machine Learning for Dementia Progression Using Cognitive and MRI Biomarkers

This article has 5 authors:
1. Gifty Duah
2. Eric Nyarko
3. Justice Yaw Effah
4. Isaac Boateng Numoah
5. Anani Lotsi
This article has no evaluations. Latest version: Jul 14, 2026

Explainable Machine Learning Models for Alzheimer’s Diagnosis Using Routine and Low-Cost Clinical Data

This article has 3 authors:
1. Daniele De Carli
2. Alberto Sudati
3. Fabio Dercole
This article has no evaluations. Latest version: Jul 13, 2026
