A Self-Explainable Dynamic Risk Monitoring Framework for Predicting Alzheimer’s Disease and Related Dementias

Abstract

Alzheimer’s Disease and Related Dementias (ADRD) affect millions worldwide and can begin over a decade before symptoms appear. ADRD are generally irreversible once clinical symptoms appear, making early prediction and intervention critical. While neuroimaging improves prediction, its limited availability restricts use at the population level. Electronic Health Record (EHR) data offer a scalable alternative, but existing models often overlook three key challenges: irregular clinical encounters, severe data sparsity, and the need for interpretability. To address these gaps, we propose GRU-D-RETAIN, a temporal deep learning architecture that combines GRU-D’s parameterized imputation of missing values with RETAIN’s explainable attention mechanism, enabling real-time risk monitoring at arbitrary clinical encounters with meaningful interpretations.
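The parameterized imputation mentioned above can be illustrated with the input-decay rule from the original GRU-D formulation: a missing value is filled with a mix of the last observed value and the feature mean, and the mix shifts toward the mean as the time since the last observation grows. The sketch below is a minimal, standard-library illustration of that rule only, not the authors' implementation. The function name `grud_impute` and the decay parameters `w` and `b` are hypothetical; in a trained model the decay parameters would be learned rather than supplied by hand.

```python
import math

def grud_impute(x, mask, delta, x_mean, w, b):
    """Fill missing inputs with a GRU-D style decay toward the feature mean.

    x      : list of timesteps, each a list of feature values
             (entries where mask == 0 are ignored)
    mask   : same shape as x, 1 if the value was observed, 0 if missing
    delta  : same shape as x, time elapsed since the feature was last observed
    x_mean : per-feature empirical mean
    w, b   : per-feature decay parameters (hypothetical values here;
             learned during training in an actual GRU-D model)
    """
    n_features = len(x_mean)
    x_last = list(x_mean)  # before any observation, fall back to the mean
    imputed = []
    for x_t, m_t, d_t in zip(x, mask, delta):
        row = []
        for k in range(n_features):
            if m_t[k] == 1:
                x_last[k] = x_t[k]
                row.append(x_t[k])
            else:
                # gamma lies in (0, 1]; longer gaps give smaller gamma
                gamma = math.exp(-max(0.0, w[k] * d_t[k] + b[k]))
                row.append(gamma * x_last[k] + (1.0 - gamma) * x_mean[k])
        imputed.append(row)
    return imputed
```

With a short gap the imputed value stays close to the last observation; with a long gap it approaches the feature mean, which is how this rule copes with irregular encounter spacing.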

Methods

We identified 15,172 ADRD cases (age ≥ 50) and 145,443 controls matched on gender and date of birth from 6 million patients in the University of Texas (UT) Physician EHR system. EHR data were retrieved for each individual up to 10 years before ADRD diagnosis, and a random follow-up initiation date was assigned to simulate real-world follow-up over a 10-year period. The compared models, including GRU-D-RETAIN, GRU-D, LSTM, Logit static, and Logit dynamic, were trained using 6-fold cross-validation and applied to the held-out fold to estimate performance.
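The two sampling steps described above, assigning a random follow-up initiation date and splitting patients into six cross-validation folds, might be sketched as follows. This is an illustration under stated assumptions, not the study's code: the abstract does not say how the random date was drawn, so a uniform draw over the 10-year window is assumed, and the names `random_followup_start`, `k_fold_splits`, and `index_date` are hypothetical.

```python
import random
from datetime import date, timedelta

def random_followup_start(index_date, rng, window_years=10):
    """Pick a follow-up initiation date within the window before index_date.

    Assumption: a uniform draw over the window; the source does not specify
    the distribution that was actually used.
    """
    window_days = int(window_years * 365.25)
    offset = rng.randrange(window_days + 1)
    return index_date - timedelta(days=offset)

def k_fold_splits(patient_ids, k=6, seed=0):
    """Yield (train_ids, held_out_ids) pairs, splitting at the patient level."""
    ids = list(patient_ids)
    random.Random(seed).shuffle(ids)
    folds = [ids[i::k] for i in range(k)]
    for i in range(k):
        held_out = folds[i]
        train = [p for j, fold in enumerate(folds) if j != i for p in fold]
        yield train, held_out
```

Splitting by patient rather than by encounter keeps all records of one individual in the same fold, so no held-out patient contributes encounters to training.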

Results

The scarcity of EHR records more than 10 years before ADRD diagnosis precludes the development of valid predictive models beyond this timeframe. At the 10-year mark, only diagnoses of hypertension and hyperlipidemia exceeded 1% among ADRD cases. After randomizing the follow-up initiation date, GRU-D-RETAIN exhibited performance closely matching that of GRU-D across the entire follow-up period, with both showing improved accuracy as follow-up time increased. Without a data availability cut-off, both models achieved an AUROC of 0.6 at 2-year follow-up and 0.7 at 8-year follow-up, significantly outperforming the competing models. Data availability plays a more critical role than follow-up length in determining prediction performance. For example, 1 year of follow-up with 15% data availability yields performance (AUROC of 0.75 and average precision of 0.5) comparable to that of 7.5 years of follow-up with 10% data availability. For individual ADRD cases, GRU-D-RETAIN offered broadly consistent explanations across training folds. However, certain folds produced different explanations at both the timestep and feature levels, despite yielding similar risk predictions.
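For readers less familiar with the two metrics reported above, the sketch below computes AUROC as the probability that a randomly chosen case scores higher than a randomly chosen control (ties count one half), and average precision as the mean of the precision values at the rank of each case. It is a plain illustration of the definitions, with hypothetical function names, and assumes distinct scores for the average precision calculation; it is not the evaluation code used in the study.

```python
def auroc(labels, scores):
    """Probability that a positive outranks a negative; ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def average_precision(labels, scores):
    """Mean precision at the rank of each positive (assumes distinct scores)."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precisions = []
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions)
```

Unlike AUROC, average precision depends on the proportion of cases in the evaluated sample, so the two metrics are best read together when cases are heavily outnumbered by controls.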

Conclusion

We demonstrate that EHR data can support dynamic ADRD risk monitoring up to 10 years before diagnosis, though model utility depends heavily on data completeness. GRU-D-RETAIN enables real-time risk monitoring with explainable attention weights at both the timestep and feature levels, helping clinicians interpret the output and identify high-risk patients as well as potential key risk factors at the individual level. This framework is broadly applicable to other conditions that involve irregular clinical encounters and require dynamic, interpretable risk assessment.
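To make the two levels of explanation concrete, the sketch below shows how RETAIN-style attention can be turned into per-encounter, per-feature contributions: a softmax over encounters gives timestep weights, a tanh gate gives feature weights, and their product with the output weight and the input value is that feature's share of the risk score. This is a simplified illustration, not the GRU-D-RETAIN model itself. It assumes an identity embedding, and the attention scores (`visit_scores`, `feature_gates`) are passed in as hypothetical inputs, whereas in RETAIN they are produced by recurrent networks.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def retain_contributions(visits, visit_scores, feature_gates, out_weights):
    """Per-visit, per-feature contributions to the risk logit.

    visits        : list of visits, each a list of feature values
    visit_scores  : one unnormalized attention score per visit (hypothetical)
    feature_gates : pre-activation feature gates, same shape as visits
                    (hypothetical)
    out_weights   : one output weight per feature
    Assumption: identity embedding, so contribution[j][k] is
    alpha[j] * out_weights[k] * tanh(gate[j][k]) * visits[j][k].
    """
    alpha = softmax(visit_scores)  # timestep-level attention
    contributions = []
    for j, x_j in enumerate(visits):
        beta_j = [math.tanh(g) for g in feature_gates[j]]  # feature-level
        contributions.append(
            [alpha[j] * out_weights[k] * beta_j[k] * x_j[k]
             for k in range(len(x_j))]
        )
    return alpha, contributions
```

Summing the contributions and adding a bias term recovers the risk logit in this simplified setting, which is why each entry can be read as the share of the score attributable to one feature at one encounter.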
