Prediction, prognosis and monitoring of neurodegeneration at biobank-scale via machine learning and imaging

Anant Dadu
Michael Ta
Nicholas J Tustison
Ali Daneshmand
Ken Marek
Andrew B Singleton
Roy H Campbell
Mike A Nalls
Hirotaka Iwaki
Brian Avants
Faraz Faghri

Abstract

Background

Alzheimer’s disease and related dementias (ADRD) and Parkinson’s disease (PD) are the most common neurodegenerative conditions. These central nervous system disorders affect both the structure and function of the brain and may lead to imaging changes that precede symptoms. Patients with ADRD or PD have long asymptomatic phases that exhibit significant heterogeneity. Hence, quantitative measures that can serve as early disease indicators are needed to improve patient stratification, clinical care, and clinical trial design. This work uses machine learning techniques to derive such a quantitative marker from T1-weighted (T1w) brain magnetic resonance imaging (MRI).

Methods

In this retrospective study, we developed machine learning (ML)-based, disease-specific scores from T1w brain MRI using the Parkinson’s Progression Markers Initiative (PPMI) and Alzheimer’s Disease Neuroimaging Initiative (ADNI) cohorts. We evaluated the potential of these ML-based scores for early diagnosis, prognosis, and monitoring of ADRD and PD in an independent, large-scale, population-based longitudinal cohort, the UK Biobank.

Findings

We trained the machine learning models on 1,826 dementia images from 731 participants and 3,161 healthy control images from 925 participants in the ADNI cohort, and on 684 PD images from 319 participants and 232 healthy control images from 145 participants in the PPMI cohort. Using 790 extracted structural brain features, classification performance was an area under the ROC curve (AUC) of 0.94 [95% CI: 0.93-0.96] for ADRD detection and 0.63 [95% CI: 0.57-0.71] for PD detection. The most predictive regions include the hippocampus and temporal brain regions in ADRD and the substantia nigra in PD. The normalized probabilistic output of the ML models (the ADRD and PD imaging scores) was evaluated on 42,835 participants with imaging data from the UK Biobank. T1w brain MRI from the pre-diagnostic phase was available for 66 ADRD cases and 40 PD cases. For diagnoses occurring within 5 years, the integrated survival model achieves a time-dependent AUC of 0.86 [95% CI: 0.80-0.92] for dementia and 0.89 [95% CI: 0.85-0.94] for PD. In our integrated model, the ADRD imaging score is strongly associated with dementia-free survival (hazard ratio (HR) 1.76 [95% CI: 1.50-2.05] per S.D. of imaging score), and the PD imaging score is associated with PD-free survival (HR 2.33 [95% CI: 1.55-3.50]). HR and prevalence increased stepwise over imaging score quartiles for PD, indicating heterogeneity. Using polygenic risk scores as a proxy for diagnosis, we tested the AD and PD polygenic risk scores of 42,835 subjects against the imaging scores and found a highly significant association after adjusting for covariates. In both the PPMI and ADNI cohorts, the scores are associated with clinical assessments, including the Mini-Mental State Examination (MMSE) and the Alzheimer’s Disease Assessment Scale-cognitive subscale (ADAS-Cog), and with pathological markers, including amyloid and tau. Finally, imaging scores are associated with polygenic risk scores for multiple diseases. Our results suggest that imaging scores could be used in the future to assess the genetic architecture of such disorders.
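To illustrate how an AUC with a 95% confidence interval of the kind reported above can be computed, here is a minimal Python sketch. It uses the rank-sum identity for the AUC and a percentile bootstrap for the interval. The abstract does not state how the intervals were obtained, so the bootstrap, the function names, and the toy data are all assumptions made for illustration. Because the training cohorts contain several images per participant, a real analysis would likely resample participants rather than individual images.

```python
import random


def auc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney) identity."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one control")
    # Fraction of case/control pairs in which the case scores higher;
    # ties count as half a win.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Point estimate plus percentile-bootstrap CI (illustrative choice)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # resample contained only one class; draw again
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lo_idx = max(0, int((alpha / 2) * n_boot))
    hi_idx = min(n_boot - 1, max(0, int((1 - alpha / 2) * n_boot) - 1))
    return auc(labels, scores), stats[lo_idx], stats[hi_idx]


if __name__ == "__main__":
    # Toy data, not taken from the study.
    toy_labels = [0, 0, 0, 0, 1, 1, 1, 1]
    toy_scores = [0.10, 0.20, 0.35, 0.60, 0.40, 0.70, 0.80, 0.90]
    print(bootstrap_auc_ci(toy_labels, toy_scores))
```

The pairwise AUC is quadratic in sample size, which is fine for a sketch but would be replaced by a rank-based computation on cohorts of the size described here.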

Interpretation

Our study demonstrates the use of quantitative markers generated with machine learning techniques for ADRD and PD. We show that disease probability scores obtained from brain structural features are useful for early detection, prognosis prediction, and monitoring of disease progression. To facilitate community engagement and external tests of model utility, we provide an interactive app ( https://ndds-brainimaging-ml.streamlit.app ) for exploring summary-level data from this study and analyzing external data. To our knowledge, this is the first publicly available cloud-based MRI prediction application.

Funding

US National Institute on Aging, and US National Institutes of Health.

Research in context

Evidence before this study

We searched PubMed for articles published in English from database inception to May 11, 2023, on the use of machine learning on brain imaging data in Alzheimer’s disease (AD), dementia, and Parkinson’s disease (PD) populations. We used the search terms “machine learning” AND “brain imaging” AND “neurodegenerative disorders” AND “quantitative biomarkers”. The search identified 25 studies. Most of these studies focus on Alzheimer’s disease and use machine learning to predict conversion from mild cognitive impairment to dementia or to build a classification tool. Many also rely on positron emission tomography (PET) images rather than cost-effective T1w MRI images. None focused on detecting disease during the asymptomatic phase of dementia or PD. The identified studies are limited in sample size (on the order of hundreds of samples) and in the features extracted. Assessments of the clinical utility of the disease probabilities predicted by machine learning models are scarce. Notably, none of the studies attempted to validate the algorithm in an external cohort. We limited our review to scientific studies that are transparent and reproducible, including those that provide code and validate their findings on a reasonable sample size.

Added value of this study

This study developed machine learning-based quantitative scores to measure the risk, severity, and prognosis of Alzheimer’s disease and related dementias (ADRD) and Parkinson’s disease (PD) using brain imaging data. Neurodegenerative disorders affect multiple body functions and exhibit significant variation in etiology and clinical presentation. Patients with these conditions may experience prolonged asymptomatic periods. Disease-modifying therapies are most effective during the early asymptomatic stage of the disease, which makes early intervention crucial. However, the lack of biomarkers for early diagnosis and for monitoring disease progression remains a significant obstacle to this goal. We leveraged the disease-specific cohorts ADNI (1,826 images from 731 dementia participants) and PPMI (684 images from 329 PD participants) to develop machine learning classifiers for AD and PD detection using T1w brain imaging data. We obtain disease-specific imaging scores from these trained models as the normalized disease probability score. In a sizable external biobank, the UK Biobank (42,835 participants), these scores show strong predictive power for the occurrence of PD or dementia during a 5-year follow-up. The occurrence of PD increased stepwise over ascending imaging score quantiles, reflecting heterogeneity within the PD population. Imaging scores are also associated with pathological and clinical assessment measures. Our study indicates that the imaging score could serve as a single numeric indicator of disease-specific abnormality in T1w brain imaging. The association of imaging scores with the polygenic risk scores of related disorders suggests a genetic basis for these scores. We also identified the top brain regions associated with dementia and Parkinson’s disease using feature interpretation tools.
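The normalization and quantile stratification described above can be sketched in a few lines of Python. The text does not specify how the probability is normalized; this sketch assumes a z-score (consistent with hazard ratios reported per S.D. of imaging score), and the function names, the quartile cut points from `statistics.quantiles`, and the toy data are likewise assumptions rather than the study's pipeline.

```python
import statistics


def normalize_scores(probs, ref_probs=None):
    """Z-score model probabilities against a reference sample.

    Assumption: the reference defaults to the sample itself; a study
    could instead standardize against a control population.
    """
    ref = probs if ref_probs is None else ref_probs
    mu = statistics.fmean(ref)
    sd = statistics.stdev(ref)
    return [(p - mu) / sd for p in probs]


def quartile_of(value, cuts):
    """Map a score to quartile 1..4 given the three cut points."""
    return 1 + sum(value > c for c in cuts)


def prevalence_by_quartile(scores, events):
    """Fraction of participants with an event (1) in each score quartile."""
    cuts = statistics.quantiles(scores, n=4)  # three quartile cut points
    counts = {q: [0, 0] for q in (1, 2, 3, 4)}  # quartile -> [events, total]
    for s, e in zip(scores, events):
        q = quartile_of(s, cuts)
        counts[q][0] += e
        counts[q][1] += 1
    return {q: (ev / n if n else float("nan")) for q, (ev, n) in counts.items()}


if __name__ == "__main__":
    # Toy probabilities and diagnosis indicators, not taken from the study.
    toy_probs = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
    toy_events = [0, 0, 0, 0, 0, 1, 1, 1]
    z = normalize_scores(toy_probs)
    print(prevalence_by_quartile(z, toy_events))
```

A stepwise rise in the per-quartile fractions is the pattern the text reports for PD; a survival model would additionally account for follow-up time and covariates.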

Implications of all the available evidence

The findings should improve our ability to create practical passive surveillance plans for individuals at heightened risk of neurodegenerative disease. We have shown that imaging scores complement other risk factors, such as age and polygenic risk scores, for early detection. The integrated model could serve as a tool for early intervention and study enrollment. Understanding the genetic basis of imaging scores can provide valuable insights into the biology of neurodegenerative disorders. Additionally, these high-accuracy models, which can facilitate early detection at biobank scale, can inform precision medicine trial recruitment strategies as well as future paths of care. We have developed an interactive web server ( https://ndds-brainimaging-ml.streamlit.app ) that lets the community process their own data with our models and explore the utility and applicability of these findings for themselves. Users can upload a NIfTI or DICOM file containing their MRI image, and we handle the entire pre-processing and prediction process. All computations are performed on the Google Cloud Platform. In addition, we provide an interpretation of the ML prediction that highlights the areas of the brain contributing to the decision, and a what-if analysis tool with which users can explore different scenarios and their effect on the prediction.

Version published to 10.1101/2024.10.27.24316215 on medRxiv
Oct 28, 2024
