From Clinical Criteria to AI-Based Classification of Advanced Parkinson’s Disease: A Data-Driven Approach Using Structured PPMI Data


Abstract

Advanced Parkinson’s disease (APD) involves severe motor and non-motor complications and requires early identification, yet it lacks a standardized quantitative definition. This study translated the expert-defined CDEPA criteria into 216 structured variables from the Parkinson’s Progression Markers Initiative (PPMI) dataset to train machine learning models for early APD classification. A cohort of 1,302 patients was followed for 13 years. A label-rescuing strategy addressed longitudinal incompleteness. Supervised models trained on baseline data predicted future APD status. Binary classifiers outperformed multiclass approaches; the best-performing model (XGBoost, Year 9) achieved an AUC of 0.881, a balanced accuracy of 0.824, and an F1 score of 0.819. Top predictors included genetic mutation status, age, MDS-UPDRS Parts I–II, REM sleep behavior disorder, and tremor severity. Non-motor symptoms, especially autonomic dysfunction and sleep disturbances, were more informative than motor signs, comprising 57.1% of top features. These findings support the feasibility of early APD classification and propose a scalable, data-driven framework for APD prediction.
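
For readers less familiar with the three metrics reported for the binary classifier (AUC, balanced accuracy, F1), the sketch below shows how each is computed from true labels and predicted scores. It is an illustration only, not the authors' pipeline: the labels, scores, and the 0.5 decision threshold are made-up assumptions, and the function names are hypothetical helpers written for this example.

```python
# Illustrative only: toy labels and scores are invented, not PPMI data.

def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels coded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity; robust to class imbalance."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return (sensitivity + specificity) / 2


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class."""
    tp, _, fp, fn = confusion_counts(y_true, y_pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outranks a random
    negative (ties count one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical example: 1 = APD at follow-up, 0 = not APD.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.2, 0.1]
THRESHOLD = 0.5  # assumed cut-off; the abstract does not state one
y_pred = [1 if s >= THRESHOLD else 0 for s in scores]

print("balanced accuracy:", round(balanced_accuracy(y_true, y_pred), 3))
print("F1 score:", round(f1_score(y_true, y_pred), 3))
print("AUC:", round(roc_auc(y_true, scores), 3))
```

Balanced accuracy and F1 depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is one reason the three are usually reported together when classes are imbalanced.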
