Parkinson’s disease in real life healthcare organization database: Medication based algorithm, incidence and prodromal symptoms
Abstract
Background: Population-based research on Parkinson’s disease (PD) requires robust methods that identify PD diagnoses accurately and overcome diagnostic heterogeneity in routine care. Methods: We developed and validated a medication-based algorithm to define PD using electronic health records of ~6 million individuals, including actual pharmacy purchase data, from Clalit Health Services, Israel’s largest healthcare provider, covering the years 2005–2025. Results: After applying exclusion criteria, the algorithm identified 34,368 patients (13,090 alive), stratified into probable and possible PD, with validation demonstrating a ~95% true positive rate across diverse independent datasets. The mean age at index was 75.2 (SD 10.5) years (70.0 for those alive), and 56.5% were male. Age-stratified analyses showed that incidence rates remained constant over time in individuals aged 20–60 but declined in older groups, more steeply with increasing age. Over two decades, incidence decreased 2.3-fold in the [60–70) group, 2.63-fold in [70–80), 2.88-fold in [80–90), and 5.53-fold in [90–100), with consistent trends across sexes. Longitudinal analyses confirmed prodromal non-motor and motor features. Constipation was significantly more prevalent in future PD patients up to 10 years before diagnosis (12 years in males). Depressive episodes diverged 9 years prior to index, particularly after age 75. Tremor was more prevalent 16–18 years before diagnosis in both sexes. In contrast, codes reflecting tobacco use were less frequent among PD patients than controls, with differences extending back 18 years before diagnosis, especially in males. Conclusions: This protocol provides a reproducible framework for large-scale registry studies, enabling exploration of prodromal features, evaluation of risk factors, and application of machine learning to identify individuals at elevated PD risk.
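The fold decreases reported per age band can be read as the ratio of an earlier incidence rate to a later one within a half-open age interval such as [60–70), which includes age 60 and excludes age 70. The sketch below illustrates that arithmetic only. The function names, the per-100,000 person-years scaling, and the example counts are hypothetical and are not taken from the study, whose exact rate definition is not given in the abstract.

```python
def age_band(age: float, width: int = 10) -> str:
    """Label the half-open band containing `age` (lower bound included, upper excluded)."""
    lower = int(age // width) * width
    return f"[{lower}-{lower + width})"


def incidence_rate(new_cases: int, person_years: float, per: float = 100_000) -> float:
    """Incidence rate expressed per `per` person-years (assumed scaling)."""
    if person_years <= 0:
        raise ValueError("person_years must be positive")
    return new_cases / person_years * per


def fold_decrease(rate_start: float, rate_end: float) -> float:
    """Ratio of the earlier rate to the later rate; values above 1 mean a decline."""
    if rate_end <= 0:
        raise ValueError("rate_end must be positive")
    return rate_start / rate_end


if __name__ == "__main__":
    # Hypothetical counts for one age band at the start and end of follow-up.
    start = incidence_rate(new_cases=230, person_years=100_000)
    end = incidence_rate(new_cases=100, person_years=100_000)
    print(age_band(64.5), round(fold_decrease(start, end), 2))
```

With these made-up counts, the earlier rate is 2.3 times the later one, which is how a "2.3-fold decrease" would be expressed under this definition.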