Directional Gene-Level Concordance and Methodological Constraints in Blood Transcriptomic and DNA Methylation Studies of Parkinson’s Disease

Chiragh Dewan
Inayat Chauhan
Kanika Sharma
Sangeeta Sharma
Rupinderjeet Kaur

Abstract

Assessing reproducibility across different molecular profiling studies is a persistent methodological challenge (Zhang et al., 2009; Sweeney et al., 2017; Ioannidis, 2005). Differences in platform technology, cohort composition, analytical pipelines, and feature definitions often make it difficult to interpret cross-study comparisons based solely on gene-identity overlap.

In this study, we conducted a retrospective computational analysis of seven publicly available analytical datasets (including alternative analytical pipelines applied to the same cohort) derived from five biologically independent peripheral blood transcriptomic and DNA methylation cohorts, comprising 3,487 samples (1,824 Parkinson’s disease cases and 1,663 controls). Reproducibility was evaluated using gene-identity overlap, enrichment-based comparisons, and a permutation-based framework to assess directional consistency of effect estimates across datasets. We also tested the robustness of results by varying false discovery rate thresholds and applying alternative probe-to-gene collapsing strategies. All analyses were performed using reproducible workflows implemented in R and Python with fixed random seeds.
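As an illustration of the overlap-based comparison described above, the following is a minimal sketch of one common way to compute an enrichment ratio: the observed number of shared significant genes divided by the number expected if the two lists were drawn independently from the shared gene universe. The function name `overlap_enrichment`, its arguments, and the hypergeometric tail probability are assumptions made for this example and are not taken from the study's code.

```python
from math import comb


def overlap_enrichment(sig_a, sig_b, universe):
    """Compare two significant-gene lists within a shared gene universe.

    Hypothetical helper for illustration only. Returns a tuple of
    (observed overlap, expected overlap, enrichment ratio, upper-tail p).
    """
    universe = set(universe)
    if not universe:
        raise ValueError("the shared gene universe is empty")

    # Restrict both lists to genes measured in both datasets.
    a = set(sig_a) & universe
    b = set(sig_b) & universe
    n = len(universe)

    observed = len(a & b)
    # Expected overlap if the two lists were independent random draws.
    expected = len(a) * len(b) / n
    ratio = observed / expected if expected > 0 else float("nan")

    # Hypergeometric upper tail: P(overlap >= observed).
    upper = min(len(a), len(b))
    tail = sum(
        comb(len(a), k) * comb(n - len(a), len(b) - k)
        for k in range(observed, upper + 1)
    )
    p_value = tail / comb(n, len(b))
    return observed, expected, ratio, p_value


if __name__ == "__main__":
    genes = [f"g{i}" for i in range(100)]
    list_a = genes[0:10]   # toy significant genes, dataset A
    list_b = genes[5:15]   # toy significant genes, dataset B
    print(overlap_enrichment(list_a, list_b, genes))
```

A ratio close to one means the two lists share about as many genes as chance alone would produce. When either list is very short, the ratio is unstable, which is one reason overlap-based comparisons become hard to interpret for datasets with few significant features.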

Across independent cohorts, gene-identity overlap was generally limited, with enrichment ratios close to one, especially when datasets were generated using different platforms. In several datasets, limited numbers of statistically significant features further constrained overlap-based comparisons. In contrast, directional consistency was more stable. Directional consistency was high across independent cohort comparisons when restricted to overlapping statistically significant features, and it remained stable across statistical thresholds (90.0% at FDR < 0.05 and 82.8% at FDR < 0.10). When evaluated across the full shared gene universe without conditioning on statistical significance, directional consistency was substantially lower (approximately 30 to 32%), but permutation testing confirmed that it still significantly exceeded what would be expected by chance. A combined analysis including methodological replicates (n ≥ 3 datasets) showed 98.3% directional consistency; however, this estimate includes non-independent analytical pipelines applied to the same cohort and reflects analytical stability rather than independent biological replication. Rather than introducing a new statistical method, this study examines how commonly used reproducibility metrics behave under cross-study heterogeneity and identifies their practical limitations and appropriate use boundaries.
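The idea of directional consistency with a permutation-based null can be sketched as follows. This is an assumed, simplified formulation for two datasets: consistency is the fraction of shared genes whose effect estimates have the same nonzero sign, and the null is built by shuffling one dataset's effects across genes with a fixed seed. The names `directional_consistency` and `permutation_test` are hypothetical, and the study's exact definition (for example, how it handles more than two datasets) may differ.

```python
import random


def sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def directional_consistency(effects_a, effects_b):
    """Fraction of shared genes whose effects agree in (nonzero) sign."""
    shared = sorted(set(effects_a) & set(effects_b))
    if not shared:
        raise ValueError("no genes are shared between the two datasets")
    agree = sum(
        1
        for g in shared
        if sign(effects_a[g]) != 0 and sign(effects_a[g]) == sign(effects_b[g])
    )
    return agree / len(shared)


def permutation_test(effects_a, effects_b, n_perm=1000, seed=0):
    """Compare observed consistency with a gene-label permutation null.

    Returns (observed consistency, one-sided permutation p-value).
    """
    rng = random.Random(seed)  # fixed seed keeps the result reproducible
    shared = sorted(set(effects_a) & set(effects_b))
    observed = directional_consistency(effects_a, effects_b)

    b_values = [effects_b[g] for g in shared]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(b_values)  # break the gene-to-effect pairing in dataset B
        permuted = dict(zip(shared, b_values))
        if directional_consistency(effects_a, permuted) >= observed:
            hits += 1

    # Add-one correction so the estimated p-value is never exactly zero.
    p_value = (hits + 1) / (n_perm + 1)
    return observed, p_value


if __name__ == "__main__":
    a = {f"g{i}": (1.0 if i < 10 else -1.0) for i in range(20)}
    b = dict(a)
    b["g0"] = -0.5  # one gene flips direction in the second dataset
    print(permutation_test(a, b))
```

Because the null distribution depends on the balance of positive and negative effects in each dataset, a raw consistency value is best read alongside its permutation p-value rather than against a fixed 50% baseline.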

Version published to 10.64898/2026.05.17.725808 on bioRxiv
May 20, 2026
