Biological versus Technical Reliability of Epigenetic Clocks and Implications for Disease Prognosis and Intervention Response
Abstract
DNA methylation–based aging biomarkers, or epigenetic clocks, are increasingly used to estimate biological age and predict health outcomes. Their translational utility, however, depends not only on predictive accuracy but also on reliability: the ability to give consistent results across technical replicates and repeated biological measures. Here, we leveraged the TranslAGE platform to comprehensively evaluate the technical and biological reliability of 18 epigenetic clocks, including chronological predictors, mortality predictors, pace-of-aging measures, reliable variants, and newer explainable clocks. Technical reliability was quantified across four independent datasets. For standard replicate assays on EPIC and 450K arrays, nearly all clocks achieved excellent technical reproducibility. However, some clocks showed dramatic drops in technical reliability when replicates differed in slide position or DNA extraction protocol. PC-based clocks, especially PCGrimAge and SystemsAge, remained technically reliable in all cases. In contrast, biological reliability, measured across repeated samples collected within hours, before and after meals, under acute stress, across environmental exposures, and over days, was markedly lower, with most clocks showing only moderate stability. PCGrimAge was the only clock with good biological reliability (intraclass correlation coefficient, ICC > 0.75). Importantly, technical reproducibility did not predict biological reliability; clocks that were technically robust often proved biologically unreliable. We further demonstrated that reliability directly constrains downstream applications. Clocks with higher ICCs produced more stable prognostic associations with cognitive decline and more consistent responsiveness to a vegan diet intervention, whereas unreliable clocks yielded highly variable or spurious effects.
Together, these findings show that technical reliability is not enough: biological reliability remains a critical limitation that constrains the utility of many DNA methylation clocks. Our work provides a roadmap for prioritizing the next-generation clocks best suited for clinical translation.
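The abstract reports reliability as an ICC but does not state which ICC form was used. As a minimal sketch only, the snippet below assumes a one-way random-effects ICC(1,1) computed on a subjects × replicates table of clock estimates. The function name `icc_oneway` and all numbers are hypothetical illustrations, not data or code from the study.

```python
from statistics import mean


def icc_oneway(scores):
    """One-way random-effects ICC(1,1); the ICC form is an assumption.

    scores: list of per-subject lists, each holding k repeated clock
    estimates (technical replicates or repeated biological samples).
    """
    n = len(scores)
    k = len(scores[0]) if n else 0
    if n < 2 or k < 2 or any(len(row) != k for row in scores):
        raise ValueError("need >= 2 subjects, each with the same k >= 2 measures")

    grand = mean(v for row in scores for v in row)
    row_means = [mean(row) for row in scores]

    # Between-subject mean square: spread of subject means around the grand mean.
    msb = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    # Within-subject mean square: spread of repeated measures around each subject mean.
    msw = sum(
        (v - m) ** 2 for row, m in zip(scores, row_means) for v in row
    ) / (n * (k - 1))

    return (msb - msw) / (msb + (k - 1) * msw)


# Hypothetical clock ages (years) for four subjects, two measures each.
technical = [[50.0, 50.2], [60.0, 59.8], [70.0, 70.2], [40.0, 39.8]]
biological = [[50.0, 56.0], [60.0, 53.0], [70.0, 66.0], [40.0, 47.0]]

print("technical ICC:", round(icc_oneway(technical), 3))
print("biological ICC:", round(icc_oneway(biological), 3))
```

In this toy example the same subjects give a lower ICC when the repeated measures vary more within a person, which is the pattern the abstract describes for biological versus technical repeats. With the threshold quoted above, a clock would count as having good reliability only when its ICC exceeds 0.75.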
