CoxMDS: Multiple Data Splitting for High-dimensional Mediation Analysis with Survival Outcomes in Epigenome-wide Studies
Abstract
Causal mediation analysis investigates whether the effect of an exposure on an outcome operates through intermediate variables known as mediators. Although progress has been made in high-dimensional mediation analysis, current methods do not reliably control the false discovery rate (FDR) in finite samples, especially when mediators are moderately to highly correlated or follow non-Gaussian distributions. Both conditions frequently arise in DNA methylation studies. We introduce CoxMDS, a multiple data splitting method that uses Cox proportional hazards models to identify putative causal mediators for survival outcomes. CoxMDS ensures finite-sample FDR control even in the presence of correlated or non-Gaussian mediators. In simulations, CoxMDS maintains FDR control and achieves higher statistical power than existing approaches. In applications to DNA methylation data with survival outcomes, CoxMDS identified eight CpG sites in The Cancer Genome Atlas (TCGA) that are consistent with the hypothesis that DNA methylation may mediate the effect of smoking on lung cancer survival. It also identified two CpG sites in the Alzheimer’s Disease Neuroimaging Initiative (ADNI) that are consistent with the hypothesis that DNA methylation may mediate the effect of smoking on time to Alzheimer’s disease conversion.