Combinatorial epigenomic patterns define regulatory programs underlying disease heterogeneity
Abstract
Disease is a heterogeneous process that involves multiple organs and cell types. Understanding how genomic variation contributes to disease requires approaches that move beyond the linear assumptions of additive models and resolve the underlying disease pathways. While genome-wide association studies have catalogued hundreds of thousands of genomic variants linked to disease, our understanding of their cell-type-specific roles remains limited, restricting our ability to translate genetic findings into targeted interventions.
Here, we analyse consortium-scale epigenomic data spanning 833 biological samples across 8 epigenetic features to develop a generalisable machine learning framework that models the modular architecture of genome regulation. We define 720 epigenomic signatures, Epigenetically Co-Modulated Patterns (EpiCops), that capture co-regulated genomic regions with tissue- and cell-type-specific regulatory activity. Using EpiCops, we effectively segregate functional genomic loci from mixed biological contexts, including cell-type-specific enhancers and variants associated with complex traits and diseases. Applied to type 2 diabetes, EpiCops identify variant clusters associated with distinct biological pathways and organs, including clusters with opposing cardiovascular risk profiles driven by divergent organ-specific regulatory mechanisms. By integrating EpiCops with partitioned polygenic risk scores, we further validate the robustness of these variant clusters in independent cohort studies. Collectively, our study demonstrates EpiCops as a scalable framework for resolving the cell-type-specific regulatory architecture of complex disease and advancing mechanistic understanding of disease processes.