Predicted constrained accessible regions mark regulatory elements and causal variants
Abstract
Open chromatin regions (OCRs) map DNA elements genome-wide, at high resolution and in a cell-type-specific manner, but they do not always mark active regulatory regions, which hampers the pinpointing of causal functional variants in genetic association studies. Here, we developed a new scoring system, CAMBUS (Chromatin Accessibility Mutation Burden Score), to find active and constrained OCRs via machine-learning predictions of OCRs from surrounding DNA sequences. CAMBUS predicted 66,043 active and constrained OCRs in 29 immune cell types. These OCRs were highly constrained and enriched in known enhancers and super-enhancers. By exploiting the resolution of OCRs, we identified regulatory elements overlapping with known regulatory pathogenic variants of immune-mediated diseases and with experimentally proven or putative causal variants of complex traits, especially leukocyte-related traits, including rare variants. In summary, our approach can uncover hidden constrained and functional OCRs in a cell-type-dependent manner, thus informing the search for non-coding causal genetic factors in human diseases at high resolution.