Characterization of sequence determinants of enhancer function using natural genetic variation
Abstract
Sequence variation in enhancers, a class of cis-regulatory elements that control cell type-specific gene transcription, contributes significantly to phenotypic variation within human populations. Enhancers are short DNA sequences (∼200 bp) composed of multiple binding sites (4-10 bp) for transcription factors (TFs). The transcriptional regulatory activity of an enhancer is encoded by the type, number, and distribution of TF binding sites that it contains. However, the sequence determinants of TF binding to enhancers and the relationship between TF binding and enhancer activity are complex, and thus it remains difficult to predict the effect of any given sequence variant on enhancer function. Here, we generate allele-specific maps of TF binding and enhancer activity in fibroblasts from a panel of F1 hybrid mice that have a high frequency of sequence variants. We identify thousands of enhancers that exhibit differences in TF binding and/or activity between alleles and use these data to define features of sequence variants that are most likely to impact enhancer function. Our data demonstrate a critical role for AP-1 TFs at many fibroblast enhancers, reveal a hierarchical relationship between AP-1 and TEAD TF binding at enhancers, and delineate the nature of sequence variants that contribute to AP-1 TF binding. These data represent one of the most comprehensive assessments to date of the impact of sequence variation on enhancer function in chromatin, with implications for identifying functional cis-regulatory variation in human populations.
Article activity feed
-
Author Response:
Reviewer #1 (Public Review):
Here, the authors used multiple F1 crosses and the resulting embryonic fibroblasts to perform molecular profiling with ATAC-seq and a combination of ChIP-seq, Hi-ChIP, and CUT&RUN on multiple modified histones and transcription factor proteins. The resulting data are a good resource for quantifying allelic bias in protein-DNA binding and chromatin accessibility.
The authors claim there's "enrichment of SNPs/indels within a 150 bp window" in enhancers (Fig. 2H), but this enrichment looks quite middling. Can they quantify the level of enrichment and is it significant?
We have added a quantification of the enrichment of SNPs in the allele-specific enhancers compared to shared enhancers (Lines 1382-1385). The average number of SNPs within the central 150 bp of enhancers is:
4.468 for enhancers with allele-specific H3K27ac levels.
3.203 for enhancers with shared H3K27ac levels.
For these shared enhancers, we subsampled the shared sites to generate a set with an identical distribution of H3K27ac levels to that observed on the active allele of the allele-specific set. This helps to control for potential differences in mappability of each allele, given that the allele-specific set has more SNPs on average and SNPs are necessary to identify allele-specific reads (discussed in Lines 1261-1264).
This enrichment is also clearly significant (p-value < 2.2 × 10⁻¹⁶, Pearson’s Chi-squared test). We have added this information to the corresponding figure legend in the revised manuscript (Lines 1381-1382).
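The matched subsampling described in this response could be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' actual procedure: the function name `matched_subsample`, the equal-width binning, the number of bins, and the toy values are all hypothetical, and binning only approximates an "identical" distribution.

```python
import random
from collections import defaultdict


def matched_subsample(target_levels, pool, n_bins=10, seed=0):
    """Draw sites from `pool` so that their signal levels follow the same
    binned distribution as `target_levels`.

    target_levels: list of floats (e.g. H3K27ac signal on the active allele
                   of allele-specific enhancers)
    pool: list of (site_id, level) tuples for candidate shared enhancers
    All names and the binning scheme are illustrative assumptions.
    """
    rng = random.Random(seed)
    lo, hi = min(target_levels), max(target_levels)
    width = (hi - lo) / n_bins or 1.0  # avoid a zero width if all levels are equal

    def bin_of(level):
        # Clamp so that level == hi falls in the last bin
        return min(int((level - lo) / width), n_bins - 1)

    # Count how many target sites fall in each bin
    wanted = defaultdict(int)
    for level in target_levels:
        wanted[bin_of(level)] += 1

    # Group pool sites by bin, ignoring those outside the target range
    by_bin = defaultdict(list)
    for site_id, level in pool:
        if lo <= level <= hi:
            by_bin[bin_of(level)].append((site_id, level))

    sample = []
    for b, count in wanted.items():
        candidates = by_bin[b]
        # Sample without replacement; take every candidate if the bin is too small
        k = min(count, len(candidates))
        sample.extend(rng.sample(candidates, k))
    return sample


# Toy data only: made-up signal levels, not values from the study
target = [1.0, 1.2, 2.5, 2.7, 4.9]
pool = [(f"site{i}", 0.5 * i) for i in range(20)]
picked = matched_subsample(target, pool, n_bins=4)
```

If a bin of the shared pool holds fewer sites than the allele-specific set requires, this sketch returns fewer sites for that bin rather than sampling with replacement; a real analysis would need to decide how to handle that case.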
-
Evaluation Summary:
This manuscript describes a useful dataset for those interested in regulatory variation. The large scale of variants surveyed offers the potential to look for dependencies between nearby TF binding events at the same accessible site, and will likely be useful to those interested in dissecting sequence determinants of transcription-factor binding genome-wide.
(This preprint has been reviewed by eLife. We include the public reviews from the reviewers here; the authors also receive private feedback with suggested changes to the manuscript. The reviewers remained anonymous to the authors.)
-
Reviewer #1 (Public Review):
Here, the authors used multiple F1 crosses and the resulting embryonic fibroblasts to perform molecular profiling with ATAC-seq and a combination of ChIP-seq, Hi-ChIP, and CUT&RUN on multiple modified histones and transcription factor proteins. The resulting data are a good resource for quantifying allelic bias in protein-DNA binding and chromatin accessibility.
The authors claim there's "enrichment of SNPs/indels within a 150 bp window" in enhancers (Fig. 2H), but this enrichment looks quite middling. Can they quantify the level of enrichment and is it significant?
-
Reviewer #2 (Public Review):
Yang et al. apply genome-wide profiling of gene expression, accessibility, and ChIP-seq for CTCF and FOS occupancy, and H3K27ac, H3K4me1, H3K4me2, and H3K4me3 to a series of F1 hybrid mice derived from 9 divergent strains/species. The authors find that loss of AP-1 binding coincides with loss of H3K4me1 along with H3K27ac. They argue that most observed changes in accessibility or occupancy derive from cis effects rather than trans. The authors identify that while AP-1 binding does not rely on co-binding by TEAD, TEAD occupancy frequently is lost when a nearby AP-1 site is genetically perturbed. This is an interesting investigation of the dependencies between TFs binding nearby at the same accessible site, and will likely prove a useful resource for the field.
-