Characterization of sequence determinants of enhancer function using natural genetic variation

Curation statements for this article:
  • Curated by eLife

    Evaluation Summary:

    This manuscript describes a useful dataset for those interested in regulatory variation. The large scale of variants surveyed offers the potential to look for dependencies between nearby TF binding events at the same accessible site, and will likely be useful to those interested in dissecting sequence determinants of transcription-factor binding genome-wide.

    (This preprint has been reviewed by eLife. We include the public reviews from the reviewers here; the authors also receive private feedback with suggested changes to the manuscript. The reviewers remained anonymous to the authors.)

Abstract

Sequence variation in enhancers that control cell-type-specific gene transcription contributes significantly to phenotypic variation within human populations. However, it remains difficult to predict precisely the effect of any given sequence variant on enhancer function due to the complexity of DNA sequence motifs that determine transcription factor (TF) binding to enhancers in their native genomic context. Using F1-hybrid cells derived from crosses between distantly related inbred strains of mice, we identified thousands of enhancers with allele-specific TF binding and/or activity. We find that genetic variants located within the central region of enhancers are most likely to alter TF binding and enhancer activity. We observe that the AP-1 family of TFs (Fos/Jun) is frequently required for binding of TEAD TFs and for enhancer function. However, many sequence variants outside of core motifs for AP-1 and TEAD also impact enhancer function, including sequences flanking core TF motifs and AP-1 half sites. Taken together, these data represent one of the most comprehensive assessments of allele-specific TF binding and enhancer function to date and reveal how sequence changes at enhancers alter their function across evolutionary timescales.

Article activity feed

  1. Author Response:

    Reviewer #1 (Public Review):

    Here, the authors used multiple F1 crosses and the resulting embryonic fibroblasts to perform molecular profiling with ATAC-seq and a combination of ChIP-seq, Hi-ChIP, and CUT&RUN on multiple modified histones and transcription factor proteins. The resulting data are a good resource for quantifying allelic bias in protein-DNA binding and chromatin accessibility.

    The authors claim there's "enrichment of SNPs/indels within a 150 bp window" in enhancers (Fig. 2H), but this enrichment looks quite middling. Can they quantify the level of enrichment and is it significant?

    We have added a quantification of the enrichment of SNPs in the allele-specific enhancers compared to shared enhancers (Lines 1382-1385). The average number of SNPs within the central 150 bp of enhancers is:

      • 4.468 for enhancers with allele-specific H3K27ac levels.
      • 3.203 for enhancers with shared H3K27ac levels.

    For these shared enhancers, we subsampled the shared sites to generate a set with an identical distribution of H3K27ac levels to that observed on the active allele of the allele-specific set. This helps to control for potential differences in mappability of each allele, given that the allele-specific set has more SNPs, on average, and SNPs are necessary to identify allele-specific reads (discussed in Lines 1261-1264).
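
    The distribution-matched subsampling described above can be illustrated with a minimal sketch. This is not the authors' code: the function name `matched_subsample`, the equal-width binning, and the number of bins are all assumptions made for illustration. The idea is to bin the signal levels of the allele-specific set, then draw the same number of shared sites from each bin.

```python
import random
from collections import defaultdict


def matched_subsample(target_levels, pool, n_bins=10, seed=0):
    """Draw sites from `pool` so their signal levels mirror `target_levels`.

    target_levels: signal values for the reference set (hypothetically,
                   H3K27ac on the active allele of allele-specific sites).
    pool:          list of (site_id, level) tuples for candidate shared sites.
    Binning scheme and n_bins are illustrative assumptions.
    """
    rng = random.Random(seed)
    lo, hi = min(target_levels), max(target_levels)
    width = (hi - lo) / n_bins or 1.0  # avoid zero width if all values are equal

    def bin_of(x):
        # Equal-width bins; the maximum value falls into the last bin.
        return min(int((x - lo) / width), n_bins - 1)

    # Count how many reference sites fall in each bin.
    target_counts = defaultdict(int)
    for x in target_levels:
        target_counts[bin_of(x)] += 1

    # Group candidate sites by bin, ignoring those outside the reference range.
    pool_by_bin = defaultdict(list)
    for site_id, level in pool:
        if lo <= level <= hi:
            pool_by_bin[bin_of(level)].append((site_id, level))

    # Sample as many candidates per bin as the reference set has there;
    # a bin with too few candidates is truncated to what is available.
    picked = []
    for b, count in target_counts.items():
        candidates = pool_by_bin[b]
        picked.extend(rng.sample(candidates, min(count, len(candidates))))
    return picked
```

    When some bins hold fewer shared sites than the reference set requires, the match is only approximate; an exact match would need the reference counts to be scaled down uniformly.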

    This enrichment is also clearly significant (p-value < 2.2 × 10^-16, Pearson’s Chi-squared test). We have added this information to the corresponding figure legend in the revised manuscript (Lines 1381-1382).
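
    For readers unfamiliar with the test named above, the following is a minimal sketch of Pearson's chi-squared test on a 2 × 2 table, without continuity correction. The response does not state the contingency table that was tested, so the table layout and any counts passed to `chi2_2x2` (a hypothetical helper) are assumptions for illustration only.

```python
import math


def chi2_2x2(a, b, c, d):
    """Pearson chi-squared test (no continuity correction) for [[a, b], [c, d]].

    Returns (statistic, p_value). With 1 degree of freedom the upper-tail
    probability of the chi-squared distribution equals erfc(sqrt(stat / 2)).
    """
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    observed = ((a, b), (c, d))

    stat = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed[i][j] - expected) ** 2 / expected

    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Hypothetical layout: rows = allele-specific / shared enhancers,
# columns = variant present / absent in the central window.
stat, p = chi2_2x2(20, 10, 10, 20)
```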

  2. Reviewer #2 (Public Review):

    Yang et al. apply genome-wide profiling of gene expression, chromatin accessibility, CTCF and FOS occupancy by ChIP-seq, and H3K27ac, H3K4me1, H3K4me2, and H3K4me3 to a series of F1 hybrid mice derived from 9 divergent strains/species. The authors find that loss of AP-1 binding coincides with loss of H3K4me1 along with H3K27ac. They argue that most observed changes in accessibility or occupancy derive from cis rather than trans effects. They also find that, while AP-1 binding does not rely on co-binding by TEAD, TEAD occupancy is frequently lost when a nearby AP-1 site is genetically perturbed. This is an interesting investigation of the dependencies between TFs binding nearby at the same accessible site, and will likely prove a useful resource for the field.