Long-range promoter–enhancer contacts are conserved during evolution and contribute to gene expression robustness

Curation statements for this article:
  • Curated by eLife

    Evaluation Summary:

    The paper addresses a subject of potential great interest regarding the evolution of gene regulation and of enhancer landscapes. Available chromatin looping and gene expression data in mouse and human are analyzed to compare diverse properties of genome-wide promoter-centered maps, including associations with gene expression. It is shown that there is conservation of regulatory landscape across the two species, and that the extent of conservation in the TSS-distal landscape is associated with gene expression similarities. These overall results are in agreement with a large body of work in the field.

    (This preprint has been reviewed by eLife. We include the public reviews from the reviewers here; the authors also receive private feedback with suggested changes to the manuscript. The reviewers remained anonymous to the authors.)

Abstract

Gene expression is regulated through complex molecular interactions, involving cis-acting elements that can be situated far away from their target genes. Data on long-range contacts between promoters and regulatory elements are rapidly accumulating. However, it remains unclear how these regulatory relationships evolve and how they contribute to the establishment of robust gene expression profiles. Here, we address these questions by comparing genome-wide maps of promoter-centered chromatin contacts in mouse and human. We show that there is significant evolutionary conservation of cis-regulatory landscapes, indicating that selective pressures act to preserve not only regulatory element sequences but also their chromatin contacts with target genes. The extent of evolutionary conservation is remarkable for long-range promoter–enhancer contacts, illustrating how the structure of regulatory landscapes constrains large-scale genome evolution. We show that the evolution of cis-regulatory landscapes, measured in terms of distal element sequences, synteny, or contacts with target genes, is significantly associated with gene expression evolution.

Article activity feed

  2. Reviewer #1 (Public Review):

    Here, the authors analyze publicly available chromatin looping and gene expression data in mouse and human to measure diverse properties of genome-wide promoter-centered maps, including associations with gene expression. After uniformly processing all data, they produced simulated chromatin looping maps. Using these results, they show that there is conservation of the regulatory landscape across the two species, and that the extent of conservation in the TSS-distal landscape is associated with gene expression evolution. The results support general concepts in this area of research.

  3. Reviewer #2 (Public Review):

    This study uses available genomics data, especially promoter-centered capture Hi-C data, to analyze the relationship between distal enhancers and cis-regulatory elements and gene expression across species and across cell types. Understanding how enhancers regulate gene expression, and how these patterns are conserved (or not) across evolution and between cell types and species, is a very important topic of intense interest. Given the abundance of genomics data now available, it is also great to see a study that makes use of them. I am not an expert on evolutionary comparisons or bioinformatics, and therefore cannot evaluate all of the technical details of, e.g., their promoter capture Hi-C simulations, so I took the authors at their word. I also found the paper to be clearly written and the figures to be clear.

    My concern with the study is that, as best as I can tell, there are no new conclusions. Every conclusion is either too vague and general, or something that (almost) everyone working on enhancers already knows. They claim to have found slightly stronger correlations than some previous studies, which may well be true. But from my reading, this paper does not contain any original conclusions or insights that are not already widely accepted in the field of enhancer biology, and the study seems to completely neglect 3D genome structuring elements such as insulators and TADs. As such, this study is a nice confirmation of already known results, but not at the level of new conceptual insights I would expect to see in an original research paper.

  4. Reviewer #3 (Public Review):

    In this manuscript, the authors unify public datasets of capture Hi-C data within a common framework and use these data to examine the relationships between topological chromatin organization, enhancer function, gene expression, and the evolutionary conservation thereof. The introduction correctly states several pending and exciting questions in the field, and the authors performed a large body of work in multiple directions to address some of them. The results are well presented in clear and pleasant figures, even though the text of the manuscript sometimes lacks similar clarity (see some examples below). Overall, I feel that the manuscript can be substantially improved in several key areas.

    1. First, it critically lacks focus, accumulating analyses in many directions, most of which lead to either unsurprising or sometimes unconvincing conclusions (see examples below). This huge amount of results (6 figures and 36 supplementary figures!) hampers the reader's interest and dilutes the few novel and exciting results in a crowd of less significant observations. In their current form, the results remain too descriptive, with lots of scattered observations.

    2. Most experiments presented in the manuscript use a 'control' dataset, constructed by a sort of 'shuffling' of the actual data. While this sounds like a good idea in principle, I remained unable to grasp exactly how this procedure was performed, which unfortunately prevented me from fully appreciating the significance of the results.

    Broadly, I understand that the simulated dataset is made by attributing to each promoter the same number of enhancers as in the real data, picked among all enhancers in its vicinity with a probability depending only on their distance to the promoter of interest but irrespective of their 'real' HiC target. If this is correct, some results seem to raise unaddressed questions about its relevance and possible biases.
    - Lines 222-225: the authors note that restriction fragments are more conserved than control data in gene-rich, but not in gene-poor, regions. Couldn't this happen simply because, in gene-poor regions, the simulated data are in fact closer to the real data? If there are no other genes in the vicinity of the promoter of interest, there will be no fragments targeting other promoters, hence no shuffling of the enhancer-promoter links can occur.
    - In the simulated data, one expects that some fragments will contact 0 baits. Why are they not shown in figures 1, 1S1c,f, 1S2b, 1S3b?
    - Fig 1S1c,f show that in the simulated dataset, each fragment contacts fewer baits than in the actual dataset. Why do we see the opposite in Fig 1S2b and 1S3b?
    - While Fig1S1b,e show that each bait contacts the same number of fragments in the simulated and actual datasets, which is expected by construction, why do we see a marked difference in Fig1S2a and 1S3a? Even if there is a small difference due to a posteriori filtering of simulated data, it should go in the opposite direction of what is seen (it should lower the number of fragments per bait, not increase it).
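
    The simulation procedure as understood in point 2 (same number of contacts per promoter, partners drawn with a probability depending only on distance) can be sketched as follows. This is a minimal illustration of that reading, not the authors' implementation: the function name `simulate_contacts`, the `max_dist` window and the distance weighting passed as `weight_by_distance` are all hypothetical.

```python
import random

def simulate_contacts(bait_pos, n_contacts, fragment_positions,
                      weight_by_distance, max_dist=2_000_000, rng=None):
    """Draw up to n_contacts distinct fragments for one bait.

    Fragments are chosen using distance-based weights only, ignoring
    which fragments the bait contacts in the observed data.
    """
    rng = rng or random.Random()
    # Candidates: every fragment within max_dist of the bait (hypothetical window).
    candidates = [p for p in fragment_positions
                  if 0 < abs(p - bait_pos) <= max_dist]
    weights = [weight_by_distance(abs(p - bait_pos)) for p in candidates]
    chosen = []
    # Weighted sampling without replacement: draw one index, then remove it.
    while candidates and len(chosen) < n_contacts:
        i = rng.choices(range(len(candidates)), weights=weights, k=1)[0]
        chosen.append(candidates.pop(i))
        weights.pop(i)
    return sorted(chosen)

# Toy example: fragments every 10 kb, a bait at 500 kb with 5 observed contacts,
# and an assumed inverse-distance decay of contact probability.
fragments = list(range(0, 1_000_001, 10_000))
simulated = simulate_contacts(500_000, 5, fragments, lambda d: 1.0 / d,
                              max_dist=300_000, rng=random.Random(0))
print(simulated)
```

    Under this reading, the number of fragments per bait matches the observed data by construction, whereas the number of baits per fragment is free to differ, which is why the discrepancies raised in the bullets above call for an explanation.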

    3. I appreciate that the authors do not attempt to overestimate the importance of their results, but my impression is that almost none of the conclusions are really novel with respect to the existing literature. Roughly, figures 1 to 4 do not say much more beyond the fact that the dataset is enriched in enhancer-promoter interactions. This is not uninteresting, but not really a surprise in itself either, given that it represents topological contacts of promoters.
    Being enriched in enhancer-promoter interactions, it ensues that the dataset also tends to be more conserved, both sequence-wise (Figure 3) and synteny-wise (Figure 4).
    Not only is this expected, but the observed effect size seems very small, both for the enrichment itself (measured overlap with known enhancers in Fig. 2) and for the consequences on conservation. This is exemplified in lines 195-197 of the manuscript results section: "For the comparison between human and mouse, the median aligned length fraction of contacted fragments is 27% in PCHi-C data, which is significantly higher than the 23% observed in the simulated dataset". It seems to me that even a small enrichment could generate such small effects, with clear statistical significance but limited biological significance.
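
    The distinction between statistical and biological significance drawn in point 3 can be illustrated with a short calculation. The sketch below is a simplification under stated assumptions: it treats the quoted 27% versus 23% as a difference in means (the manuscript reports medians), uses a normal approximation, and assumes a placeholder standard deviation of 0.25 that is not taken from the manuscript.

```python
import math

def two_sample_z(delta, sd, n):
    """z-statistic for a mean difference `delta` between two groups of
    size n each, assuming a common standard deviation `sd`."""
    return delta / (sd * math.sqrt(2.0 / n))

def two_sided_p(z):
    """Two-sided p-value under a standard normal approximation."""
    return math.erfc(abs(z) / math.sqrt(2.0))

delta = 0.27 - 0.23   # difference quoted in the review
sd = 0.25             # placeholder spread, NOT from the manuscript
cohens_d = delta / sd  # standardized effect size, independent of n

for n in (100, 1_000, 10_000, 100_000):
    z = two_sample_z(delta, sd, n)
    print(f"n={n:>7}  d={cohens_d:.2f}  z={z:6.2f}  p={two_sided_p(z):.2e}")
```

    The standardized effect size stays fixed while the p-value shrinks as the number of fragments grows, so with genome-wide sample sizes a small shift can be highly significant without being large.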

    4. More exciting observations come only with Figures 5 and 6. They, however, still need more solid support.

    a) For example, data in Figures 5c and Fig5-Supp 4 and Fig5-Supp 5 would be a lot more interesting if restricted to interactions within synteny blocks, thus measuring solely interactions that are lost/kept between human and mouse independent of synteny conservation. This would be very interesting, as it has not been measured before. Would the conservation be dependent on the distance? This cannot be seen in the present data.

    b) The question of the link between the evolution of gene expression and that of enhancer landscapes, asked in Figure 6, is of major interest and has not been much explored so far. The result is, however, disappointing in that it only confirms the findings of a previous study (Berthelot et al., 2018), with a weaker signal than in the original study. The correlation between conservation of expression and number of chromatin contacts (Fig6c), which is supposed to be the key result, seems extremely modest, to say the least. The correlation with expression specificity or with expression levels is more convincing, but also of lesser interest.

    5. GO enrichment analyses of the conserved contacts are only briefly mentioned and relegated to supplementary data. The only conclusion of the manuscript is that it is "consistent with the presence of strong functional constraints on the cis-regulatory landscapes of developmental genes". This is already very well known. I am sure more can be drawn from these analyses, even though they should be carefully controlled for important confounding factors (e.g., gene density). For example, if the conservation of contacts were studied independently of synteny, would contacts of specific GO categories be more or less conserved than others? In other words, do rules of chromatin contact vary depending on gene function? This would be new.