A mathematical model clarifies the ABC Score formula used in enhancer-gene prediction

Curation statements for this article:
  • Curated by eLife

    eLife Assessment

    This important study dissects the mathematical and biological assumptions underlying the commonly used Activity-by-Contact model of enhancer action in transcriptional regulation. The authors provide a convincing mathematical analysis that links this (mostly phenomenological) model to concrete molecular mechanisms of enhancer function. This work provides a strong foundation from which to analyze a broad swath of genome-wide data such as that generated by CRISPRi screens.

Abstract

Enhancers are discrete DNA elements that regulate the expression of eukaryotic genes. They are important not only for their regulatory function, but also as loci that are frequently associated with disease traits. Despite their significance, our conceptual understanding of how enhancers work remains limited. CRISPR-interference methods have recently provided the means to systematically screen for enhancers in cell culture, from which a formula for predicting whether an enhancer regulates a gene, the Activity-by-Contact (ABC) Score, has emerged and has been widely adopted. While useful as a binary classifier, it is less effective at predicting the quantitative effect of an enhancer on gene expression. It is also unclear how the algebraic form of the ABC Score arises from the underlying molecular mechanisms and what assumptions are needed for it to hold. Here, we use the graph-theoretic linear framework, previously introduced to analyze gene regulation, to formulate the default model, a mathematical model of how multiple enhancers independently regulate a gene. We show that the algebraic form of the ABC Score arises from this model. However, the default model assumptions also imply that enhancers act additively on steady-state gene expression. This is known to be false for certain genes and we show how modifying the assumptions can accommodate this discrepancy. Overall, our approach lays a rigorous, biophysical foundation for future studies of enhancer-gene regulation.

Article activity feed

  2. Reviewer #1 (Public review):

    Summary:

    The authors aim to formalize the mathematical underpinnings of a proposed general model and discuss the relationship of this model to the ABC Score, a widely adopted heuristic for enhancer-gene predictions. While the ABC model serves as a useful binary classifier, it struggles to predict quantitative enhancer effects on gene expression. Using a graph-theoretic linear framework, the authors derive a mathematical model (the "default model") that explains how the algebraic form of the ABC Score arises under specific assumptions. They further demonstrate that the default model's predictions of enhancer additivity are inconsistent with observed non-additive enhancer effects and propose alternative assumptions to account for these discrepancies.

    Strengths:

    The graph-theoretic approach enables systematic exploration of enhancer interactions beyond simple additivity and enables hypothesis generation when such expectations fail. This work makes clear where assumptions are made and the consequences of those assumptions.

    Weaknesses:

    While the theoretical framework is elegant, I think there is always more space to demonstrate the practicality of this approach. Further guidance for how to experimentally connect this framework with typical measurements could help bolster the immediate benefits. To be clear, I do not think this is something the authors "must" do, but rather something that might help drive home the usefulness in a more accessible way.

  3. Reviewer #2 (Public review):

    Summary:

    The Activity-by-Contact (ABC) model is a relatively widespread model of enhancer-gene regulation. This model leverages CRISPRi data to predict whether a gene is regulated by a given enhancer. To make this possible, this model accounts for the activity of an enhancer and its contact frequency with a target promoter in order to produce an "ABC score". However, while quantitative in its ability to predict enhancer-promoter regulation, this model is mostly phenomenological and does not commit to specific molecular mechanisms.

    In this manuscript, the authors formalize the molecular and mathematical assumptions made by the ABC model. Specifically, they demonstrate a basic set of assumptions that can be made to arrive at the ABC model's mathematical structure. The resulting default model (basically, a null model) places particular emphasis on the requirement that gene activation and enhancer-gene communication must be independent and at a steady state. The authors leverage and extend a graph-based formalism they have previously spearheaded to show the generality of their conclusions with respect to different molecular realizations of the process by which enhancers interact with their promoters.

    Previously published works have found that specific models of how multiple enhancers communicate with the same gene can result in additive mRNA production rates. Here, the authors demonstrate that steady-state mRNA levels are additive regardless of the specific Markovian model for how any individual enhancer communicates with the gene, as long as the model follows the basic assumptions of their default model.
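The additivity statement above can be illustrated with a deliberately simple toy model. This is a sketch under stated assumptions, not the authors' default model: each enhancer is assumed to switch independently between an "off" and an "on" state, to add a fixed amount to the mRNA production rate while "on", and mRNA is assumed to be degraded at a constant first-order rate. The function name, the parameter names and all numerical values are hypothetical.

```python
def steady_state_mrna(enhancers, basal_rate, degradation_rate):
    """Mean steady-state mRNA level in a toy model of independent enhancers.

    Assumptions (illustrative only, not taken from the article):
      - each enhancer is a tuple (k_on, k_off, rate) and switches between
        an "off" and an "on" state independently of the others;
      - while "on", it adds `rate` to the mRNA production rate;
      - mRNA is degraded with first-order rate `degradation_rate`.
    """
    production = basal_rate
    for k_on, k_off, rate in enhancers:
        # Steady-state probability that a two-state switch is "on".
        p_on = k_on / (k_on + k_off)
        production += rate * p_on
    # The mean obeys dm/dt = production - degradation_rate * m.
    return production / degradation_rate


# Hypothetical parameter values, chosen only to make the arithmetic easy.
e1 = (1.0, 3.0, 8.0)
e2 = (2.0, 2.0, 6.0)
basal, delta = 1.0, 0.5

m_none = steady_state_mrna([], basal, delta)
m_e1 = steady_state_mrna([e1], basal, delta)
m_e2 = steady_state_mrna([e2], basal, delta)
m_both = steady_state_mrna([e1, e2], basal, delta)

# Additivity: the joint effect equals the sum of the individual effects.
effect_both = m_both - m_none
effect_sum = (m_e1 - m_none) + (m_e2 - m_none)
print(effect_both, effect_sum)
```

In this toy setting additivity follows directly from the mean production rate being a sum of independent per-enhancer terms; a cooperative term (for example, a contribution that is present only when both enhancers are "on") would break it.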

    By coarse-graining both gene activation and enhancer-gene communication to simple two-state models, the authors then clearly demonstrate that the mathematical structure of the ABC model emerges. This mathematical structure implies that the ABC score summed over all the enhancers regulating a given gene must equal 1. However, experimental measurements show values ranging from 0 to 3. The authors show that, in order to explain these experimental deviations with respect to the theory, at least one of the assumptions of the default model must be broken. They demonstrate that either invoking enhancer cooperativity in mRNA production rates or breaking the assumption that individual enhancers communicate with the gene independently can explain existing experimental data.
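The normalization property mentioned above can be checked with a minimal sketch. It assumes the commonly cited form of the ABC Score, in which the score of an enhancer for a gene is its activity multiplied by its contact frequency with the gene's promoter, divided by the sum of the same product over all candidate enhancers of that gene. The function name and the input values are hypothetical and are not taken from the article.

```python
def abc_scores(activity, contact):
    """ABC scores for the candidate enhancers of a single gene.

    Assumed form (commonly cited, not quoted from this page):
        score_e = A_e * C_e / sum over e' of (A_e' * C_e')
    where A_e is the activity of enhancer e and C_e is its contact
    frequency with the gene's promoter.
    """
    if len(activity) != len(contact):
        raise ValueError("activity and contact must have the same length")
    products = [a * c for a, c in zip(activity, contact)]
    total = sum(products)
    if total == 0:
        raise ValueError("the sum of activity * contact must be non-zero")
    return [p / total for p in products]


# Hypothetical activity and contact values for three candidate enhancers.
activity = [4.0, 1.0, 2.5]
contact = [0.5, 2.0, 0.4]

scores = abc_scores(activity, contact)
print(scores)       # one score per enhancer
print(sum(scores))  # equals 1 up to floating-point rounding
```

Because every score shares the same denominator, the scores for one gene sum to 1 by construction, whatever the inputs.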

    Strengths:

    By demonstrating that the mathematical structure of the ABC model emerges from a set of basic assumptions including the independence of gene activation and enhancer-gene communication, the authors succeeded in their aim to put the ABC model on a formal and molecular footing. Since some experimental results do not agree with the ABC model, the authors importantly demonstrated which assumptions of the model can be broken to explain such data. The theoretical work in this manuscript is written in a reasonably accessible manner that features how a graph theory-based approach to modeling biochemical networks can result in general statements about biological phenomena.

    Weaknesses:

    While the authors discuss a number of experimental techniques that can be used to test the validity of their model, a more specific discussion of proposed experiments could have strengthened the impact of the paper by providing explicit opportunities for dialogue with experimentalists.

  4. Author response:

    We thank both reviewers for their time and effort in considering our manuscript. We are pleased that the reviewers recognised the strength of our theoretical analysis and found it "elegant" and "reasonably accessible". We also acknowledge the suggestions made by both reviewers that the manuscript could be improved by more discussion of potential experiments. We were concerned not to make the original manuscript too long but, in the light of the reviewers' comments, we will submit a revised version with more details of the kinds of experiments that would build on the results that we have presented.