Uncovering the Mechanistic Landscape of Regulatory DNA with Deep Learning
Abstract
The regulatory genome encodes the logic that governs gene expression, enabling cells to respond to developmental, environmental, and evolutionary cues. This logic arises from complex cis-regulatory mechanisms that integrate transcription factor motifs, their syntactical arrangement, and surrounding sequence context, features that remain challenging to decode. Here, we present SEAM (Systematic Explanation of Attribution-based Mechanisms), a computational framework that combines deep learning with explainable AI to map the mechanistic impact of genetic mutations. Applied to human and Drosophila regulatory loci, SEAM uncovers functional binding sites at sequences of interest and identifies which mutations preserve, disrupt, or create novel binding sites. SEAM also reveals that two qualitatively distinct classes of regulatory signal are operative at many loci: signals that are robust to mutation and signals that are readily reprogrammable. These results clarify the inherent ability of regulatory DNA to evolve. They also position SEAM as a versatile framework for interpreting non-coding variants and for informing the mechanism-aware design of synthetic sequences.