Dissecting regulatory syntax in human development with scalable multiomics and deep learning

Read the full article See related articles

Listed in

This article is not in any list yet, why not save it to one of your lists.
Log in to save this article

Abstract

Transcription factors (TFs) establish cell identity during development by binding regulatory DNA in a sequence-specific manner, often promoting local chromatin accessibility, and regulating gene expression. Mapping accessible chromatin offers critical insights into transcriptional control, but available datasets for human development are restricted to bulk tissue, single organs, or single modalities. Here, we present the Human Development Multiomic Atlas (HDMA), a single-cell atlas of chromatin accessibility and gene expression from 817,740 fetal cells across 12 organs, spanning 203 cell types and over 1 million candidate cis-regulatory elements, many of which exhibit organ-specific in vivo enhancer activity. Deep learning models trained to predict accessibility from local DNA sequence unravel a comprehensive lexicon of motifs which influence accessibility, including composite motifs exhibiting distinct syntactic constraints predicted to mediate TF cooperativity. We identify “hard” syntactic rules requiring precise motif spacing and orientation, “soft” rules allowing flexible motif arrangements, and ubiquitous motifs that inhibit accessibility. Model-based interpretation of genetic variants reveals that disruption of motifs with positive and negative effects is associated with concordant effects on gene expression. Our work delineates how motif syntax governs cell type-specific chromatin accessibility and provides a foundational resource for decoding cis-regulatory logic and interpreting genetic variation during human development.

Article activity feed