Contextualising transcription factor binding during embryogenesis using natural sequence variation
Abstract
Understanding how genetic variation impacts transcription factor (TF) binding remains a major challenge, limiting our ability to model disease-associated variants. Here, using Drosophila as a model, we profiled allele-specific binding of four TFs at several embryonic time-points in a highly controlled system of F1 crosses with extensive genetic diversity. Using a combined haplotype test, we found that 9-18% of TF-bound regions are impacted by genetic variation. By extending WASP (a tool for allele-specific read mapping) to examine INDELs, we increased detection of allele-imbalanced (AI) peaks by 30-50%. This fine-grained ‘mutagenesis’ could reconstruct functionalized binding motifs of all factors. To prioritise potential causal variants, we trained a convolutional neural network (Basenji) to predict TF binding from DNA sequence. The model could accurately predict experimental AI for strong-effect variants, providing a mechanistic interpretation for how genetic variation impacted TF binding. This revealed unexpected relationships between TFs, including potential cooperative pairs, and mechanisms of tissue-specific recruitment of the ubiquitous factor CTCF.
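To make the notion of an allele-imbalanced peak concrete, the sketch below tests whether reads overlapping a heterozygous site in an F1 embryo depart from the 50:50 ratio expected when both parental alleles are bound equally. It is a deliberately simplified stand-in and not the combined haplotype test used in the study, which additionally models total read depth and overdispersion. The function names, parameter names, and read counts are hypothetical and chosen only for illustration.

```python
from math import exp, lgamma, log


def allelic_ratio(ref_reads: int, alt_reads: int) -> float:
    """Fraction of allele-informative reads carrying the reference allele."""
    total = ref_reads + alt_reads
    return ref_reads / total if total else float("nan")


def binomial_two_sided_p(ref_reads: int, alt_reads: int, p_null: float = 0.5) -> float:
    """Exact two-sided binomial p-value for allelic imbalance at one site.

    Assumption: reads are independent draws with probability p_null
    (strictly between 0 and 1) of carrying the reference allele.
    Real ChIP-seq counts are overdispersed, so this test is
    anti-conservative compared with beta-binomial approaches.
    """
    n = ref_reads + alt_reads
    if n == 0:
        return 1.0  # no allele-informative reads, no evidence of imbalance

    def log_pmf(k: int) -> float:
        # log of C(n, k) * p^k * (1 - p)^(n - k), computed in log space
        # so that deep coverage does not overflow or underflow.
        log_choose = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
        return log_choose + k * log(p_null) + (n - k) * log(1.0 - p_null)

    pmf = [exp(log_pmf(k)) for k in range(n + 1)]
    observed = pmf[ref_reads]
    # Sum every outcome that is no more likely than the observed one;
    # the small relative tolerance guards against floating-point ties.
    p_value = sum(prob for prob in pmf if prob <= observed * (1.0 + 1e-9))
    return min(1.0, p_value)


if __name__ == "__main__":
    # Hypothetical counts for a single heterozygous site under a TF peak.
    ref, alt = 30, 10
    print(f"allelic ratio = {allelic_ratio(ref, alt):.2f}")
    print(f"two-sided p   = {binomial_two_sided_p(ref, alt):.4g}")
```

In practice such a test would be applied per peak across all F1 crosses and followed by multiple-testing correction; the per-site binomial shown here only illustrates the underlying idea of comparing read counts between the two parental alleles.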