Robust footprinting with sample-specific Tn5 bias correction for bulk and single cell ATAC-seq
Abstract
Footprint analysis of assay for transposase-accessible chromatin using sequencing (ATAC-seq) data enables base-resolution mapping of regulatory elements but is often underpowered and prone to false positives due to data sparsity and Tn5 transposase cleavage bias. We uncover substantial sample-to-sample variability in Tn5 cleavage bias, revealing a previously underappreciated source of batch effects and motivating sample-specific Tn5 bias modeling. We present TraceBIND, a computational framework that corrects sample-specific Tn5 bias and identifies transcription factor (TF) and nucleosome footprints through a dynamic flanking-window statistical scan. Compared to existing approaches, TraceBIND substantially reduces false discoveries, controlling type I error while maintaining high sensitivity. Although TraceBIND is completely unsupervised and does not require training on ChIP-seq data, multiple validation analyses show that it matches the power of supervised methods for detecting known TF binding sites. In single-cell ATAC-seq (scATAC-seq) data from aging rat kidney, TraceBIND discovers age-associated dynamic regulatory changes, linking footprint activity to age-associated epigenetic drift. These results demonstrate that footprint-informed scATAC-seq analysis reveals rich regulatory signals missed by conventional peak-based approaches.