Sparse Sequencing permits accurate and efficient quantification of genome-wide cytosine modification levels
Abstract
5-methylcytosine (5mC) and 5-hydroxymethylcytosine (5hmC) play crucial roles in epigenetic gene regulation, with dynamic levels in development and disease. While high-depth, base-resolution whole genome studies offer the most detailed view of epigenetic landscapes for individual samples, many open questions can be answered by surveying changes in 5mC/5hmC levels genome-wide or at specific genomic elements across larger cohorts of samples. Nonetheless, current global quantification methods, including mass spectrometry and immunochemistry, are typically limited in accessibility, throughput, or accuracy. Here, using computational down-sampling of deeply sequenced data, we first demonstrate that sequencing base-resolution libraries to shallow depth can be sufficient for highly accurate quantification. Sparse sampling (<0.24% of the genome) can precisely measure genomic 5mC/5hmC levels (error <5%), even when levels are low (<0.3%). Using a combined chemical/enzymatic workflow, we then validate that Sparse-Sequencing (Sparse-Seq) shows high accuracy and less variability than mass spectrometry, while distinctively preserving genomic context. Applying Sparse-Seq serially to developing mouse brains revealed an earlier emergence of 5hmCpG compared to 5mCpH and uncovered previously overlooked, genomic feature-specific differences in epigenetic dynamics. This work establishes a rigorous foundation for employing Sparse-Seq as a highly accessible approach for 5mC/5hmC quantification, allowing for economical and high-throughput analysis of epigenetic landscapes.
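The intuition behind shallow sequencing can be illustrated with a simple binomial sampling model. The sketch below is not the authors' pipeline. It assumes, as a simplification, that every sampled cytosine call is an independent draw with the same modification probability, and it ignores read-level correlation, conversion efficiency, and coverage bias. The function names (`relative_standard_error`, `calls_needed`, `simulate_estimate`) are hypothetical, and the inputs (a 0.3% modification level and a 5% error target) are borrowed from the abstract purely as illustrative values.

```python
import math
import random


def relative_standard_error(level: float, n_calls: int) -> float:
    """Relative standard error of a binomial proportion estimate.

    level   -- assumed true fraction of modified cytosines (0 < level < 1)
    n_calls -- number of independent cytosine calls sampled
    """
    return math.sqrt((1.0 - level) / (level * n_calls))


def calls_needed(level: float, target_rel_error: float) -> int:
    """Smallest number of independent calls whose relative standard
    error is at most target_rel_error (one standard error, not a
    confidence interval)."""
    return math.ceil((1.0 - level) / (level * target_rel_error ** 2))


def simulate_estimate(level: float, n_calls: int, seed: int = 0) -> float:
    """Simulate one sparse sampling experiment and return the estimated
    modification level (modified calls / total calls)."""
    rng = random.Random(seed)
    modified = sum(rng.random() < level for _ in range(n_calls))
    return modified / n_calls


if __name__ == "__main__":
    level = 0.003          # illustrative: 0.3% of cytosines modified
    target = 0.05          # illustrative: 5% relative standard error
    n = calls_needed(level, target)
    print(f"calls needed under the binomial assumption: {n}")
    print(f"relative SE at that depth: {relative_standard_error(level, n):.4f}")
    print(f"one simulated estimate: {simulate_estimate(level, n, seed=1):.5f}")
```

Under this idealized model the required number of calls scales as 1/(level × error²), so low modification levels need more sampled cytosines, but the total is still small compared with the number of cytosines in a mammalian genome. Real libraries will deviate from the independence assumption, so these numbers are a rough lower bound, not a sequencing-depth recommendation.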