Variable performance of widely used bisulfite sequencing methods and read mapping software for DNA methylation
Abstract
DNA methylation (DNAm) is the most commonly studied marker in ecological epigenetics, yet the performance of popular library preparation strategies and bioinformatic tools is seldom assessed and compared in genetically variable natural populations. We profiled DNAm using reduced representation bisulfite sequencing (RRBS) and whole-genome bisulfite sequencing (WGBS), including technical and biological replicates from lab-reared and wild-caught threespine stickleback (Gasterosteus aculeatus). We then compared how the most commonly used read mapper and methylation caller (Bismark) performed relative to two alternative pipelines (BWA mem or BWA meth read mappers analyzed with MethylDackel). BWA meth provided 50% higher mapping efficiency than BWA mem and 45% higher efficiency than Bismark. Despite differences in mapping efficiency, BWA meth and Bismark produced highly similar methylation profiles, while BWA mem systematically discarded unmethylated cytosines. Sequencing depth filters had large impacts on the number of CpG sites recovered across multiple individuals, with the largest impact on WGBS data. Notably, the prevalence of CpG sites with intermediate methylation levels is greatly reduced in RRBS data compared to WGBS, which may have important consequences for functional interpretations. We conclude by discussing how library construction and bisulfite alignment wrappers can influence SNP filtering, genomic coverage, and the abundance and reliability of data available for downstream analysis. Our analyses suggest that researchers studying genetically variable populations may prioritize filtering SNPs by constructing RRBS libraries with small insert sizes and paired-end reads, which is counter to conventional wisdom.
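To illustrate why a sequencing depth filter can sharply reduce the number of CpG sites recovered across multiple individuals, the following is a minimal sketch, not the pipeline used in the study. It assumes per-sample methylation calls have already been parsed into dictionaries keyed by (chromosome, position) with (methylated, unmethylated) read counts. The function names, sample names, and toy counts are all hypothetical and chosen only for illustration.

```python
from typing import Dict, Set, Tuple

Site = Tuple[str, int]    # (chromosome, 1-based position); hypothetical layout
Counts = Tuple[int, int]  # (methylated reads, unmethylated reads)


def sites_passing_depth(calls: Dict[Site, Counts], min_depth: int) -> Set[Site]:
    """Return the CpG sites whose total read depth is at least min_depth."""
    return {site for site, (meth, unmeth) in calls.items()
            if meth + unmeth >= min_depth}


def shared_sites(samples: Dict[str, Dict[Site, Counts]], min_depth: int) -> Set[Site]:
    """Return the sites that pass the depth filter in every sample."""
    passing = [sites_passing_depth(calls, min_depth) for calls in samples.values()]
    return set.intersection(*passing) if passing else set()


def methylation_level(counts: Counts) -> float:
    """Fraction of reads supporting methylation at one site (depth must be > 0)."""
    meth, unmeth = counts
    return meth / (meth + unmeth)


# Toy data, invented for illustration only (not from the study).
samples = {
    "fish_A": {("chrI", 100): (9, 1), ("chrI", 250): (2, 2), ("chrI", 400): (0, 12)},
    "fish_B": {("chrI", 100): (5, 5), ("chrI", 250): (8, 4), ("chrI", 400): (1, 3)},
}

lenient = shared_sites(samples, min_depth=4)   # every toy site has depth >= 4 in both samples
strict = shared_sites(samples, min_depth=10)   # only sites with depth >= 10 in both samples
```

Because a site must pass the threshold in every individual, a single low-coverage sample removes that site from the shared set, so the retained set can only shrink as the threshold or the number of individuals increases.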