Variable performance of widely used bisulfite sequencing methods and read mapping software for DNA methylation

Abstract

DNA methylation (DNAm) is the most commonly studied marker in ecological epigenetics, yet the performance of popular library preparation strategies and bioinformatic tools is seldom assessed in genetically variable natural populations. We profiled DNAm using reduced representation bisulfite sequencing (RRBS) and whole-genome bisulfite sequencing (WGBS) of technical and biological replicates from threespine stickleback (Gasterosteus aculeatus) liver tissue. We then compared the performance of the most commonly used methylation caller (Bismark) with that of two alternative pipelines, in which reads were mapped with either BWA mem or BWA meth and methylation was called with MethylDackel. BWA meth provided 50% and 45% higher mapping efficiency than BWA mem and Bismark, respectively. Despite differences in mapping efficiency, BWA meth and Bismark produced similar methylation profiles, while BWA mem systematically discarded unmethylated cytosines. Depth filters had large impacts on the number of CpG sites recovered across multiple individuals, particularly with WGBS data. Notably, the prevalence of CpG sites with intermediate methylation levels was greatly reduced in RRBS, which may have important consequences for functional interpretations. We conclude by discussing how library construction and bisulfite sequence alignment software can influence the abundance and reliability of data available for downstream analysis. Our analyses suggest that researchers studying genetically variable populations will benefit from deeply sequencing a few initial individuals to identify the amount of genomic coverage necessary for mean methylation estimates to plateau, a value that may differ by species and population. We additionally advocate for paired-end sequencing of RRBS libraries to filter SNPs that may bias methylation metrics, a recommendation that runs counter to conventional wisdom.
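
To illustrate why a depth filter can sharply reduce the CpG sites recovered across multiple individuals, the following is a minimal Python sketch. It is not the pipeline used in the study: the function name `shared_sites`, the data layout (per-individual dictionaries mapping a site to methylated and unmethylated read counts), and the toy counts are all assumptions made for illustration.

```python
from collections import Counter


def shared_sites(samples, min_depth, min_individuals):
    """Return the CpG sites covered at >= min_depth reads in at least
    min_individuals samples.

    samples: hypothetical layout, {individual: {site: (meth, unmeth)}},
    where meth and unmeth are read counts supporting each state.
    """
    passing = Counter()
    for calls in samples.values():
        for site, (meth, unmeth) in calls.items():
            # Depth at a site is the total number of informative reads.
            if meth + unmeth >= min_depth:
                passing[site] += 1
    return {site for site, n in passing.items() if n >= min_individuals}


# Toy read counts, invented for illustration only.
samples = {
    "fish_A": {("chrI", 100): (8, 2), ("chrI", 250): (1, 3), ("chrI", 400): (12, 0)},
    "fish_B": {("chrI", 100): (5, 6), ("chrI", 250): (9, 4), ("chrI", 400): (2, 1)},
    "fish_C": {("chrI", 100): (10, 1), ("chrI", 400): (7, 8)},
}

# Count how many sites survive in all three individuals as the filter tightens.
for depth in (1, 5, 11):
    kept = shared_sites(samples, depth, min_individuals=3)
    print(f"min depth {depth}: {len(kept)} shared site(s)")
```

Because a site must pass the threshold in every required individual, a single low-coverage sample is enough to drop it, so the number of shared sites can only stay the same or fall as the depth threshold or the number of required individuals increases.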
