Quantitative methylation reference datasets of Quartet DNA reference materials for benchmarking genome-wide epigenome sequencing
Abstract
The absence of cross-lab and cross-replicate reproducibility assessments and the lack of quantitative methylation reference datasets (ground truth) impede the benchmarking of genome-wide epigenome sequencing for its intended use in clinical settings such as disease diagnostics and prognostics. Using the four Quartet DNA reference materials, we generated cross-lab epigenome sequencing datasets with three technical replicates per sample using three mainstream protocols: whole-genome bisulfite sequencing, enzymatic methyl-seq, and TET-assisted pyridine borane sequencing. We found profound strand biases in methylation quantification in each library across all protocols. Cross-lab and cross-replicate reproducibility analyses showed low qualitative concordance of detection (mean Jaccard index = 0.36) yet high quantitative agreement of methylation levels (mean Pearson correlation coefficient = 0.96) at overlapping CpG sites. We then constructed genome-wide reference datasets using consensus voting, providing ground truth for cross-protocol and cross-lab proficiency tests. Additionally, we found that the mean CpG depth, coverage, and strand consistency correlate strongly with the reference-dataset-dependent quality metrics. The Quartet DNA reference materials and genome-wide quantitative methylation reference datasets provide foundational benchmarks for epigenome sequencing, enabling standardized quality assessment of emerging epigenomic technologies and analytical pipelines.
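The two reproducibility measures quoted in the abstract can be illustrated with a minimal sketch. It assumes each library is represented as a dictionary mapping a CpG site, here a hypothetical `(chromosome, position)` tuple, to a methylation level between 0 and 1. The function names and the toy values are illustrative assumptions, not the authors' pipeline or data. The Jaccard index measures how much the sets of detected sites overlap, while the Pearson correlation is computed only on sites detected in both libraries.

```python
from math import sqrt


def jaccard_index(levels_a, levels_b):
    """Qualitative concordance: shared detected sites / all detected sites."""
    sites_a, sites_b = set(levels_a), set(levels_b)
    union = sites_a | sites_b
    if not union:
        return 0.0
    return len(sites_a & sites_b) / len(union)


def pearson_on_overlap(levels_a, levels_b):
    """Quantitative agreement: Pearson r of methylation levels at shared sites."""
    shared = sorted(set(levels_a) & set(levels_b))
    if len(shared) < 2:
        raise ValueError("need at least two overlapping CpG sites")
    xs = [levels_a[site] for site in shared]
    ys = [levels_b[site] for site in shared]
    n = len(shared)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("methylation levels are constant at the shared sites")
    return cov / sqrt(var_x * var_y)


# Toy libraries (made-up values): three sites are shared, six are seen in total.
library_1 = {("chr1", 100): 0.90, ("chr1", 200): 0.10,
             ("chr1", 300): 0.50, ("chr2", 50): 0.80}
library_2 = {("chr1", 100): 0.85, ("chr1", 300): 0.55, ("chr2", 50): 0.75,
             ("chr2", 90): 0.20, ("chr2", 120): 0.30}

jaccard = jaccard_index(library_1, library_2)
pearson = pearson_on_overlap(library_1, library_2)
print(f"Jaccard index: {jaccard:.2f}, Pearson r on overlap: {pearson:.3f}")
```

In this toy case only half of the detected sites are shared, yet the levels at the shared sites agree closely. This is the same pattern the abstract reports: low concordance of detection alongside high agreement of methylation levels.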