DoBSeqWF: A framework for sensitive detection of individual genetic variation in pooled sequencing data
Abstract
Motivation: Population screening for rare genetic diseases is limited by the high cost of next-generation sequencing. Double-batched sequencing (DoBSeq) is a cost-effective method for assigning rare variants to individuals using two-dimensional unique double-pooled sequencing. However, this method produces complex, high-depth sequencing data that requires a specialized workflow for efficient and reproducible analysis.

Results: We developed DoBSeqWF (DoBSeq Workflow), a Nextflow-based pipeline for processing the pooled sequencing data from alignment through variant calling, filtering, and ultimately individual assignment of rare variants. Using separate training and validation datasets with whole genome sequencing as the gold standard, we benchmarked multiple variant callers, and we developed and implemented machine learning filters that improve rare variant calling performance while maintaining high sensitivity. The pipeline enables reproducible analysis and can be easily updated as bioinformatic tools and variant interpretations evolve.

Availability and Implementation: DoBSeqWF is freely available at https://github.com/RasmussenLab/DoBSeqWF.
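
To make the idea of individual assignment from two-dimensional pooling concrete, the following is a minimal sketch and not the DoBSeqWF implementation. It assumes each individual is placed in exactly one "row" pool and one "column" pool, that this pair of pools is unique to the individual, and that a variant is assigned only when it is called in exactly that one pair of pools. The function name `assign_variants`, the pool labels (`R1`, `C1`, ...), and the variant keys are hypothetical and chosen for illustration only.

```python
from collections import defaultdict


def assign_variants(pool_variants, individual_pools):
    """Assign variants to individuals in a two-dimensional pooling design.

    pool_variants:    dict mapping pool id -> set of variant keys called in that pool
    individual_pools: dict mapping individual id -> (row_pool, column_pool)

    Assumption (illustrative): a variant is assigned only if it was called in
    exactly one row pool and one column pool, and that pair identifies a single
    individual. Everything else is left unassigned.
    """
    # Invert the calls: variant -> set of pools in which it was seen.
    pools_by_variant = defaultdict(set)
    for pool, variants in pool_variants.items():
        for variant in variants:
            pools_by_variant[variant].add(pool)

    # Each individual is identified by its unique (row, column) pool pair.
    individual_by_pair = {
        frozenset(pair): individual for individual, pair in individual_pools.items()
    }

    assignments = {}
    for variant, pools in pools_by_variant.items():
        # Only an exact match to one individual's pool pair is unambiguous.
        individual = individual_by_pair.get(frozenset(pools))
        if individual is not None:
            assignments[variant] = individual
    return assignments


# Hypothetical 2 x 2 design: four individuals, two row pools, two column pools.
individual_pools = {
    "A": ("R1", "C1"),
    "B": ("R1", "C2"),
    "C": ("R2", "C1"),
    "D": ("R2", "C2"),
}
pool_variants = {
    "R1": {"chr1:100:A>G", "chr2:5:C>T"},
    "R2": {"chr2:5:C>T"},
    "C1": {"chr2:5:C>T", "chr3:7:G>A"},
    "C2": {"chr1:100:A>G"},
}
result = assign_variants(pool_variants, individual_pools)
print(result)
```

In this toy design, the variant seen only in `R1` and `C2` maps to individual `B`. The variant seen in three pools and the variant seen in a single pool are left unassigned, which is one simple way to treat ambiguous or unsupported calls; a real workflow would likely handle such cases with more nuance.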