Benchmarking Genomic Variant Calling Tools in Inbred Mouse Strains: Recommendations and Considerations
Abstract
With the growing affordability of whole genome sequencing, variant identification has become an increasingly common task, but it remains challenging because of both technical and biological factors. In recent years, the number of software packages available for variant calling has rapidly increased. Understanding the benefits and drawbacks of different tools is important for establishing best practices and highlighting limitations. These considerations are crucial in model organism research, as many variant calling programs assume outbred genomes and implicit heterozygosity, assumptions that may not hold for inbred laboratory models. Here, we present an analysis of variant calling tools and their performance in simulated genomes of the C57BL/6J inbred laboratory mouse and nine non-reference laboratory strains. Our findings reveal a tradeoff between the recall and precision of tools. Balancing these considerations, we show that an optimal call set is obtained by using an ensemble approach, but specific variant calling recommendations vary by strain and analytical goals. Further, we highlight filters that improve the performance of different variant calling tools, both for the discovery of rare variants and for the discovery of strain polymorphisms. In summary, our work provides best practices for calling and filtering genomic variants in inbred organisms, particularly laboratory mice.
Article Summary
Identifying mutations and rare genetic variants is a central task for modern genomics. Many computational tools exist for variant detection, but their performance varies across diverse applications. Further, few variant calling tools have been benchmarked against inbred genomes, which are commonly used for research. To address this, we evaluated five variant calling tools using simulated data from ten diverse inbred mouse strains. We show that variant detection, recall, and precision vary across tools and mouse strains, and that an ensemble approach improves confidence in detected mutations. Our findings offer a set of best practices for variant calling in inbred organisms across diverse analytical applications.
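To make the terms recall, precision, and ensemble calling concrete, the following is a minimal sketch and not the pipeline used in this study. It assumes each variant can be represented as a (chromosome, position, reference allele, alternate allele) tuple and that a simulated truth set is available. The function names, the `min_support` threshold, and the toy variants are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def precision_recall(called, truth):
    """Return (precision, recall) of a call set against a truth set."""
    called, truth = set(called), set(truth)
    true_positives = len(called & truth)
    precision = true_positives / len(called) if called else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


def ensemble_calls(call_sets, min_support=2):
    """Keep variants reported by at least `min_support` tools (hypothetical rule)."""
    counts = Counter(v for calls in call_sets for v in set(calls))
    return {v for v, n in counts.items() if n >= min_support}


# Made-up toy data: variants as (chrom, pos, ref, alt).
truth = {("chr1", 100, "A", "G"), ("chr1", 200, "C", "T"), ("chr2", 50, "G", "A")}
tool_a = {("chr1", 100, "A", "G"), ("chr1", 200, "C", "T"), ("chr1", 300, "T", "C")}
tool_b = {("chr1", 100, "A", "G"), ("chr2", 50, "G", "A"), ("chr2", 75, "A", "T")}
tool_c = {("chr1", 100, "A", "G"), ("chr1", 200, "C", "T"),
          ("chr2", 50, "G", "A"), ("chr2", 90, "C", "G")}

ensemble = ensemble_calls([tool_a, tool_b, tool_c], min_support=2)
for name, calls in [("tool_a", tool_a), ("tool_c", tool_c), ("ensemble", ensemble)]:
    p, r = precision_recall(calls, truth)
    print(f"{name}: precision={p:.2f} recall={r:.2f}")
```

In this toy case each single tool reports one false positive that no other tool supports, so requiring agreement between two tools removes those calls without losing true variants. With real data a stricter threshold generally trades recall for precision, so the appropriate value depends on the strain and the analytical goal.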