Aardvark: Sifting through differences in a mound of variants

Abstract

Variant benchmarking is critical for assessing the accuracy of genomic secondary analysis pipelines. However, traditional benchmarking tools that require exact genotype matches introduce biases from variant representation and are ill-suited to tandem repeats and structural variants. We describe Aardvark, a variant benchmarking tool that introduces the basepair score to directly compare haplotype sequences, reducing representation biases while allowing for partial-credit scoring. The tool also includes a traditional genotype score and supports separate or joint benchmarking of small variants, tandem repeats, and structural variants (<10 kb). Aardvark accepts standard inputs, runs ≈16x faster than hap.py, and is freely available and open source (https://github.com/PacificBiosciences/aardvark).
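To make the idea of comparing haplotype sequences concrete, the following toy sketch shows how two different representations of the same deletion yield an identical haplotype, and how a sequence-level comparison can award partial credit. It is not Aardvark's implementation: the function names, the 0-based `(position, ref_allele, alt_allele)` tuples, and the edit-distance ratio used as a score are all assumptions made for illustration, and the actual basepair score may be defined differently.

```python
def apply_variants(ref, variants):
    """Build a haplotype by applying non-overlapping variants to a reference.

    Each variant is a hypothetical (pos0, ref_allele, alt_allele) tuple
    with a 0-based position; this is an illustrative format, not VCF.
    """
    pieces = []
    cursor = 0
    for pos, ref_allele, alt_allele in sorted(variants):
        if pos < cursor:
            raise ValueError("overlapping variants are not supported")
        if ref[pos:pos + len(ref_allele)] != ref_allele:
            raise ValueError("reference allele does not match the reference")
        pieces.append(ref[cursor:pos])
        pieces.append(alt_allele)
        cursor = pos + len(ref_allele)
    pieces.append(ref[cursor:])
    return "".join(pieces)


def edit_distance(a, b):
    """Levenshtein distance computed with a rolling row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # match or substitution
        prev = cur
    return prev[-1]


def toy_partial_credit(ref, truth_variants, query_variants):
    """Assumed toy score: fraction of the truth's edits that the query recovers."""
    truth_hap = apply_variants(ref, truth_variants)
    query_hap = apply_variants(ref, query_variants)
    expected = edit_distance(ref, truth_hap)
    if expected == 0:
        return 1.0 if query_hap == truth_hap else 0.0
    remaining = edit_distance(query_hap, truth_hap)
    return max(0.0, 1.0 - remaining / expected)


ref = "GCACACAT"  # toy reference containing a short CA repeat

# The same one-unit deletion, written at two different positions.
left_aligned = [(0, "GCA", "G")]
right_shifted = [(2, "ACA", "A")]
same_haplotype = apply_variants(ref, left_aligned) == apply_variants(ref, right_shifted)
full_credit = toy_partial_credit(ref, left_aligned, right_shifted)

# Truth deletes two repeat units; the query deletes only one.
two_unit_deletion = [(0, "GCACA", "G")]
half_credit = toy_partial_credit(ref, two_unit_deletion, left_aligned)
```

In this sketch an exact-match comparison of the variant records would treat `left_aligned` and `right_shifted` as different calls, while the sequence-level comparison sees the same haplotype. The one-unit query against the two-unit truth receives a score between 0 and 1 rather than being counted as a complete miss.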
