Systematic benchmark of state-of-the-art variant calling pipelines identifies major factors affecting accuracy of coding sequence variant discovery

Yury A. Barbitoff
Ruslan Abasov
Varvara E. Tvorogova
Andrey S. Glotov
Alexander V. Predeus

Abstract

Background

Accurate variant detection in the coding regions of the human genome is a key requirement for molecular diagnostics of Mendelian disorders. Efficiency of variant discovery from next-generation sequencing (NGS) data depends on multiple factors, including reproducible coverage biases of NGS methods and the performance of read alignment and variant calling software. Although variant caller benchmarks are published constantly, no previous publications have leveraged the full extent of available gold standard whole-genome (WGS) and whole-exome (WES) sequencing datasets.

Results

In this work, we systematically evaluated the performance of 4 popular short read aligners (Bowtie2, BWA, Isaac, and Novoalign) and 9 novel and well-established variant calling and filtering methods (Clair3, DeepVariant, Octopus, GATK, FreeBayes, and Strelka2) using a set of 14 “gold standard” WES and WGS datasets available from the Genome in a Bottle (GIAB) Consortium. Additionally, we indirectly evaluated each pipeline’s performance using a set of 6 non-GIAB samples of African and Russian ethnicity. In our benchmark, Bowtie2 performed significantly worse than the other aligners, suggesting it should not be used for medical variant calling. When the remaining aligners were considered, the accuracy of variant discovery depended mostly on the variant caller and not the read aligner. Among the tested variant callers, DeepVariant consistently showed the best performance and the highest robustness. Other actively developed tools, such as Clair3, Octopus, and Strelka2, also performed well, although their efficiency depended more strongly on the quality and type of the input data. We also compared the consistency of variant calls in GIAB and non-GIAB samples. With a few important caveats, the best-performing tools showed little evidence of overfitting.
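As an illustration of what comparing a pipeline against a gold standard involves, the sketch below scores a call set against a truth set using precision, recall, and F1. The abstract does not say which metrics or comparison tools were used, so the metric choice, the exact-match comparison on (chromosome, position, ref, alt) tuples, and every name in the code are assumptions made for this example. Real benchmarks typically also normalize variant representation and restrict the comparison to high-confidence regions, which this sketch does not do.

```python
from typing import NamedTuple, Set, Tuple

# Hypothetical variant key: (chromosome, 1-based position, ref allele, alt allele).
Variant = Tuple[str, int, str, str]


class Metrics(NamedTuple):
    tp: int
    fp: int
    fn: int
    precision: float
    recall: float
    f1: float


def compare_calls(truth: Set[Variant], calls: Set[Variant]) -> Metrics:
    """Score a call set against a truth set by exact tuple matching."""
    tp = len(truth & calls)   # called and present in the truth set
    fp = len(calls - truth)   # called but absent from the truth set
    fn = len(truth - calls)   # present in the truth set but not called
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return Metrics(tp, fp, fn, precision, recall, f1)


# Toy data, invented for illustration only.
truth_set = {
    ("chr1", 1000, "A", "G"),
    ("chr1", 2000, "C", "T"),
    ("chr2", 1500, "G", "GA"),
    ("chr3", 3000, "T", "C"),
}
call_set = {
    ("chr1", 1000, "A", "G"),
    ("chr1", 2000, "C", "T"),
    ("chr2", 1500, "G", "GA"),
    ("chr4", 4000, "A", "T"),
}

result = compare_calls(truth_set, call_set)
print(result)
```

With the toy data above, three of the four truth variants are recovered and one call is spurious, so precision and recall both come out at 0.75.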

Conclusions

The results show surprisingly large differences in the performance of cutting-edge tools, even in high-confidence regions of the coding genome. This highlights the importance of regular benchmarking of quickly evolving tools and pipelines. We also discuss the need for a more diverse set of gold standard genomes that would include samples of African, Hispanic, or mixed ancestry. In addition, variant callers need better assessment in the repetitive regions of the coding genome.

Version published to 10.1186/s12864-022-08365-3
Feb 22, 2022
Version published to 10.1101/2021.04.13.439626 on bioRxiv
Apr 14, 2021
