High-accuracy SNV calling for bacterial isolates using deep learning with AccuSNV

Herui Liao
Arolyn Conwill
Ian Light-Maka
Martin Fenk
Alyssa H. Mitchell
Evan B. Qu
Paul Torrillo
Jacob S. Baker
Felix M. Key
Tami D. Lieberman

Abstract

Accurate detection of mutations within bacterial species is critical for fundamental studies of microbial evolution, reconstructing transmission events, and identifying antimicrobial resistance mutations. While many tools have been developed to identify single nucleotide variants (SNVs) from whole-genome sequencing, they often suffer from high false positive rates due to the complexity of bacterial genomes and the need for different filtering cutoffs across sample types and sequencing depths. As datasets increase in size, the manual filtering required for high accuracy presents a significant obstacle. Here, we present AccuSNV, a novel deep learning-based tool for high-precision and automated bacterial SNV calling. Unlike traditional methods that process one sample at a time, AccuSNV leverages a convolutional neural network (CNN) that integrates alignment information across multiple samples, enhancing precision through learned across-sample patterns. We evaluated AccuSNV against seven popular SNV calling tools using simulated data from six bacterial species with varied sequencing depths, numbers of isolates, mutations, and divergence levels. To further validate its real-world utility, we tested AccuSNV on multiple curated bacterial datasets containing reported SNVs. In both simulated and real-world scenarios, AccuSNV consistently achieved the best performance among the tools compared. Moreover, AccuSNV provides comprehensive, user-friendly downstream analysis modules and outputs, including mutation annotation information, phylogenetic inference, dN/dS calculations, and optional manual filtering. Together with the automated deep learning-based calling, these features make AccuSNV broadly accessible to users with different levels of computational expertise.
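
As a rough illustration of what "integrating alignment information across multiple samples" could look like as model input, the sketch below stacks per-sample allele counts at one candidate position into a small matrix, one row per isolate. This is an assumption made for illustration, not AccuSNV's actual input format or network: the function name `position_matrix`, the feature layout (four allele frequencies plus depth), and the toy counts are all hypothetical.

```python
# Hypothetical illustration only: NOT AccuSNV's real input format or model.
BASES = "ACGT"


def position_matrix(counts_by_sample):
    """Stack per-sample allele counts at one genome position into rows.

    counts_by_sample maps a sample name to a dict of base -> read count.
    Each row holds the four allele frequencies (A, C, G, T) followed by
    the raw depth, so all samples at the position can be viewed together.
    """
    rows = []
    for sample in sorted(counts_by_sample):
        counts = [counts_by_sample[sample].get(base, 0) for base in BASES]
        depth = sum(counts)
        # Guard against zero coverage to avoid division by zero.
        freqs = [c / depth if depth else 0.0 for c in counts]
        rows.append(freqs + [float(depth)])
    return rows


# Toy counts, made up for this example: one isolate carries A, two carry G.
example = {
    "isolate_1": {"A": 30},
    "isolate_2": {"G": 28, "A": 1},
    "isolate_3": {"G": 4},
}

matrix = position_matrix(example)
for row in matrix:
    print(row)
```

In this toy case, a fixed per-sample depth cutoff might discard the G call in `isolate_3` (depth 4), whereas a view across all rows shows the same allele well supported in `isolate_2`. That across-sample context is the kind of pattern the abstract says the CNN learns from, although how AccuSNV encodes it is not stated here.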

Version published to 10.1101/2025.09.26.678787 on bioRxiv
Sep 29, 2025
