High-accuracy SNV calling for bacterial isolates using deep learning with AccuSNV


Abstract

Accurate detection of mutations within bacterial species is critical for fundamental studies of microbial evolution, reconstructing transmission events, and identifying antimicrobial resistance mutations. While many tools have been developed to identify single nucleotide variants (SNVs) from whole-genome sequencing, they often suffer from high false positive rates due to the complexity of bacterial genomes and the need for different filtering cutoffs across sample types and sequencing depths. As datasets increase in size, the manual filtering required for high accuracy presents a significant obstacle. Here, we present AccuSNV, a novel deep learning-based tool for high-precision and automated bacterial SNV calling. Unlike traditional methods that process one sample at a time, AccuSNV leverages a convolutional neural network (CNN) that integrates alignment information across multiple samples, enhancing precision through learned across-sample patterns. We evaluated AccuSNV against seven popular SNV calling tools using simulated data from six bacterial species with varied sequencing depths, numbers of isolates, mutations, and divergence levels. To further validate its real-world utility, we tested AccuSNV on multiple curated bacterial datasets containing reported SNVs. In both simulated and real-world scenarios, AccuSNV consistently achieved the best performance. Moreover, AccuSNV provides comprehensive, user-friendly downstream analysis modules and outputs, including mutation annotation information, phylogenetic inference, dN/dS calculations, and optional manual filtering. Together with the automated deep learning-based calling, these features make AccuSNV broadly accessible to users with different levels of computational expertise.
