Unraveling the Impact of Genome Assembly on Bacterial Typing: A One Health Perspective

Abstract

Background: In pathogen surveillance, it is crucial to ensure interoperability and harmonized data. Several surveillance systems are designed to compare bacteria and identify outbreak clusters based on core genome MultiLocus Sequence Typing (cgMLST). Among the different approaches available to generate bacterial cgMLST profiles, our research used an assembly-based approach (the chewBBACA tool).

Methods: Short-read sequencing was simulated for 5 genomes of 27 pathogens of interest in animal, plant, and human health to evaluate the repeatability and reproducibility of cgMLST. Various quality parameters, such as read quality and depth of sequencing, were applied, and read simulations and genome assemblies were repeated using three tools: SPAdes, Unicycler and Shovill. In vitro sequencing was also used to evaluate the impact of assembly on cgMLST results, for 6 bacterial species: Bacillus thuringiensis, Listeria monocytogenes, Salmonella enterica, Staphylococcus aureus, and Vibrio parahaemolyticus.

Results: The results highlighted variability in cgMLST, which appears to be unrelated to the assembly tools and instead induced by the intrinsic composition of the genomes themselves. The variability observed in simulated sequencing was further validated with real data for five of the bacterial pathogens studied.

Conclusion: These findings highlight that intrinsic genome composition affects assembly and the resulting cgMLST profiles, and that variability in bioinformatics tools can induce a bias in cgMLST profiles. We propose that the completeness of cgMLST schemes should be considered when clustering strains.
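To illustrate why scheme completeness matters when comparing strains, the following is a minimal sketch in Python, not the authors' pipeline and not chewBBACA's output format. It assumes each cgMLST profile is a list of allele identifiers in a fixed locus order, with `None` marking a locus that was not called (for example, because the assembly fragmented it). The function names `completeness` and `allelic_distance` are hypothetical, chosen only for this example.

```python
from typing import List, Optional, Tuple

# A profile is a list of allele IDs in scheme order; None = locus not called.
# This encoding is an assumption made for the sketch.
Profile = List[Optional[int]]


def completeness(profile: Profile) -> float:
    """Fraction of scheme loci that received an allele call."""
    if not profile:
        raise ValueError("profile must contain at least one locus")
    called = sum(allele is not None for allele in profile)
    return called / len(profile)


def allelic_distance(p: Profile, q: Profile) -> Tuple[int, int]:
    """Return (differing loci, loci compared), skipping loci missing in either profile."""
    if len(p) != len(q):
        raise ValueError("profiles must cover the same loci")
    shared = [(a, b) for a, b in zip(p, q) if a is not None and b is not None]
    differences = sum(a != b for a, b in shared)
    return differences, len(shared)


# Two hypothetical profiles of the same strain from two assemblies.
assembly_a: Profile = [1, 4, 2, 7, 3, 1]
assembly_b: Profile = [1, 4, None, 7, 5, 1]

diffs, compared = allelic_distance(assembly_a, assembly_b)
print(f"completeness A: {completeness(assembly_a):.2f}")
print(f"completeness B: {completeness(assembly_b):.2f}")
print(f"{diffs} differing loci out of {compared} compared")
```

Reporting the number of loci actually compared alongside the distance makes it visible when a small distance rests on an incomplete profile, which is one simple way to take completeness into account before clustering.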
