Circling in on plasmids: benchmarking plasmid detection and reconstruction tools for short-read data from diverse species
Abstract
The ability to detect and reconstruct plasmids from genome assemblies is crucial for studying the evolution and spread of antimicrobial resistance and virulence in bacteria. Though long-read sequencing technologies have made reconstructing plasmids easier, most (97%) of the bacterial genome assemblies in the public domain are generated from short-read data. Work to compare plasmid reconstruction tools has focused primarily on E. coli, leaving gaps in our understanding of how well these tools perform on other, less well-characterized, taxa.
Using high-quality assemblies as ground truth, we benchmarked 12 plasmid detection tools (which identify plasmid contigs in assemblies) and four plasmid reconstruction tools (which group contigs from the same plasmid together). We tested their ability to characterize diverse plasmids from short-read assemblies representing a wide range of Enterobacterales and Enterococcus species, including newly discovered and poorly characterized species collected from non-human hosts. Plasmer, PlasmidEC, PlaScope, and gplas2 were the highest-scoring plasmid detection tools, performing well for both Enterobacterales and enterococci. The two major determinants of accurate plasmid detection were representation in plasmid databases (Enterobacterales plasmids were more easily detected than those from enterococci) and assembly contiguity, which was also key for successful plasmid reconstruction. Gplas2 performed best for plasmid reconstruction; however, fewer than half of the plasmids were perfectly reconstructed, suggesting that substantial room for improvement remains in this class of tools.
Key Messages
- The ability to detect and reconstruct plasmids from short-read assemblies is crucial to study the spread of antimicrobial resistance and virulence genes in bacteria.
- Most past comparisons of tools for plasmid detection and reconstruction have focused on plasmids from well-characterized E. coli; therefore, we broadened our benchmarking set to include plasmids from diverse Enterobacterales and Enterococcus species, collected from a wide range of hosts.
- We compared the predictions of 12 recent plasmid detection and four plasmid reconstruction tools on short-read assemblies, against a truth set generated from high-quality hybrid assemblies.
- Plasmer, PlasmidEC, PlaScope, and gplas2 were the highest-scoring plasmid detection tools.
- Gplas2 was the highest-scoring tool for plasmid reconstruction; the quality of plasmid reconstruction was mainly determined by assembly fragmentation, with plasmids in more contiguous assemblies being easier to reconstruct.
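
To make the comparison against a truth set concrete, the following is a minimal sketch of how per-contig plasmid calls and contig groupings could be scored. It is illustrative only: the text above does not state which metrics were used, so precision, recall, and F1 are assumptions here, as are the function names, the contig identifiers, and the rule that a contig with no call is treated as chromosomal. "Perfectly reconstructed" is interpreted as a predicted bin containing exactly the contigs of the true plasmid, which is also an assumption.

```python
from typing import Dict, Iterable, Set


def detection_scores(truth: Dict[str, bool],
                     predicted: Dict[str, bool]) -> Dict[str, float]:
    """Score per-contig plasmid calls against a truth set.

    Both dicts map contig ID -> True (plasmid) or False (chromosome).
    Assumption: a contig missing from `predicted` counts as a chromosome call.
    """
    tp = fp = fn = 0
    for contig, is_plasmid in truth.items():
        call = predicted.get(contig, False)
        if call and is_plasmid:
            tp += 1          # plasmid contig correctly detected
        elif call and not is_plasmid:
            fp += 1          # chromosomal contig wrongly called plasmid
        elif not call and is_plasmid:
            fn += 1          # plasmid contig missed
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


def perfectly_reconstructed(true_plasmid: Set[str],
                            predicted_bins: Iterable[Iterable[str]]) -> bool:
    """True if some predicted bin holds exactly the true plasmid's contigs."""
    return any(set(b) == true_plasmid for b in predicted_bins)


# Hypothetical toy assembly with four contigs, two of them plasmid-derived.
truth = {"c1": True, "c2": True, "c3": False, "c4": False}
calls = {"c1": True, "c2": False, "c3": True, "c4": False}
scores = detection_scores(truth, calls)

# The single predicted bin includes a chromosomal contig, so the plasmid
# {c1, c2} is not perfectly reconstructed.
bins = [["c1", "c2", "c3"]]
exact = perfectly_reconstructed({"c1", "c2"}, bins)
```

In this toy case one plasmid contig is found, one is missed, and one chromosomal contig is miscalled, so precision, recall, and F1 all come to 0.5. A fragmented assembly spreads each plasmid over more contigs, which gives an exact-match criterion like this one more chances to fail.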