Plasmids Across Datasets: Resistance, Virulence, Mobility, and Host Taxonomy
Abstract
Plasmids are autonomous DNA molecules that can replicate independently and transfer horizontally between bacterial cells. They play a key role in disseminating adaptive traits, such as antimicrobial resistance and virulence. Understanding plasmid mobility and its association with these traits is crucial to microbial ecology, public health and genomic surveillance. Several databases have been developed to catalogue plasmids assembled from bacterial isolates and metagenomic samples. However, differences in database construction and curation can introduce biases that affect subsequent analyses. In this study, we compare three distinct plasmid genome datasets: the NCBI Reference Sequence Database (RefSeq) and the Integrated Microbial Genomes & Microbiomes system (IMG/PR), the latter split into plasmids from bacterial isolates (I) and plasmids from microbiomes (M). We use this comparison to assess the influence of data origin on inferences about plasmid mobility types, antimicrobial resistance genes (ARGs), virulence genes (VGs) and host taxonomy. Our analysis reveals that plasmids assembled from metagenomes tend to be smaller than those assembled from isolates. RefSeq plasmids are enriched in conjugative plasmids (pCONJ) and display a higher frequency of ARGs and VGs. In contrast, regardless of whether they originate from isolates or metagenomes, IMG/PR plasmids are enriched in mobilizable plasmids (pMOB). Furthermore, ARGs are more frequently associated with highly mobile plasmids, particularly pCONJ. These findings highlight the importance of database selection in studies of plasmid epidemiology, functional potential and mobility. Standardised curation practices and cross-database comparisons are essential to ensure robust and reproducible insights into plasmid-mediated gene flow.
Importance
Plasmids are DNA molecules that can replicate and transfer between bacteria, thereby helping to spread genes that enable bacteria to survive and adapt in different environments. This gene exchange plays a significant part in bacterial evolution. Researchers study these processes using plasmid databases, but the way these databases are constructed can influence the conclusions that are drawn. In this study, we found that key traits, such as those involved in antibiotic resistance and the ability to cause disease, are more often linked to plasmids with greater mobility, particularly in databases containing more clinical samples. Our results also demonstrate that the choice of dataset can significantly impact our understanding of the dissemination of critical genes among bacteria. These findings are valuable for tracking and controlling antibiotic resistance and disease, and highlight the need for carefully constructed, representative databases to support accurate research into how bacteria share genes.