plsMD: A plasmid reconstruction tool from short-read assemblies

Abstract

While whole genome sequencing (WGS) has become a cornerstone of antimicrobial resistance (AMR) surveillance, reconstructing plasmid sequences from short-read WGS data remains a challenge because of repetitive sequences and assembly fragmentation. Current computational tools for plasmid identification and binning, such as PlasmidFinder, cBAR, plasmidSPAdes, and MOB-recon, are limited in their ability to reconstruct full plasmid sequences, which hinders downstream analyses such as phylogenetic studies and AMR gene tracking. To address this gap, we present plsMD, a novel tool designed for full plasmid reconstruction from short-read assemblies. plsMD integrates Unicycler assemblies with replicon and full plasmid sequence databases (PlasmidFinder and PLSDB) to guide plasmid reconstruction through a series of contig manipulations. On a dataset of 83 samples containing 223 plasmids, plsMD outperformed existing tools, achieving 91.2% recall and 93.6% precision for plasmid reconstruction, and 94.5% sensitivity and 99.5% specificity for plasmid-chromosomal contig separation. plsMD supports two usage modalities: single-sample analysis for plasmid reconstruction and gene annotation, and multi-sample analysis for phylogenetic investigations of plasmid transmission. The tool offers a robust way to use existing short-read WGS data to study the spread and evolution of plasmid-mediated AMR.
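For readers less familiar with the four reported metrics, the sketch below shows how they are conventionally derived from confusion-matrix counts. The class name, function names, and counts are illustrative assumptions; they are not the study's data, and the abstract does not state how plsMD's evaluation defined a true positive.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Confusion-matrix counts (hypothetical values, not from the study)."""
    tp: int  # e.g. plasmid contigs correctly labelled as plasmid
    fp: int  # chromosomal contigs wrongly labelled as plasmid
    fn: int  # plasmid contigs wrongly labelled as chromosomal
    tn: int  # chromosomal contigs correctly labelled as chromosomal


def recall(c: Confusion) -> float:
    # Recall and sensitivity share the same formula: TP / (TP + FN).
    return c.tp / (c.tp + c.fn)


def precision(c: Confusion) -> float:
    return c.tp / (c.tp + c.fp)


def specificity(c: Confusion) -> float:
    return c.tn / (c.tn + c.fp)


# Illustrative counts only.
example = Confusion(tp=90, fp=10, fn=10, tn=890)
print(f"recall/sensitivity: {recall(example):.3f}")
print(f"precision:          {precision(example):.3f}")
print(f"specificity:        {specificity(example):.3f}")
```

Specificity can be high even when precision is modest, because chromosomal contigs usually far outnumber plasmid contigs; this is one reason to report both pairs of metrics.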

Key points

  • Accurate plasmid reconstruction from short-read assemblies, surpassing existing binning-based tools.

  • Replicon-guided approach enables detection of divergent plasmids, enhancing genomic annotation accuracy.

  • Supports single- and multi-sample analysis, enabling plasmid transmission and evolutionary studies.

plsMD takes Unicycler assemblies as input, identifies plasmid replicons with PlasmidFinder, and aligns sequences to PLSDB. It then refines the alignments, selects reference plasmids, and reconstructs full plasmid sequences. Circular plasmids without replicons are identified separately. plsMD outputs plasmid and non-plasmid FASTA files. Two workflows are supported: single-sample (plasmid/non-plasmid separation and annotation of AMR genes, virulence factors (VF), insertion sequences (IS), and replicons) and batch-sample (grouping plasmids by replicon, rotation, MAFFT alignment, and phylogenetic tree construction). Validation showed 91.2% plasmid recall, 93.6% precision, 94.6% sensitivity, and 99.8% specificity.
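The rotation step in the batch workflow matters because a circular plasmid can be written starting from any position, so two identical plasmids may look unrelated to a linear aligner until they share a start point. The following is a minimal sketch of that idea, assuming a shared anchor sequence (for example, a replicon fragment) is already known. The function name `rotate_to_anchor` is hypothetical, the code is not plsMD's implementation, and it ignores reverse-complement orientation.

```python
def rotate_to_anchor(seq: str, anchor: str) -> str:
    """Rotate a circular sequence so that it begins at `anchor`.

    The anchor may span the junction between the end and the start of
    the linear representation, so the search runs on the sequence
    extended by its own first len(anchor) - 1 bases.
    """
    seq = seq.upper()
    anchor = anchor.upper()
    if not anchor or len(anchor) > len(seq):
        raise ValueError("anchor must be non-empty and no longer than seq")
    extended = seq + seq[: len(anchor) - 1]
    start = extended.find(anchor)
    if start == -1:
        raise ValueError("anchor not found in circular sequence")
    return seq[start:] + seq[:start]


# Toy sequences: the second anchor wraps around the linear end.
print(rotate_to_anchor("ACGTTT", "GTT"))
print(rotate_to_anchor("GGTTAACC", "CCGG"))
```

Once every plasmid in a replicon group starts at the same anchor, the rotated sequences can be written to a single FASTA file and passed to a multiple aligner such as MAFFT.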
