ACCIO: An Assembly-Based Tool Enabling Plasmid Detection


Abstract

Plasmids are extrachromosomal mobile genetic elements that often carry genes responsible for antimicrobial resistance. Plasmid epidemiology aims to track the evolution and spread of plasmids, but the field currently faces significant barriers that make practical implementation using whole genome sequence data difficult. Hybrid-assembled genomes remain the most reliable way to identify and track complete plasmids; however, most genomic surveillance data exists in the form of short-read sequencing, which lacks the resolution required to accurately resolve plasmids. Despite recent advances, long-read-only assemblies have not yet reached the consistency seen in hybrid assemblies. The ideal approach to plasmid epidemiology using whole genome sequence data would consider the limitations of sequencing technologies and the constraints of existing genomic surveillance infrastructure, in addition to the unique evolutionary biology of plasmids. Here, we present ACCIO (Assembly-based Circular Contig Identification for Outbreaks), a tool which creates a reference plasmid database and uses it to infer which plasmids, and genetically related plasmid groupings, are present in an input assembly (Illumina, Nanopore, or hybrid assembly). We validated ACCIO using an internal dataset of 303 plasmid-harboring bacterial clinical and surveillance isolates collected from a single acute tertiary care center. When highly related database plasmids were grouped together, ACCIO achieved 100% sensitivity and 92.1% positive predictive value (PPV) for detection of plasmid groups using hybrid assemblies, and comparably strong performance for Illumina (93.0% sensitivity, 86.6% PPV) and Nanopore (79.3% sensitivity, 91.4% PPV) assemblies. Evaluation on three external datasets yielded consistently high performance. 
Finally, when benchmarked against MOB-suite, a tool for reconstruction and typing of plasmids, ACCIO demonstrated superior performance across nearly all assembly types and plasmid grouping levels. By integrating database construction, clustering, and plasmid calling into a single workflow compatible with all major sequencing platforms, ACCIO is intended to help advance plasmid epidemiology beyond its current technological and infrastructural barriers.
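The sensitivity and PPV figures quoted above follow the standard definitions: sensitivity is TP / (TP + FN) and PPV is TP / (TP + FP), where a true positive is a plasmid group that is both present and called. The following is a minimal sketch of how such metrics could be tallied per isolate. The function names, isolate labels, and group labels are hypothetical, the toy data are illustrative only and are not drawn from the study, and this is not ACCIO's own evaluation code.

```python
from typing import Dict, Set, Tuple


def count_calls(
    truth: Dict[str, Set[str]], predicted: Dict[str, Set[str]]
) -> Tuple[int, int, int]:
    """Tally true positives, false positives, and false negatives.

    Both arguments map an isolate ID to the set of plasmid groups
    present (truth) or called (predicted) in that isolate.
    """
    tp = fp = fn = 0
    for isolate in truth.keys() | predicted.keys():
        t = truth.get(isolate, set())
        p = predicted.get(isolate, set())
        tp += len(t & p)   # called and truly present
        fp += len(p - t)   # called but not present
        fn += len(t - p)   # present but missed
    return tp, fp, fn


def sensitivity(tp: int, fn: int) -> float:
    """Fraction of truly present plasmid groups that were called."""
    return tp / (tp + fn) if (tp + fn) else float("nan")


def ppv(tp: int, fp: int) -> float:
    """Fraction of called plasmid groups that were truly present."""
    return tp / (tp + fp) if (tp + fp) else float("nan")


# Hypothetical toy data (not from the paper).
truth = {"iso1": {"groupA", "groupB"}, "iso2": {"groupC"}, "iso3": set()}
predicted = {"iso1": {"groupA"}, "iso2": {"groupC", "groupD"}, "iso3": set()}

tp, fp, fn = count_calls(truth, predicted)
print(f"sensitivity={sensitivity(tp, fn):.3f}, PPV={ppv(tp, fp):.3f}")
```

Counting at the level of plasmid groups rather than individual plasmids mirrors the grouped evaluation described in the abstract. A call that lands on a close relative within the same group then counts as correct.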

Impact statement

Detecting and tracking plasmids—the mobile genetic elements often responsible for spreading antimicrobial resistance in hospital settings—is challenging, particularly when relying on short-read sequencing data alone. Short-read genome assemblies, despite widespread use in surveillance of bacterial pathogens, inherently lack the resolution required for plasmid analyses. Current bioinformatic methods struggle to identify whole plasmids from short-read assemblies alone, and often, hybrid assembly using both short- and long-read data is required for the robust analyses that are essential for tracking plasmids.

To address these challenges, we developed ACCIO, a bioinformatics tool which utilizes input genome assemblies (short-read, long-read, or hybrid assemblies) to assess the plasmid content of clinical bacterial isolates for epidemiologic purposes. We validated its use against the recovery of circular plasmid sequences from hybrid assembled genomes as a gold standard method for determining plasmid content. Using a curated local database of 430 plasmid sequences, ACCIO provided accurate inferences of plasmid content from short-read (Illumina), long-read (Oxford Nanopore Technologies), and hybrid assemblies (both), ultimately facilitating genomic surveillance of plasmids regardless of sequencing technology. This work represents a meaningful step forward in advancing plasmid surveillance beyond the technological and infrastructural barriers that limit its broader expansion into healthcare and other settings.

Data summary

Short- and long-read sequencing data have been deposited in the NCBI Sequence Read Archive (SRA) under multiple BioProjects, and corresponding hybrid genome assemblies are available in GenBank. Accession numbers for all BioProjects, BioSamples, and SRA datasets are provided in Supplementary Data S1. All supporting data, software code, and experimental/analysis protocols are provided within the article or in supplementary data files. External validation of ACCIO used three external datasets (Cho et al. 2023, BioProjects PRJNA475751 and PRJNA874473, DOI: 10.1038/s41598-024-70540-1; Lipworth et al. 2024, BioProject: PRJNA604975, DOI: 10.1038/s41467-024-45761-7; Khezri et al. 2021, European Nucleotide Archive (ENA): PRJEB45084, DOI: 10.3390/microorganisms9122560).

List of External Software:

  • MOB-suite (v3.1.9) – https://github.com/phac-nml/mob-suite

  • Skani (v0.2.2) – https://github.com/bluenote-1577/skani

  • SciPy (v1.16.1) – https://github.com/scipy/scipy

  • Pling (v2.0.0) – https://github.com/iqbal-lab-org/pling

  • MUMmer / NUCmer (v4.0.1) – https://mummer4.github.io/

  • Mash / Mash Screen (v2.3) – https://github.com/marbl/Mash

  • SPAdes (v3.15.5) – https://github.com/ablab/spades

  • Unicycler (v0.5.1) – https://github.com/rrwick/Unicycler

  • Flye (v2.9.5) – https://github.com/mikolmogorov/Flye

  • QUAST (v5.2.0) – https://github.com/ablab/quast

  • Kraken2 (v2.1.3) – https://github.com/DerrickWood/kraken2

  • CheckM (v0.4) – https://github.com/Ecogenomics/CheckM

  • Albacore – [no longer officially hosted; was distributed by ONT]

  • Guppy – https://nanoporetech.com/software/other/guppy

  • Dorado – https://github.com/nanoporetech/dorado

  • Bowtie2 (v2.5.4) – https://github.com/BenLangmead/bowtie2

  • Minimap2 (v2.28) – https://github.com/lh3/minimap2

  • Biopython (v1.85) – https://biopython.org/

  • Pandas (v2.3.1) – https://pandas.pydata.org/

  • PLASMe (v1.1) – https://github.com/HubertTang/PLASMe

  • BLAST (v2.17.0) – https://blast.ncbi.nlm.nih.gov/Blast.cgi
