Clustering of plasmid genomes for genomic epidemiology by using rearrangement distances, with pling
Abstract
Integrating plasmids into genomic epidemiology is challenging because there are no clearly defined evolving units (equivalent to species), and because plasmids appear to evolve as much by structural change (rearrangements, insertions and deletions) as by mutation (1). Furthermore, plasmids transfer horizontally between bacterial hosts (2), so a model that goes beyond a single phylogeny is needed to integrate their genetic information with that of their hosts.
Pling (3) is a tool designed to measure a genetic distance between plasmids that reflects how they empirically appear to evolve: the distance between two plasmids is the minimum number of structural changes needed to transform one plasmid into the other (ignoring SNP differences). From these pairwise distances, it constructs a relatedness network of the plasmids under study and then clusters them into groups that are credibly recently related.
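To make the idea of going from pairwise distances to a network to clusters concrete, here is a minimal Python sketch. It is not pling's implementation: the plasmid names, the toy distances, the threshold value and the function `cluster_by_threshold` are all hypothetical. It uses the simplest possible clustering rule (connected components of a thresholded network), which may differ from the method pling actually applies.

```python
from collections import defaultdict


def cluster_by_threshold(distances, threshold):
    """Group plasmids into connected components of a thresholded network.

    distances: dict mapping (plasmid_a, plasmid_b) -> number of structural
               changes separating the pair (toy values, assumed precomputed).
    threshold: hypothetical cut-off; pairs at or below it are joined by an edge.
    """
    neighbours = defaultdict(set)
    for (a, b), d in distances.items():
        # Touch both keys so that isolated plasmids still appear as nodes.
        neighbours[a]
        neighbours[b]
        if d <= threshold:
            neighbours[a].add(b)
            neighbours[b].add(a)

    seen = set()
    clusters = []
    for start in sorted(neighbours):
        if start in seen:
            continue
        # Depth-first traversal collects every plasmid reachable from `start`.
        stack = [start]
        component = set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            component.add(node)
            stack.extend(neighbours[node] - seen)
        clusters.append(sorted(component))
    return clusters


# Toy pairwise rearrangement distances between four hypothetical plasmids.
toy_distances = {
    ("pA", "pB"): 1,
    ("pB", "pC"): 3,
    ("pA", "pC"): 4,
    ("pA", "pD"): 12,
    ("pB", "pD"): 11,
    ("pC", "pD"): 10,
}

clusters = cluster_by_threshold(toy_distances, threshold=4)
print(clusters)
```

With these toy values, pA, pB and pC fall into one group and pD is left on its own. Lowering the threshold splits the groups further, which is why the choice of cut-off matters when deciding which plasmids count as "the same plasmid".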
Here we give a protocol for running pling and describe how we integrate its output with plasmid typing and SNP information. Together, these provide a system for deciding which plasmids are worth treating as “the same plasmid” for the purposes of epidemiology, quantifying their relatedness in terms of rearrangements and SNPs, and then seeing how they are distributed across the host phylogeny.