ROCker models for reliable detection and typing of short-read sequences carrying mcr, erm, mph, and lnu antibiotic resistance genes

Abstract

Quantitative monitoring of emerging antimicrobial resistance genes (ARGs) using short-read sequences remains challenging because many amino acid functional domains and motifs are shared with related but functionally distinct (non-target) proteins. To facilitate ARG monitoring efforts using unassembled short reads, we present novel ROCker models for the mcr, mph, erm, and lnu ARG families, as well as models for variants of special public health concern within these families, including the mcr-1, mphA, ermB, lnuF, lnuB, and lnuG genes. For this, we curated target gene sequence sets for model training and built the models using the recently updated ROCker V2 pipeline (Gerhardt et al., in review). To validate our models, we simulated reads, spanning a range of common read lengths, from the whole genomes of ARG-carrying isolates and used them to challenge the filtering efficacy of ROCker vs. common static filtering approaches, such as similarity searches using BLASTx with various e-value thresholds or hidden Markov models. Based on 250 bp-long reads, ROCker models consistently showed F1 scores up to 10x higher (31% higher on average) and lower false-positive (by 30%, on average) and false-negative (by 16%, on average) rates compared to alternative methods. The ROCker models and all related reference material and data are freely available through http://enve-omics.ce.gatech.edu/rocker/models, further expanding the available model collection developed previously for other genes. Their application to short-read metagenomes, metatranscriptomes, and PCR amplicon data should facilitate more accurate classification and quantification of unassembled short-read sequences for these ARG families and specific genes.
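For readers unfamiliar with the metrics quoted above, the following is a minimal sketch of how an F1 score, a false-positive rate, and a false-negative rate can be computed when a read filter is scored against simulated reads of known origin. It uses the textbook definitions of these metrics, which are an assumption here; the article may define or normalize them differently. The names `FilterCounts`, `count_outcomes`, and the toy labels are hypothetical and are not part of ROCker.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class FilterCounts:
    """Confusion counts for one read filter (hypothetical helper type)."""
    tp: int  # target reads that passed the filter
    fp: int  # non-target reads that passed the filter
    fn: int  # target reads that were rejected
    tn: int  # non-target reads that were rejected


def count_outcomes(is_target: Sequence[bool], passed: Sequence[bool]) -> FilterCounts:
    """Tally outcomes from per-read truth labels and per-read filter decisions."""
    if len(is_target) != len(passed):
        raise ValueError("label and decision lists must have the same length")
    tp = fp = fn = tn = 0
    for truth, kept in zip(is_target, passed):
        if truth and kept:
            tp += 1
        elif truth:
            fn += 1
        elif kept:
            fp += 1
        else:
            tn += 1
    return FilterCounts(tp, fp, fn, tn)


def f1_score(c: FilterCounts) -> float:
    """Harmonic mean of precision and recall: 2TP / (2TP + FP + FN)."""
    denom = 2 * c.tp + c.fp + c.fn
    return 2 * c.tp / denom if denom else 0.0


def false_positive_rate(c: FilterCounts) -> float:
    """Fraction of non-target reads that passed: FP / (FP + TN)."""
    denom = c.fp + c.tn
    return c.fp / denom if denom else 0.0


def false_negative_rate(c: FilterCounts) -> float:
    """Fraction of target reads that were rejected: FN / (FN + TP)."""
    denom = c.fn + c.tp
    return c.fn / denom if denom else 0.0


# Toy data (made up for illustration): 3 target reads and 5 non-target reads.
truth = [True, True, True, False, False, False, False, False]
kept = [True, True, False, True, False, False, False, False]
counts = count_outcomes(truth, kept)
```

With simulated reads, the truth label of each read is known from the genome position it was drawn from, so the same tally can be repeated for each filtering approach and each read length and the resulting metrics compared side by side.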

Significance

The erm and mph antimicrobial resistance gene families confer resistance to the macrolide class of antimicrobials, which are used to treat a wide range of infections. Similarly, the mcr gene family confers resistance to polymyxin E (colistin), a drug of last resort for many serious drug-resistant bacterial infections, and the lnu gene family confers resistance to lincomycin, which is reserved for patients allergic to penicillin or for infections in which bacteria have developed resistance to other antimicrobials. Assessing the prevalence of these genes in clinical or environmental samples and monitoring their spread to new pathogens are thus important for quantifying the associated public health risk. However, detecting these and other resistance genes in short-read sequence data is technically challenging. Our ROCker bioinformatic pipeline achieves reliable detection and typing of broad-range target gene sequences in complex data sets, and thus contributes toward solving an important problem in ongoing surveillance efforts of antimicrobial resistance.
