Global evolutionary epidemiology and resistome dynamics of Citrobacter species, Enterobacter hormaechei, Klebsiella variicola, and Proteeae clones

Abstract

Citrobacter spp., Enterobacter hormaechei subsp., Klebsiella variicola and Proteeae tribe members are rarely isolated Enterobacterales increasingly implicated in nosocomial infections. Herein, we show that these species contain multiple genes encoding resistance to important antibiotics and are widely and globally distributed, being isolated from human, animal, plant, and environmental sources in 67 countries. Certain clones and clades of these species were internationally disseminated, serving as reservoirs and mediums for the global dissemination of antibiotic resistance genes. As they can easily transmit these genes to more pathogenic species, additional molecular surveillance studies should be undertaken to identify and contain these antibiotic‐resistant species.

Article activity feed

  1. ### Reviewer #3:

    General assessment:

    This manuscript examines publicly available genomes of a number of Enterobacteriaceae species, and makes statements regarding their evolution, geographical distribution and antimicrobial resistance. While repurposing existing data can add value, such analyses must be carefully done and inferences only made after assessment and consideration of the potential limitations and biases of such data. Currently, the rationale and methods for performing the analyses outlined in this manuscript are not sufficient to support the conclusions. Following critical evaluation of the metadata associated with the genomes, and more robust analyses, useful insights may be obtained.

    Numbered summary of substantive concerns:

    1. More justification for examination of these particular bacterial species is required. For example, only 59 M. morganii genomes were included; given these small numbers, how big is the clinical problem, and is a global analysis really possible?

    2. There is no description of inclusion / exclusion criteria for these genomes. It is clear that most genomes were derived from the United States; a full description of the selection process will provide a greater understanding of potential bias, which could affect the results and conclusions reached.

    3. A number of outbreaks are stated to have been observed, but there is no robust evidence presented to support such identification, other than presumably clustering in the phylogenetic trees. More generally, without proper evaluation of the metadata associated with the genomes, there is a large risk that any observations (regarding similarity or clustering, or higher prevalence of resistance determinants, etc) are merely due to the nature of the genome collection rather than true biological or epidemiological relatedness. A critical evaluation of the representativeness of the genome collection is required.

    4. Various qualitative statements on differences between species or clades are made, such as the relative richness of resistomes, but (in addition to the issue described in the previous point) such statements require the use of appropriate statistical tests. Definitions are required for terms such as "closely related", "comparable" resistome diversity etc.

    5. The analyses performed are currently not sufficient to underpin many of the statements made in this manuscript regarding the evolution and transmission of these bacteria. For example, the trees presented in the figures appear to be cladograms, therefore the branch lengths are meaningless. Branch lengths are important in this context. Also, the phylogeography was evaluated by mapping genome origins physically onto a map, but there are more sophisticated approaches for this (e.g., phylogenetic diffusion models), though such analyses may regardless be heavily biased by the nature of the genome collection.
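
As an illustration of the kind of statistical support requested in point 4, the following is a minimal sketch of a two-sided permutation test comparing the number of resistance genes per genome between two clades. The function name `permutation_test` and the counts in `clade_a` and `clade_b` are hypothetical and chosen only for illustration; they are not taken from the manuscript. A test of this kind does not correct for sampling bias in the genome collection, which would still need to be addressed separately.

```python
import random


def permutation_test(group_a, group_b, n_permutations=10000, seed=0):
    """Two-sided permutation test on the difference in mean ARG count per genome."""
    rng = random.Random(seed)
    n_a = len(group_a)
    observed = abs(sum(group_a) / n_a - sum(group_b) / len(group_b))
    pooled = list(group_a) + list(group_b)
    n_b = len(pooled) - n_a
    extreme = 0
    for _ in range(n_permutations):
        # Randomly reassign genomes to the two clades.
        rng.shuffle(pooled)
        diff = abs(sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b)
        # Small tolerance guards against floating-point rounding.
        if diff >= observed - 1e-12:
            extreme += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return observed, (extreme + 1) / (n_permutations + 1)


# Hypothetical ARG counts per genome for two clades (illustrative values only).
clade_a = [12, 15, 11, 14, 13, 16, 12]
clade_b = [5, 7, 6, 4, 8, 6, 5]

observed_diff, p_value = permutation_test(clade_a, clade_b)
print(f"difference in means = {observed_diff:.2f}, p = {p_value:.4f}")
```

A permutation test is used here only because it makes no distributional assumption about gene counts; other tests may be more appropriate depending on the data.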

  2. ### Reviewer #2:

    This manuscript presents a species-by-species analysis of the presence and distribution of antimicrobial resistance (AMR) genes in the less frequently isolated Enterobacteriaceae species, using the genomes and metadata registered in the PATRIC database. It is valuable, but most analyses are descriptive rather than quantitative, and the sentences describing the results are not easy to read. A phylogenetic tree and a heatmap indicating the presence of AMR genes are presented for each species, but it is hard to understand what the main message of each figure is, and what characterizes one species compared to the others. The current manuscript will be useful to AMR researchers as a dictionary indicating the presence of a specific AMR gene in each species.

    -Each figure should have a legend so that readers can understand at a glance which color indicates what. Information on geographical region should be clearly indicated in the figure, in particular when it is mentioned in the main text. Also, what do the different colors of the strain names in the tree mean?

    -The Methods section is too brief and lacks sufficient explanation. For example, what criterion was used to judge the presence of an antimicrobial resistance gene?

    -The list of detected AMR genes at the top should be clearly categorized using different colors and headers (e.g., "ESBL", "AmpC", etc.).

    -L126: what is the "outbreak"? I cannot identify it in the figure, nor tell how it was defined.

    -Examples of descriptive rather than quantitative explanations are L135 "richer resistome" and L136 "common". Why do the authors not present the specific numbers and percentages?

    In the entire text, the authors do not conduct any statistical test to assess the significance of the differences they mention.
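
To make the request for numbers, percentages and a significance test concrete, the sketch below reports the prevalence of a resistance gene in two groups of genomes and compares them with a two-sided Fisher's exact test. The helper names `prevalence` and `fisher_exact_two_sided` and the counts in the example are hypothetical and not taken from the manuscript; the test is written out with the standard library so that the block is self-contained, although an established statistics package would normally be used instead.

```python
from math import comb


def prevalence(n_positive, n_total):
    """Format a prevalence as 'count/total (percentage)'."""
    return f"{n_positive}/{n_total} ({100 * n_positive / n_total:.1f}%)"


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows are the two groups of genomes; columns are gene present / gene absent.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x gene-positive genomes in group 1.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of all tables at least as unlikely as the observed one.
    p_value = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )
    return min(1.0, p_value)


# Hypothetical counts: genomes carrying a given ARG in two species.
species_1_positive, species_1_total = 30, 40
species_2_positive, species_2_total = 10, 40

p = fisher_exact_two_sided(
    species_1_positive,
    species_1_total - species_1_positive,
    species_2_positive,
    species_2_total - species_2_positive,
)
print("species 1:", prevalence(species_1_positive, species_1_total))
print("species 2:", prevalence(species_2_positive, species_2_total))
print(f"Fisher's exact test, two-sided p = {p:.2e}")
```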

  3. ### Reviewer #1:

    Sekyere and Reta present a comprehensive descriptive characterization of the epidemiology, phylogeographical distribution and antibiotic resistance profiles of six species of Enterobacteriaceae. Using a total of 2377 publicly available genomes, the authors show many multidrug resistant clones that are distributed worldwide. This study potentially provides important insight into a group of clinically relevant bacteria that remain poorly characterized compared to their more well-known relatives. Below are my comments.

    Major comments:

    1. The entire study is basically a descriptive enumeration of the resistance characteristics of six different bacterial species based on genome sequences, with numerous references to "less" or "more" or synonyms of these words (a few examples are line 140 "richer resistome diversity", line 157 "lesser resistome abundance and diversity", line 163 "richly endowed", line 215 "fewer resistome diversity and abundance", line 217 "sparse", lines 218 and 221 "virtually absent", line 222 "substantial abundance", line 244 "richest abundance of resistomes"). The lack of statistical analyses to compare lineages/clusters of the same species and between species and determine significant differences among them is problematic. Throughout the text, there is no reference to specific numerical values (e.g., p values) when making these comparisons.

    2. Similar to my comment above are the references to "short (or close) evolutionary distance" (for example, lines 131, 208, 228, 265, 432, 439). How was evolutionary distance measured - number of SNPs, phylogenetic distance, average nucleotide identity? This "closeness" or "shortness" should be stated explicitly as a number, for example the number of SNPs.

    3. The Methods section needs more details. I have listed my specific comments on methods below.

    3 a) Lines 504-511: How many genomes were initially downloaded? Were these genomes complete or in draft stages? How were these filtered and the final 2377 genomes selected? What were the criteria for selecting the 2377 genomes - number of contigs, size of genomes, assembly quality, available metadata, etc. - or did the authors use programs that check genome quality such as CheckM? Line 510: "filtered to remove poor genome sequences". How is "poor" defined here?

    3 b) Line 517: How were the 1000 genes used for phylogenetic reconstruction selected?

    3 c) Lines 522-525: Simply drawing the distribution of subspecies and species on a map does not constitute a phylogeographical analysis. There are many biases that can influence the geographic distribution of microbes, most notably the sampling scheme used (for example, more samples from a single country or from a specific host/environment/setting), the composition of the database being used (NCBI and PATRIC in this study) and the collection of more strains of a single species and fewer strains of other species. The current study, similar to many others, has these biases, and they were in fact mentioned in the Results section. How do the authors address these biases?

    3 d) Lines 526-531 Resistome analyses: The current study is basically a summary of the information from the NCBI Pathogen Detection database. The authors need to briefly describe how resistance genes were identified in the genomes from this database. Since the entire study and all figures focus on the ARGs, the authors need to demonstrate how reliably and with what confidence these were identified.

    4. Results, lines 187-188: A citation for "local and international outbreaks" is needed. How did the authors arrive at the inference that lines 183-186 represent outbreaks? Analyses of outbreaks require information on dates of sampling, which are lacking from this dataset. Hence, to infer that such topologies in the tree represent outbreaks is quite a stretch. I suggest that the authors either carry out temporal analyses of their data to be able to say that there were outbreaks or remove suggestions of the occurrence of outbreaks.

    5. Discussion, lines 447-457: I agree that both vertical and horizontal modes of evolution of resistant bacteria are important mechanisms in the spread of resistance in many pathogens and there are numerous previous studies that have reported this. However, the study did not carry out any specific analyses on HGT and vertical evolution, hence to say that "both phenomena are being observed" (lines 455-456) is misleading.

    6. Discussion or Conclusion: The authors mentioned that a limitation in their study is that the genomes they downloaded were those available only up to January 2020. I think there are a few more important limitations and caveats that need to be discussed (for example, see comment 3 c above).
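
One way to state "evolutionary distance" as a number, as requested in major comment 2, is to report pairwise SNP distances computed from an alignment. The sketch below is a minimal illustration that assumes the sequences are already aligned to equal length and counts only positions where both sequences have an unambiguous base. The function name `snp_distance` and the toy sequences in `alignment` are hypothetical and far shorter than any real core-genome alignment.

```python
from itertools import combinations

UNAMBIGUOUS_BASES = set("ACGT")


def snp_distance(seq_a, seq_b):
    """Count differing positions between two aligned sequences.

    Positions with a gap or an ambiguous base (e.g. 'N') in either
    sequence are ignored.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1
        for x, y in zip(seq_a.upper(), seq_b.upper())
        if x in UNAMBIGUOUS_BASES and y in UNAMBIGUOUS_BASES and x != y
    )


# Hypothetical toy alignment (illustrative only).
alignment = {
    "isolate_1": "ACGTACGTAC",
    "isolate_2": "ACGTACGTAT",
    "isolate_3": "ACCTANGTTT",
}

# Report every pairwise distance so that "closely related" has a stated value.
for name_a, name_b in combinations(alignment, 2):
    distance = snp_distance(alignment[name_a], alignment[name_b])
    print(f"{name_a} vs {name_b}: {distance} SNPs")
```

Any threshold used to call isolates "closely related" would then need to be stated and justified by the authors; none is assumed here.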

  4. ## Preprint Review

    This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on medRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 1 of the manuscript.

    ### Summary:

    The reviewers agreed that the topic is interesting in principle, i.e. tracking antibiotic resistance globally in less well-studied but nonetheless clinically important bacterial species. However, the reviewers also had several major concerns, with the main concerns being:

    1. Overall lack of rigor in the analysis. This is due in large part to a lack of precision in the methods, e.g. differences in diversity are not statistically supported, lengths of evolutionary distance are not defined, the definition of a resistance gene is unclear, how an outbreak is defined is unclear.

    2. The paper does not address biases in sample collection. Since the data were taken from a central repository, there are many different studies included, each with their own biases. It is important to address these biases when comparing datasets from different groups and from different geographical locations.

    3. There is insufficient evidence to make claims about horizontal gene transfer.

    The individual reviews provide more details on each of these points.