Major genetic discontinuity and novel toxigenic species in Clostridioides difficile taxonomy
Curation statements for this article:
Curated by eLife
Summary: We appreciate this study and find that the conclusions that reclassify Clostridioides are largely justified by the data and analysis. The major concern is that the work applies standard approaches to refine species classification, rather than proposing a novel approach to classifying species or defining a split that might be more surprising and/or clinically significant (e.g. Kumar et al. Nature Genetics, 2019). Consequently, although it is a useful contribution to the literature, we believe it is more suitable for a specialized audience.
Reviewer #1 opted to reveal their name to the authors in the decision letter after review.
This article has been reviewed by the following groups
Listed in
- Evaluated articles (eLife)
Abstract
Clostridioides difficile infection (CDI) remains an urgent global One Health threat. The genetic heterogeneity seen across C. difficile underscores its wide ecological versatility and has driven the significant changes in CDI epidemiology seen in the last 20 years. We analysed an international collection of over 12,000 C. difficile genomes spanning the eight currently defined phylogenetic clades. Through whole-genome average nucleotide identity, and pangenomic and Bayesian analyses, we identified major taxonomic incoherence with clear species boundaries for each of the recently described cryptic clades CI–III. The emergence of these three novel genomospecies predates clades C1–5 by millions of years, rewriting the global population structure of C. difficile specifically and taxonomy of the Peptostreptococcaceae in general. These genomospecies all show unique and highly divergent toxin gene architecture, advancing our understanding of the evolution of C. difficile and close relatives. Beyond the taxonomic ramifications, this work may impact the diagnosis of CDI.
Article activity feed
Reviewer #2:
In this manuscript, Knight et al. examine the genetic diversity in >12,000 publicly available C. difficile genomes in order to characterize genomic evidence of taxonomic incoherence in this genomically diverse pathogen. Their primary analysis employs average nucleotide identity thresholds to identify species boundaries, with secondary analyses examining core genome size changes, gene content, and estimated emergence dates. The authors' main conclusion is that the previously identified C. difficile cryptic clades CI-III are genomically divergent enough from the main clades C1-5 to warrant classification as different genomospecies. This paper is a useful contribution in benchmarking our understanding of the genetic diversity of C. difficile using all currently publicly available genomes, but the results are largely unsurprising given previous phylogenetic analyses involving clades 1-5 and CI-III, and the paper is therefore probably best suited for a specialty journal. Additionally, in some instances the methods lack detail, reducing their interpretability and reproducibility.
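To make the threshold-based species delineation concrete, the following is a minimal sketch, not the authors' pipeline. It assumes a 95% ANI cutoff (a commonly used species boundary; the report does not state the value used) and groups genomes by single linkage. The genome names and ANI values are hypothetical.

```python
from itertools import combinations


def cluster_by_ani(ani, genomes, threshold=95.0):
    """Group genomes by single linkage: join any pair with ANI >= threshold."""
    parent = {g: g for g in genomes}

    def find(g):
        # Follow parent links to the group representative (with path halving).
        while parent[g] != g:
            parent[g] = parent[parent[g]]
            g = parent[g]
        return g

    for a, b in combinations(genomes, 2):
        value = ani.get((a, b), ani.get((b, a)))
        if value is not None and value >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for g in genomes:
        groups.setdefault(find(g), []).append(g)
    return sorted(sorted(members) for members in groups.values())


# Hypothetical pairwise ANI values (percent); not taken from the manuscript.
genomes = ["C1_a", "C2_a", "CI_a", "CI_b"]
ani = {
    ("C1_a", "C2_a"): 97.8,
    ("C1_a", "CI_a"): 91.2,
    ("C1_a", "CI_b"): 91.0,
    ("C2_a", "CI_a"): 91.4,
    ("C2_a", "CI_b"): 91.1,
    ("CI_a", "CI_b"): 98.9,
}
species_groups = cluster_by_ani(ani, genomes)
print(species_groups)
```

With these toy values the two main-clade genomes fall into one group and the two cryptic-clade genomes into another, which is the pattern the manuscript's conclusion would predict.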
Major Comments:
Some claims are too strong and are not supported by the data or literature. These include the claim that the rise of community-associated CDI is likely due to the presence of C. difficile in livestock (Lines 53-54; there is far too little evidence to make such a sweeping claim), the statement of apparent rapid population expansion into clades C1-4 (Lines 278-279; this is only shown for certain sequence types and is greatly affected by observation bias), and the statement that these findings "impacts the diagnosis of CDI worldwide" (Lines 37-38; too grandiose given the limited evidence of the clinical importance of the cryptic clades).
Generally, it is hard to discern which sets of genomes and variants were used for each of the bioinformatic analyses that are described. If there are a limited number of genome sets it might be useful to define them in the results to allow the reader to more easily follow along and understand the scope of different analyses.
The methods for the dated phylogenomic analyses would benefit from a more thorough assessment of model assumptions, along with more description of the sources of bias and uncertainty at play. Specific questions are:
Was the temporal signal in the data evaluated?
What are the potential impacts of using a single clock model and demographic prior for such a diverse set of taxa?
Was the clock rate restricted to the cited range of 2.5 × 10⁻⁹ to 1.5 × 10⁻⁸? What clock prior distribution was applied?
Were relaxed clock priors explored?
What went into the selection of the demographic model prior in BEAST? Were alternative models evaluated?
The significant uncertainty in the divergence estimates should be emphasized/listed as a limitation.
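One common exploratory check for temporal signal is a root-to-tip regression: genetic distance from the root is regressed on sampling date, the slope gives a crude rate estimate, and R² indicates how clock-like the data are. The sketch below is illustrative only. The dates and distances are hypothetical, and it assumes the cited range is a per-site, per-year rate.

```python
def root_to_tip_regression(dates, distances):
    """Least-squares fit of root-to-tip distance on sampling date.

    Returns (slope, intercept, r_squared). The slope is a rough clock-rate
    estimate in distance units per unit of time.
    """
    n = len(dates)
    mean_x = sum(dates) / n
    mean_y = sum(distances) / n
    sxx = sum((x - mean_x) ** 2 for x in dates)
    syy = sum((y - mean_y) ** 2 for y in distances)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dates, distances))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy) if syy > 0 else 0.0
    return slope, intercept, r_squared


# Hypothetical sampling years and root-to-tip distances (substitutions/site).
dates = [2000, 2005, 2010, 2015]
distances = [1.0000e-4, 1.0006e-4, 1.0009e-4, 1.0015e-4]

slope, intercept, r_squared = root_to_tip_regression(dates, distances)
# Range quoted in the review; the per-site, per-year unit is an assumption.
in_cited_range = 2.5e-9 <= slope <= 1.5e-8
print(slope, r_squared, in_cited_range)
```

A weak or negative slope, or a low R², would suggest that tip dates alone cannot calibrate the clock, in which case the divergence estimates depend heavily on the rate prior.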
Similarly, the pangenome analyses could be more thoroughly described, and the relevance of the core-genome size changes more robustly explored. Specifically:
How did the core genome change when excluding any of C1-5? Were these changes much different than when excluding CI-III?
The differences between Roary and Panaroo are notable, and potentially important for the microbial genomics community. More details should be provided on these results and how sensitive they are to the input parameters of the respective programs (e.g. collapsing paralogs in Roary and percent identity for orthologs). In addition, it is important to know if any filtering was done with respect to the quality of assemblies, which could have a significant impact on Roary's behavior.
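The leave-one-clade-out comparison asked for above can be sketched as follows. This is a toy illustration, not the output or interface of Roary or Panaroo: the gene presence sets and clade labels are hypothetical, and the 99% presence cutoff for a "core" gene is an assumed parameter.

```python
def core_genome(presence, genomes, min_fraction=0.99):
    """Return genes present in at least min_fraction of the given genomes."""
    genomes = list(genomes)
    needed = min_fraction * len(genomes)
    counts = {}
    for genome in genomes:
        for gene in presence[genome]:
            counts[gene] = counts.get(gene, 0) + 1
    return {gene for gene, count in counts.items() if count >= needed}


def leave_one_clade_out(presence, clade_of, min_fraction=0.99):
    """Core genome size for all genomes, and its change when each clade is dropped."""
    all_genomes = list(presence)
    baseline = len(core_genome(presence, all_genomes, min_fraction))
    change = {}
    for clade in sorted(set(clade_of.values())):
        kept = [g for g in all_genomes if clade_of[g] != clade]
        change[clade] = len(core_genome(presence, kept, min_fraction)) - baseline
    return baseline, change


# Hypothetical gene presence sets and clade labels.
presence = {
    "g1": {"geneA", "geneB", "geneC", "geneD"},
    "g2": {"geneA", "geneB", "geneC"},
    "g3": {"geneA", "geneB", "geneC"},
    "g4": {"geneA", "geneD"},
}
clade_of = {"g1": "C1", "g2": "C1", "g3": "C2", "g4": "CI"}

baseline, change = leave_one_clade_out(presence, clade_of)
print(baseline, change)
```

Reporting the change for every clade, main and cryptic alike, would show whether the core genome expansion seen on excluding CI-III is distinctive or simply what happens when any clade is removed.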
Reviewer #1:
General Assessment:
The work presented by Knight et al. in "Major genetic discontinuity and novel toxigenic species in Clostridioides difficile taxonomy" is of excellent quality and spans several of the themes of eLife. The manuscript provides a thorough and robust examination of publicly available C. difficile genomes to deliver a much-needed update of C. difficile phylogeny, in particular the cryptic clades of C. difficile. However, some further clarifications could be included to confirm whether the cryptic clades of C. difficile, and the 26 unclassified STs (which seemingly form 4 distinct clusters), should indeed be assigned to the Clostridioides genus, distinct from both C. mangenotii and C. difficile.
Specific comments:
Lines 96-97 and Figure 2: Figure 2 suggests the 26 unclassified STs form at least 4 distinct clusters, yet these STs are classified as outliers. Could you please comment on why these are considered outliers? Or do these STs represent new cryptic clades (C-IV, C-V, etc.)? And do these unclassified STs also fit the criteria for the novel independent Clostridioides genomospecies?
Lines 161-162; Table 1: C. mangenotii is referred to as Clostridioides mangenotii on lines 161-162, but is listed as Clostridium mangenotii in Table 1. Was this intentional? Or should this be Clostridioides mangenotii, given that C. difficile is also listed as Clostridioides difficile?
Figure 6: Many of the numbers and symbols on the figure are difficult to see; e.g. in Figure 6A, the values listed above each data point are extremely small. Can these values/symbols be enlarged?
Lines 224-225: Given that C. difficile strains lacking tcdA and tcdB can still cause infections, consider rephrasing "indicating their ability to cause CDI".
Figure 7: As with Figure 6, many of the numbers and symbols on the figure are difficult to see. Can these values/symbols be enlarged?
General comments:
Were the unclassified STs included in the species-wide ANI analyses in Figure 3? If similar analyses were performed for these STs, and given the clusters presented in Figure 2, would this support the idea that they may also fit the criteria for the novel independent Clostridioides genomospecies?
Similarly, were these same unclassified STs included in the BactDating and BEAST analyses? Or the pairwise ANI and 16S rRNA value comparisons in Figure 5? Or the pangenome and toxin gene analyses presented in Figures 6 and 7? And would this add further strength to the idea that these "outliers" could be the first typed representatives of additional genomospecies?
Lastly, your conclusions are a little too on the fence. You have presented sufficient evidence to suggest that the cryptic clades of C. difficile likely represent novel independent Clostridioides genomospecies, but you dilute the importance of this throughout the discussion and conclusions. Although controversial, the evidence provided gives credence to these claims, and the text should be changed to reflect this.