Characterization of Shigella flexneri in northern Vietnam in 2012–2016

Abstract

Introduction. Shigellosis remains a considerable public health concern in developing countries. Shigella flexneri and Shigella sonnei are prevalent worldwide and S. sonnei has been replacing S. flexneri.

Gap Statement. S. flexneri still causes outbreaks of shigellosis in northern Vietnam but limited information is available on its genetic characteristics.

Aim. This study aimed to determine the genetic characteristics of S. flexneri strains from northern Vietnam.

Methodology. This study used 17 isolates from eight incidents, collected in northern Vietnam between 2012 and 2016. The samples were subjected to whole genome sequencing, molecular serotyping, cluster analysis and identification of antimicrobial resistance genes. Additionally, phylogenetic analysis was performed including isolates from previous studies.

Results. Clusters were identified according to spatiotemporal backgrounds. The results suggested that two incidents in Yen Bai province in 2015 and 2016 were derived from a very recent common ancestor. All isolates belonged to phylogroup (PG) 3, which was divided into two sub-lineages. Thirteen of 17 isolates, including those from the Yen Bai incidents, belonged to sub-lineage Sub-1 and were serotyped as 1a. The remaining four isolates belonged to sub-lineage Sub-2 and were the globally predominant serotype 2a. The Sub-1 S. flexneri isolates possessed the gtrI gene, which encodes the glycosyl transferase that determines serotype 1a, with bacteriophage elements in the vicinity.

Conclusion. This study revealed two PG3 sub-lineages of S. flexneri in northern Vietnam, of which Sub-1 might be specific to the region.

Article activity feed

  1. Comments to Author

    1. Methodological rigour, reproducibility and availability of underlying data

    There's still information missing that was previously requested and would be required for others to repeat this work:

    Line 85: I appreciate the extra information included, but what were the cut-offs for defining the genes to be core? What percentage identity and coverage is required? BioNumerics is not freely available software, so it is not easy for a reader to access that information.
    Line 86: As above, what are the default settings? This software is not freely available, so this information is not easy for the reader to access.
    Line 88: As above, what are the default settings?
    Line 94: Was it PHAST or PHASTER that was used for this analysis? PHAST has been superseded by PHASTER (Arndt, D., Grant, J., Marcu, A., Sajed, T., Pon, A., Liang, Y., Wishart, D.S. (2016) PHASTER: a better, faster version of the PHAST phage search tool. Nucleic Acids Res., 2016 May 3).
    Line 95: As requested previously, what substitution model was used and what model was used for calculating the rate of heterogeneity?
    Line 101: Thank you for uploading the genome assemblies.

    2. Presentation of results

    The figure legends ought to be expanded. Currently they do not fully describe what is presented in the figures and lack detail that would help the reader interpret the figures.

    Figure 1: Still no mention of models used to produce the tree.
    Figure 1: For clarity, it would be good to have a key for serotype 1a/2a (red/black) and for dfrA1/A14 (black/red). This information is in the figure legend, but it would be clearer to have it on the actual figure, and the red text for 1a and A14 would then be unnecessary (it makes it hard to read).
    Figure 1: Error in the figure assigning isolate 0228 to the Lao Cai region, which ought to be for isolate P12012.
    Figure 2: Parameters for the BioNJ tree still not included.
    Figure 3: Specific contig information is BSBY01000027, not sequence027. As reference is made to BSBY01000027 in the main text (as it should be), please change sequence027 to the accession number.
    Figure 3: Information that was previously in the figure legend has been removed and ought to have remained. It would be beneficial for the reader to have the following information in the figure legend, to link to the figure: Compared sequences were obtained from S. flexneri 0228 (accession number: CP012735), S. flexneri P15021 (accession number: BSBY01000027) and S. flexneri Y53 (accession number: AF139596).
    Figure 3: Please include information that tells readers that the bands connecting the sequences represent nucleotide identity, and link to the key present on the figure.
    Figure 3: The comparison region has been altered from the previous version, omitting the 5' region of the CP012735 sequence (region previously spanned 4650000-4654000 bp, now it is 4659000-4699000 bp). PHAST analysis seems to reveal the phage region from CP012735 to be the region 4646913-4698561 bp (which is represented in Figure 4, and I feel these should be consistent).
    Figure 4: The figure resolution is rather low, which makes it difficult to read.
    Figure 4: Figure legend needs to be expanded to describe what the BRIG image represents, e.g. that the rings represent BLAST nucleotide identity (presumably?), what sequences are presented within the rings, GC skew and GC% etc.

    3. How the style and organization of the paper communicates and represents key findings

    The organisation and layout of the findings is fine.

    4. Literature analysis or discussion

    Thank you for expanding on the findings and adding further context and discussion.

    Line 55: Please expand on this sentence "Phylogenetically, S. flexneri encompasses seven phylogroups (PGs), excluding S. flexneri serotype 6.", to explain how the phylogroups were determined by Connor et al, and including this reference (used already in your manuscript as number 5). This is important as serotypes aren't specific to a PG (as you see with your work and particularly due to the large number of serotypes present in PG3) and the definition of a phylogroup is not dependent on that. I think it is important for the reader to understand that, so adding this additional information would be helpful.
    Line 141: Please change sequence027 to the accession number: BSBY01000027.
    Line 142: The following "The contig sequence027 (Accession number, BSBY01000027" should be changed to "The contig BSBY01000027" due to the previous change requested.
    Line 155 to 164: Thank you for expanding on these findings. I feel my previous comment was misunderstood (apologies). I was not referring to a transposon, but to transposase genes associated with the phage element in 0228, which presumably would be required for movement of the phage out of and into the genome. Those genes appeared to be lost in isolate P15021. However, examining the BRIG figure, it appears those genes may be present (given homology is detected) but most likely they are located on a different contig. Therefore, my previous comment is not relevant. Thank you for doing this extra analysis.

    5. Any other relevant comments

    The additional information added to the manuscript is much appreciated, particularly the additional information and discussion in the "Results and Discussion" section.

    Please rate the manuscript for methodological rigour

    Good

    Please rate the quality of the presentation and structure of the manuscript

    Good

    To what extent are the conclusions supported by the data?

    Strongly support

    Do you have any concerns of possible image manipulation, plagiarism or any other unethical practices?

    No

    Is there a potential financial or other conflict of interest between yourself and the author(s)?

    No

    If this manuscript involves human and/or animal work, have the subjects been treated in an ethical manner and the authors complied with the appropriate guidelines?

    Yes

  2. This is a study that would be of interest to the field and community. The reviewers have highlighted minor concerns with the work presented. Please ensure that you address their comments. Please provide more detail in the Methods section and ensure that software is consistently cited and its version and parameters included. Please pay particular attention to the points raised by the reviewers regarding the inclusion of details for the methodology employed. Both reviewers also ask for the figures to be amended and made clearer. In the current format this manuscript is very descriptive and would benefit from a more in-depth discussion that puts the findings into context.

  3. Comments to Author

    1. Methodological rigour, reproducibility and availability of underlying data

    The methodology is appropriate but requires additional information in order for other researchers to be able to reproduce this work:

    No mention of settings for BioNJ tree or MST tree (what algorithm was used?). This information needs to be included.
    Line 79: No mention of specific settings for cgMLST. How many alleles were in the core genome MLST scheme? What were the parameters used for defining "core genome"?
    Line 80: What were the settings for AMR analysis (minimum percentage for coverage and identity)?
    Line 82: Were assemblies used for the Snippy analysis or were the reads used? This information should be included.
    Line 82: It would be useful to mention that S. flexneri strain 301 is serotype 2a, as it is relevant to the work undertaken.
    Line 85: What substitution model was used and what model was used for calculating the rate of heterogeneity?
    Line 90: With regard to data availability, as assemblies were produced and used in the analysis, they should be deposited in DDBJ as well as the reads.

    2. Presentation of results

    Table 2: Adding serotype information and/or sub-lineage information to this table would be helpful, particularly with regard to observations made.
    Figure 1: Would be helpful to have the information about the incident included. Could replace province abbreviations with incident information, as the colour coding already denotes the province.
    Figure 1: Additional information regarding settings for MST tree would be useful.
    Figure 1: What are the branch lengths? Allele differences?
    Figure 2: Bootstrap scores are missing from the tree.
    Figure 2: Is the tree unrooted?
    Figure 2: Information about the models used to produce the tree should be included in the figure legend (as with Figure 1).
    Figure 2: Typo: dfr genes (A1/A14) not (A1/A17).
    Figure 3: Annotation on the tree denoting each phylogroup would be helpful, to allow for comparison.
    Figure 3: Again, what parameters were used for the BioNJ tree?
    Figure 3: What are the branch lengths? Allele differences?
    Figure 3: The figure legend is not clear to me. "Run accession numbers of representatives of phylogroups are shown along with nodes": what this means is not clear and I would recommend rewriting. Accession numbers for the representatives of each phylogroup should be included in the figure legend and aren't.
    Figure 4: Needs to be annotated properly. Information should be added to the figure about which sequence belongs to which source (name only; spanning region not required and can be left in the legend), rather than just in the figure legend. The specific contig information for isolate P15021 should be included.

    3. How the style and organization of the paper communicates and represents key findings

    The organisation and layout of the findings is fine.

    4. Literature analysis or discussion

    There is a lot of discussion that could be included to go along with the findings, as well as expanding on the findings discussed:

    Line 116-118: Might be good to mention that the only Sub-2 isolate to have gyrA mutations lies within its own clade compared to the other Sub-2 isolates? This is interesting and worth discussing.
    Line 118: The authors mention "It was likely that Sub-1 and Sub-2 would correspond to Lin-3.1 and Lin-3.2 of PG3 in a recent study, respectively". Please expand on this to explain the reasoning or link back to previous observations made to explain the reasoning. This is another finding that would be interesting to expand on and put into context.
    Line 121: Do all the serotype 1a isolates have this region containing gtrI? If so, is it the same as the region found within P15021? If it isn't the same, how is it different?
    Line 122: What is the definition of "most matched"? Coverage and identity information should be included.
    Line 123: It is not significant that there is a higher degree of sequence coverage and identity (if that is what is meant by "most matched") between CP012735 and P15021 compared to just the gtrI region, because the gtrI region only contains the immediate genes. It is not a "like for like" comparison. It would be more relevant if you compared only the gtrI region of CP012735, P15021 and the original gtrI sequence. However, the overall comparison between CP012735 and P15021 is important on its own.
    Line 124-126: This should be expanded upon. What you have found is that the gene responsible for serotype 1a assignment appears to be carried on a mobile element, which is why the isolates from China group in Sub-2, though they are serotype 1a (which you would expect to group in Sub-1). The authors go on to mention seroconversion being a strategy for survival, and it appears the isolates from China have undergone seroconversion (as they group phylogenetically with serotype 2a isolates), though there is no evidence presented here that this has occurred for the North Vietnam isolates. This is an additional reason to expand on your findings, as it is an interesting discussion point.
    Line 127-130: No mention of loss of transposase genes in the genomic island within the North Vietnam isolates of serotype 1a, suggesting they are unable to undergo seroconversion. This would be interesting to expand upon, particularly with the findings regarding the isolates from China that are serotype 1a and group with the serotype 2a isolates from this study found in North Vietnam.

    5. Any other relevant comments

    General typos:
    Line 46: Gram instead of gram
    Line 121: Kbp instead of Kb (particularly because the authors use Kbp in Figure 4)
    Line 122: nBLAST instead of BLAST

    Overall this is an interesting paper, characterising Shigella flexneri strains from Northern Vietnam. I feel the paper would benefit from including a discussion about the findings presented, rather than just presenting the findings. There are a lot of interesting results here that lack context or expansion and are therefore not highlighted as well as they could be.

    Please rate the manuscript for methodological rigour

    Satisfactory

    Please rate the quality of the presentation and structure of the manuscript

    Satisfactory

    To what extent are the conclusions supported by the data?

    Strongly support

    Do you have any concerns of possible image manipulation, plagiarism or any other unethical practices?

    No

    Is there a potential financial or other conflict of interest between yourself and the author(s)?

    No

    If this manuscript involves human and/or animal work, have the subjects been treated in an ethical manner and the authors complied with the appropriate guidelines?

    Yes

  4. Comments to Author

    The authors characterize 17 Shigella flexneri isolates from 8 incident events that occurred in 5 provinces in Northern Vietnam. All isolates belong to phylogroup 3 and are distributed in 2 sublineages, Sub-1 and Sub-2. The manuscript is concise; however, the tables and figures need to be revised or modified.

    Comments

    1. In Abstract, "The results suggested that two incidents in Yen Bai province in 2015 and 2016 were attributable to a single source": The isolates from the two incidents are genetically closely related, indicating that they are derived from a very recent common ancestor, but they do not necessarily come from a common source. They may likely be attributable to a common source only.

    2. Lines 54-55, "Phylogenetically, S. flexneri encompasses seven phylogroups (PGs), excluding S. flexneri serotype 6, S. flexneri serotype 2a of PG3 is predominant globally": The authors should briefly describe the 7 phylogroups of S. flexneri in the Introduction or Discussion, and provide more details regarding PG3.

    3. Line 65, "3.1 Bacterial isolates": All isolates (genomes) used as references for the phylogenetic studies should be described in Methods.

    4. Table 1: The reference strains used in the phylogenetic analysis should be included in Table 1, with relevant information (serotype, PGs, etc.). The accession numbers for the genomic sequences of the reference strains can be added to this table.

    5. Table 2: The information on incidents and sublineages should be added, since the information has been mentioned in the Results and Discussion.

    6. Figure 1 and Figure 2: Figure 1 and Figure 2 give phylogenetic relationships among the same panel of isolates. The phylogenetic structures constructed using the two methods are highly similar. Figure 2 contains more meaningful data. Figure 1 should be removed.

    7. Lines 109-111, "The distribution of serotypes completely matched the two sub-lineages, and serotype 1a and 2a isolates belonged to Sub-1 and Sub-2, respectively": However, in Figure 2, strain 0228 is serotype 1a but belongs to Sub-2. The description should be revised.

    8. Lines 228-231, "White circles show strains of reference ("REF") and the previous study [5]. The color indicates the province: green, Yen Bai; red, Son La; purple, Dien Bien; yellow, Cao Bang; and light blue, Lao Cai. Branch lengths are shown on the branches. Abbreviations of the province and isolation year are indicated on each node": White circles indicate all reference strains. What is REF? S. flexneri 301? The circles in the figure should be labeled with the isolate/strain names for consistency. The provinces of the isolates recovered are already indicated by the colors illustrated on the figure, so it is not necessary to state them in the figure legend. References (indicated in white) should be indicated in the figure legend. Since the figure was generated using the tools provided in BioNumerics, it can be saved in "Metafile" format and edited using Microsoft PowerPoint.

    7. Figure 2: More information, e.g. year of isolation and incident, can be added in Figure 2. Serotypes 1a/2a, as well as dfrA1/dfrA14, can be listed in the same column in different colors. REF should be replaced by the strain name (S. flexneri 301 or Sf301), as the other 4 references (0228, IB0034, H10, H04) are indicated by strain names.

    8. Lines 236-237, the description "Boxes indicate serotype (1a/2a), dfr genes (A1/A17), and provinces. The color indicative of the province is the same as in Fig. 1. The location of the province is shown on the right": The figure legend is not necessary.

    9. Figure 3: The PGs (phylogroups) and strain names of the reference strains should be indicated. The characteristics of the reference strains should be described in Table 1 and the accession numbers should be given in Methods or Table 1. If Figure 3 was generated using BioNumerics, the authors can generate the figure in "Metafile" format, which can be "degrouped" using PowerPoint to allow the objects (lines, circles, texts) to be modified or revised. Lines 242-243, the description "Run accession numbers of representatives of phylogroups are shown along with nodes. Red nodes indicate the isolates in this study" needs to be revised.

    10. Figure 4: The strains and/or accession numbers for the 3 genetic maps should be indicated in the figure. The text of Lines 248-249 should be revised.

    Please rate the manuscript for methodological rigour

    Satisfactory

    Please rate the quality of the presentation and structure of the manuscript

    Satisfactory

    To what extent are the conclusions supported by the data?

    Strongly support

    Do you have any concerns of possible image manipulation, plagiarism or any other unethical practices?

    No

    Is there a potential financial or other conflict of interest between yourself and the author(s)?

    No

    If this manuscript involves human and/or animal work, have the subjects been treated in an ethical manner and the authors complied with the appropriate guidelines?

    No: The research does not involve human or animal subjects; there are no ethical issues in the research.