Genome Architecture Reveals Hidden Strain-Level Diversity in the Highly Conserved Fish Pathogen Nocardia seriolae

Abstract

Fish nocardiosis, caused by Nocardia seriolae, poses a persistent threat to the aquaculture industry. Yet the genomic determinants underlying strain-level diversity and adaptation remain poorly understood because of the high nucleotide conservation of this species. The objective of this study was to characterize genome-level variation among all publicly available complete genomes using integrative comparative genomics approaches that extend beyond nucleotide identity metrics. Nine complete genomes were analyzed using average nucleotide identity, whole-genome structural rearrangement analysis, single-copy phylogenomics, genomic island prediction, pangenome reconstruction, functional annotation, and antimicrobial resistance and virulence profiling. Although average nucleotide identity values confirmed extreme nucleotide conservation across all strains, extensive strain-specific structural rearrangements, including inversions, translocations, and duplications, were detected. Phylogenomic reconstruction resolved geographically associated lineages despite minimal nucleotide divergence. Pangenome analyses supported an open pangenome dominated by a large, conserved core genome, with limited but persistent accessory gene content and core-biased gene duplication. Functional profiling revealed enrichment of transcriptional and metabolic processes, while resistance and virulence analyses identified only conserved intrinsic determinants shared across all strains. These findings demonstrate that genome architecture and pangenome dynamics provide critical resolution for understanding N. seriolae diversification. The study highlights the importance of integrating structural genomics and phylogenomics for strain tracking and surveillance in aquaculture systems.
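
As an illustration of what an "open pangenome" claim usually rests on, the sketch below fits a Heaps'-law power curve, P(N) = k * N^gamma, to cumulative pangenome sizes by least squares on log-transformed values. It is a minimal sketch under stated assumptions, not the authors' method: the function name `fit_heaps_gamma` is hypothetical, and the input values are synthetic numbers generated from a known curve, not results from this study. An exponent gamma > 0 is conventionally read as open, and with only nine genomes the estimate would carry wide uncertainty.

```python
import math


def fit_heaps_gamma(pangenome_sizes):
    """Fit log P(N) = log k + gamma * log N by ordinary least squares.

    pangenome_sizes[i] is the cumulative number of gene families
    after adding genome i + 1. Returns (k, gamma).
    """
    xs = [math.log(n) for n in range(1, len(pangenome_sizes) + 1)]
    ys = [math.log(p) for p in pangenome_sizes]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    gamma = sxy / sxx
    k = math.exp(mean_y - gamma * mean_x)
    return k, gamma


# Synthetic input (assumption, not study data): P(N) = 7000 * N**0.05
synthetic = [7000 * n ** 0.05 for n in range(1, 10)]
k, gamma = fit_heaps_gamma(synthetic)
print(f"k = {k:.0f}, gamma = {gamma:.3f}")
```

In practice the curve is usually averaged over many random genome orderings rather than fitted to a single ordering as done here.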

Article activity feed

  1. Thank you for addressing the comments and suggestions. Your article was sent back out for re-review, and before we can accept your manuscript we would like to invite you to make minor amendments in line with the reviewers’ report.

  2. Comments to Author

    First, I would like to thank the authors for addressing the comments made in the previous round of revision. The effort put into addressing each of the reviewers' comments is evident, and the new version is expanded and stronger. However, some points remain:

    - Regarding the previous comment about the reproducibility of the code, the details given in the methods address it only partially. If any code was used for this study, whether terminal commands or a workflow, it should be shared as well. Repositories such as GitHub or FigShare can be used for this. Please consider this option if it applies to your study.
    - Why was the most divergent strain used as the reference for the genomic architecture analysis? Does this choice affect the final result on the genomic architecture diversity shown by the other strains?
    - Figure 2A: Showing a picture of the host next to each strain is certainly helpful. However, it can be confusing when associating each host with a country. I suggest using colors or labels next to the leaves to make the tree easier to read.
    - Line 178: The minimum identity is set at 90% for ABRicate. However, Table 2 shows a lower identity for both the resistance and virulence genes reported.
    - While the question about the number of genomes used for the pangenome analysis has been correctly addressed by the authors, the conclusion about the openness of this pangenome remains overstated, considering that only 9 genomes were used. I suggest revisiting these conclusions so that they more accurately reflect the limitations imposed by the sample size.

    Overall, the manuscript is more detailed than before, and I would like to thank the authors again for their efforts.
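
The mismatch between the stated ABRicate identity cutoff and the identities in Table 2 is the kind of discrepancy a small script can surface. The following is a minimal sketch, assuming a tab-separated ABRicate report with `GENE` and `%IDENTITY` columns; the function name `hits_below_identity`, the file name, and the gene names are hypothetical placeholders, not data from the manuscript.

```python
import csv
import io


def hits_below_identity(tsv_text, min_identity=90.0):
    """Return (gene, identity) pairs whose %IDENTITY is below the cutoff."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    flagged = []
    for row in reader:
        identity = float(row["%IDENTITY"])
        if identity < min_identity:
            flagged.append((row["GENE"], identity))
    return flagged


# Hypothetical two-hit report (placeholder values, not from Table 2)
example = (
    "#FILE\tGENE\t%COVERAGE\t%IDENTITY\n"
    "strain1.fasta\tgeneA\t100.00\t98.50\n"
    "strain1.fasta\tgeneB\t95.20\t84.10\n"
)
print(hits_below_identity(example))
```

Any hit returned by such a check would suggest that the reported threshold and the threshold actually used differ, which is worth stating explicitly in the methods.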

    Please confirm that no generative AI tools or large language models have been used to generate this peer review report or to assist with any part of the peer review process.

    I confirm no generative AI tools were used in preparation of this review.

    Please rate the manuscript for methodological rigour

    Satisfactory

    Please rate the quality of the presentation and structure of the manuscript

    Satisfactory

    To what extent are the conclusions supported by the data?

    Partially support

    Do you have any concerns of possible image manipulation, plagiarism or any other unethical practices?

    No

    Is there a potential financial or other conflict of interest between yourself and the author(s)?

    No

    If this manuscript involves human and/or animal work, have the subjects been treated in an ethical manner and the authors complied with the appropriate guidelines?

    Yes

  3. Comments to Author

    The manuscript presents a genomic exploration of a pathogen that is important to the fish industry. The analysis performed by the authors is therefore relevant and of interest to the industry. However, certain points should be addressed to strengthen the work.

    Title: The title introduces the reader to concepts such as homogeneity and pangenomics. However, homogeneity is mentioned only in the title and is not discussed further in the manuscript. I suggest addressing the use and meaning of "homogeneity" in relation to the objectives of the manuscript. It is understandable that the manuscript compares genomes and finds that they do not show much variation. However, "homogeneity" might not be the correct word to describe these differences and the variation between genomes.

    Introduction: The first paragraph does a good job of describing the problems surrounding this pathogen. However, the manuscript would benefit from shortening it, so as to keep the overall discussion focused on the genomics of N. seriolae. The important points here would be the economic problems that N. seriolae causes and a brief introduction to the pathogenic process, so as to address virulence factors.

    Objective: The objective, as stated from line 80 onward, reads as ambitious. Genomics at this stage is mostly an exploratory tool, and it helps to define objectives for further studies. In this case, even if the findings contribute knowledge for further studies of, for example, species diversity, they do not necessarily resolve this question completely, because the available information is limited. I suggest simplifying the general aim of the work and reducing its scope to take into account the limits of genomic analysis.

    Materials and Methods: I have a concern regarding the use of 7 genomes in the overall pangenomic analysis. It is understood that complete genomes are necessary for the objectives of the further analyses, such as pathogenicity islands. However, when constructing the pangenome of a species, all genomic data from that species should be used, since completeness is not an issue in pangenomic analysis. I suggest using both complete and draft genomes of N. seriolae for the pangenomic analysis only, and continuing with only the 7 complete genomes for the identification of genomic islands. As a side note, at line 109, the authors should specify the version of IslandViewer used for the analysis.

    Results: Certain issues with the figures showing the results should be addressed. Supplementary Figure 3 would be easier to understand if it used the strain names rather than the GenBank accession numbers. Supplementary Figure 4 needs a more extensive description, since it is not properly described and is difficult to understand as it is. Figures 3 and 4 would benefit from larger labels. For Figure 4, a better description would help readers not familiar with pangenomic analysis. Also regarding Figure 4, it is understood that the phylogenomic tree building described in the methods is used here. However, Supplementary Figure 3 also provides a tree built with Neighbor-Joining. It would be better if the method for this tree were described too. I also suggest showing a larger phylogenetic tree in the main figures.

    Discussion: A previous work with 2 sub-clades is mentioned. It would be interesting to detail how the two works differ, which could help readers understand the benefits of comparative genomic analysis in the study of these pathogens.

    Overall, the work lays a foundation by raising questions about the pathogen N. seriolae. However, using all the information available would be of great importance to the overall objective of the study.

    Regarding reproducibility: It is not clear where the code used is deposited.

    Please rate the manuscript for methodological rigour

    Satisfactory

    Please rate the quality of the presentation and structure of the manuscript

    Satisfactory

    To what extent are the conclusions supported by the data?

    Strongly support

    Do you have any concerns of possible image manipulation, plagiarism or any other unethical practices?

    No

    Is there a potential financial or other conflict of interest between yourself and the author(s)?

    No

    If this manuscript involves human and/or animal work, have the subjects been treated in an ethical manner and the authors complied with the appropriate guidelines?

    Yes

  4. Comments to Author

    The proposed methodological approaches are appropriate for addressing the objectives. However, the analyses and the interpretation of the results were discussed superficially. Given that the species is considered to have low genetic diversity, it would be more suitable to include higher-resolution analyses, such as variant calling, genomic rearrangements, or recombination analysis, considering the syntenic variation results. ANI analyses would not be sufficient to detect differences between strains. I recommend employing alternative methods for tree construction, such as maximum likelihood or Bayesian approaches. Details regarding strain-specific genes, accessory genes, and shared or unique genomic islands were not adequately discussed or annotated. Additionally, the presence of virulence genes was not described in relation to specific strains or their phenotypes, which would have been valuable to reinforce the study objectives. Caution is needed with terms such as "strong correlation" (lines 203-204), which would be more appropriate for describing a statistically significant relationship. The abstract does not present the relevant results (significance of the open pangenome, mobile elements, and sequence similarity results).

    Regarding the results, the figures and tables should show strain names or accession numbers. Additionally, the figures lack proper formatting, and the font sizes are not appropriate (Figure 3 and the supplementary figures). In the section on single-copy phylogenomic studies, the phylogeny and ANI analyses could be better represented, as this would enhance the discussion of clustering and its relationship with phenotypic characteristics (e.g., host, countries, years, or other ecological traits). Furthermore, improved versions of Supplementary Figures 1 and 3 could serve as main figures. The discussion is descriptive and general, and it lacks insight into the functional roles of specific genes. Furthermore, the interpretation of the phylogenetic trees is not clear (lines 284, 289).

    The study highlights a high level of genetic uniformity in the species ("a high degree of uniformity in their genomes"). However, Figure 2 shows substantial structural differences (e.g., inversions, translocations) between genomes, while Table 1 and Supplementary Figure 5 indicate variability in accessory genes. Another concern is that ANI analysis alone is not sufficient to evaluate differences between strains. For these reasons, I suggest a substantial revision of the manuscript.

    Please rate the manuscript for methodological rigour

    Satisfactory

    Please rate the quality of the presentation and structure of the manuscript

    Satisfactory

    To what extent are the conclusions supported by the data?

    Not at all

    Do you have any concerns of possible image manipulation, plagiarism or any other unethical practices?

    No

    Is there a potential financial or other conflict of interest between yourself and the author(s)?

    No

    If this manuscript involves human and/or animal work, have the subjects been treated in an ethical manner and the authors complied with the appropriate guidelines?

    Yes