Evolutionary genomics reveals variation in structure and genetic content implicated in virulence and lifestyle in the genus Gaeumannomyces

Abstract

Gaeumannomyces tritici is responsible for take-all disease, one of the most important wheat root threats worldwide. High-quality annotated genome resources are sorely lacking for this pathogen, as well as for the closely related antagonist and potential wheat take-all biocontrol agent, G. hyphopodioides. As such, we know very little about the genetic basis of the interactions in this host–pathogen–antagonist system. Using PacBio HiFi sequencing technology we have generated nine near-complete assemblies, including two different virulence lineages for G. tritici and the first assemblies for G. hyphopodioides and G. avenae (oat take-all). Genomic signatures support the presence of two distinct virulence lineages in G. tritici (types A and B), with A strains potentially employing a mechanism to prevent gene copy-number expansions. The CAZyme repertoire was highly conserved across Gaeumannomyces, while candidate secreted effector proteins and biosynthetic gene clusters showed more variability and may distinguish pathogenic and non-pathogenic lineages. A transition from self-sterility (heterothallism) to self-fertility (homothallism) may also be a key innovation implicated in lifestyle. We did not find evidence for transposable element and effector gene compartmentalisation in the genus, however the presence of Starship giant transposable elements may contribute to genomic plasticity in the genus. Our results depict Gaeumannomyces as an ideal system to explore interactions within the rhizosphere, the nuances of intraspecific virulence, interspecific antagonism, and fungal lifestyle evolution. The foundational genomic resources provided here will enable the development of diagnostics and surveillance of understudied but agriculturally important fungal pathogens.

Article activity feed

  1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

    Learn more at Review Commons


    Reply to the reviewers

    General Statements

    We thank all three reviewers for their time and care in reviewing our manuscript, in particular Reviewer 3 for providing a detailed critique that was very useful for planning revisions. We are grateful that all three reviewers indicate that the new genome resources presented in this work are of high-quality and address an existing knowledge gap. We are also grateful for general assessments that the manuscript is 'well-written', and the analyses 'well performed' and 'thorough'.

    We acknowledge Reviewer 3’s legitimate criticism that the assembly and annotation data are not yet publicly available, and would like to assure the reviewing team that we have been pressing NCBI to progress the submission status since before the preprint was submitted. We regret the delay but hope that we can resolve this issue promptly. Furthermore, as some additional fields in the REAT genome annotation are lost during the NCBI submission process, we will ensure that comprehensive annotation files are also added to Zenodo.

    Reviewer 3 also made the general comment that 'the manuscript could greatly benefit from merging the result and discussion sections' and we would naturally be happy to make this adjustment if the journal in question uses that format.

    Description of the planned revisions

    • We will follow suggestions by Reviewer 3 to improve clarity of two figures:

    Figure S9: Please use a more appropriate colour palette. It is difficult to know the copy number based on the colour gradient.

    Figure 5: Consider changing panel B for a similar version of Fig S12. I think it gives a cleaner and more general perspective of the presence of starship elements.

    • We will address the choice of LOESS versus linear regression for investigating the relationship between candidate secreted effector protein (CSEP) density and transposable element (TE) density, as queried by Reviewer 3:

    Lines 140-144: LOESS smoothing functions are based on local regressions and usually find correlations when there are very weak associations. The authors have to justify the use of this model versus a simpler and more straightforward linear regression. My suspicion is that the latter would fail to find an association. Also, there is no significance of Kendall's Tau estimate (p-value).

    We agree with the reviewer: as the more sensitive LOESS approach did not find an association, we expect that linear regression would not find one either, which supports our current conclusions. We will add this negative result to the text.
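    The two checks raised in this exchange, a straight-line fit and a significance estimate for Kendall's tau, can be sketched in a few lines. The following is a minimal illustration only, not the analysis code used for the manuscript: the per-window densities are invented, the function names are hypothetical, and the p-value comes from a simple permutation test rather than whichever method the authors may adopt.

```python
import random
from statistics import fmean


def kendall_tau_b(x, y):
    """Kendall's tau-b rank correlation (handles ties in either variable)."""
    concordant = discordant = tied_x_only = tied_y_only = 0
    for i in range(len(x)):
        for j in range(i + 1, len(x)):
            dx, dy = x[i] - x[j], y[i] - y[j]
            if dx == 0 and dy == 0:
                continue  # tied in both variables: ignored by tau-b
            if dx == 0:
                tied_x_only += 1
            elif dy == 0:
                tied_y_only += 1
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    untied = concordant + discordant
    denom = ((untied + tied_x_only) * (untied + tied_y_only)) ** 0.5
    return (concordant - discordant) / denom if denom else 0.0


def permutation_p_value(x, y, n_perm=2000, seed=0):
    """Two-sided p-value for tau-b, obtained by shuffling y relative to x."""
    rng = random.Random(seed)
    observed = abs(kendall_tau_b(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(kendall_tau_b(x, shuffled)) >= observed:
            hits += 1
    # Add-one correction so the estimate can never be exactly zero.
    return (hits + 1) / (n_perm + 1)


def ols_slope(x, y):
    """Slope of the ordinary least-squares line of y on x."""
    mx, my = fmean(x), fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx


# Hypothetical per-window values: TE density (fraction) and CSEP count.
te_density = [0.10, 0.35, 0.20, 0.50, 0.05, 0.40, 0.25, 0.15]
csep_count = [2, 1, 3, 2, 1, 3, 2, 1]

tau = kendall_tau_b(te_density, csep_count)
p_value = permutation_p_value(te_density, csep_count)
slope = ols_slope(te_density, csep_count)
print(f"tau-b = {tau:.3f}, permutation p = {p_value:.3f}, OLS slope = {slope:.3f}")
```

    Reporting the tau estimate together with its p-value and the linear slope would let a reader judge the negative result without relying on the smoothed curve alone.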

    • We will check for other features associated with the distribution of CSEPs, as queried by Reviewer 3:

    Lines 157-163: Was there any other feature associated with the CSEP enrichment? GC content? Repetitive content? Centromere likely localisation?

    • We will integrate TE variation into the PERMANOVA lifestyle testing, as suggested by Reviewer 3:

    Line 186: Why not to test the variation content of TEs as a factor for the PERMANOVA?

    In reviewing this suggestion, we also spotted an error in our data plotting code, and the PERMANOVA lifestyle result for all genes will be corrected from 17% to 15% in Fig. 4a. Correcting this error does not impact our ultimate results or interpretation.
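    For readers unfamiliar with how a PERMANOVA percentage of this kind arises, the sketch below computes the pseudo-F statistic and the proportion of variation attributed to a grouping factor (R²) from a distance matrix, with a permutation p-value. It is an illustrative reimplementation under stated assumptions, not the authors' pipeline: the two-feature profiles and lifestyle labels are invented, and it is assumed here that the reported percentage corresponds to R².

```python
import math
import random
from itertools import combinations


def permanova_stats(dist, groups):
    """Return (pseudo-F, R2) for a one-factor PERMANOVA on a distance matrix."""
    n = len(groups)
    levels = sorted(set(groups))
    # Total sum of squares from all pairwise squared distances.
    ss_total = sum(dist[i][j] ** 2 for i, j in combinations(range(n), 2)) / n
    # Within-group sum of squares, one term per group.
    ss_within = 0.0
    for level in levels:
        members = [i for i, g in enumerate(groups) if g == level]
        ss_within += sum(
            dist[i][j] ** 2 for i, j in combinations(members, 2)
        ) / len(members)
    ss_among = ss_total - ss_within
    pseudo_f = (ss_among / (len(levels) - 1)) / (ss_within / (n - len(levels)))
    return pseudo_f, ss_among / ss_total


def permanova_p_value(dist, groups, n_perm=999, seed=0):
    """Permutation p-value: shuffle group labels and recompute pseudo-F."""
    rng = random.Random(seed)
    observed, _ = permanova_stats(dist, groups)
    shuffled = list(groups)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        # Small tolerance guards against floating-point noise in equal values.
        if permanova_stats(dist, shuffled)[0] >= observed - 1e-9:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Hypothetical two-feature profiles for six strains (not real data).
profiles = [(0, 0), (1, 0), (0, 1), (5, 5), (6, 5), (5, 6)]
lifestyle = ["pathogen"] * 3 + ["non-pathogen"] * 3
dist = [[math.dist(a, b) for b in profiles] for a in profiles]

pseudo_f, r2 = permanova_stats(dist, lifestyle)
p = permanova_p_value(dist, lifestyle)
print(f"pseudo-F = {pseudo_f:.2f}, R2 = {r2:.1%}, permutation p = {p:.3f}")
```

    With only six samples there are few distinct label arrangements, so the permutation p-value cannot become very small however strong the separation; this is the same sample-size constraint the reviewers raise elsewhere.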

    • To complement the current graphical-based assessment of approximate data normality, we will include additional tests (Shapiro-Wilk for sample sizes

    Line 743: Q-Q plots are not a formal statistical test for normality.

    • One of the main critiques from Reviewer 3 was that, although we already acknowledged low sample sizes being a limitation of this work, the manuscript could benefit from reframing with greater consideration of this factor. They also highlighted a few specific places in the text that could be rephrased in consideration of this:

    Line 267: "Multiple strains" can be misleading about the magnitude.

    Lines 305-307: The fact that there is significant copy number variation between the two GtA strains suggests that the variation in the GtA lineage has not been fully captured and that there may be an unsampled substructure. Although the authors acknowledge the need for pangenomic references, they should recognize this limitation in the sample size of their own study, especially when expressing its size as "multiple strains" (line 267).

    Lines 314-317: Again, the sample size is still very small and likely not representative. It suggests UNSAMPLED substructure even for the UK populations.

    Line 164 (and whole section): I would invite the authors to cautiously revisit the use of the terms "core", "soft core". The sample size is very low, as they themselves acknowledge, and probably not representative of the diversity of Gaeumannomyces.

    We intend to edit the text to address this, including removal of both text and figure references to ‘soft-core’ genes, as we agree the term is likely not meaningful in this case, and removing it has no bearing on the results or interpretation.

    Description of the revisions that have already been incorporated in the transferred manuscript

    • We have amended the text in a number of places for clarity/fluency as suggested by Reviewer 3:

    ii) There need to be an explicit conclusion about the differences between pathogenic Gt and non-pathogenic Gh. Somehow, this is not entirely clear and is probably only a matter of rephrasing.

    Please see new lines 477-478: ‘Regarding differences between pathogenic Gt and non-pathogenic Gh, we found that Gh has a larger overall genome size and greater number of genes.’

    Lines 309-314: The message seems a bit out of context in the paragraph.

    This is valid, these lines have now been removed.

    Lines 392-395: The idea that crop pathogenic fungi are under pressure that favours heterothallism does not take into account the multiple cases of successful pathogenic clonal lineages in which sexual reproduction is absent. This paragraph seems very speculative to me. Please rephrase it.

    Our intention here was the exact reverse, that crop pathogens are under pressure to favour homothallism (as Reviewer 3 points out, anecdotally this often seems to play out in nature). We have rephrased lines 386-390 to hopefully make our stance more explicit: 'Together, this could suggest a selective pressure towards homothallism for crop fungal pathogens, and a switch from heterothallism in Gh to homothallism in Gt and Ga may, therefore, have been a key innovation underlying lifestyle divergence between non-pathogenic Gh and pathogenic Gt and Ga.'

    Lines 463-464: Please refer to the analyses when discussing the genetic divergence.

    We have rephrased this sentence to make our intended point clearer, please see new lines 459-461: ‘If we compare Ga and Gt in terms of synteny, genome size and gene content, the magnitude of differences does not appear to be more pronounced than those between GtA and GtB.’

    • We have also fixed the following typographic errors highlighted by Reviewer 3:

    Line 399: You mean, Fig 4C?

    Line 722: You missed "trimAI"

    Lines 723-727: Missing citations for "AMAS" and RAxML-NG, "AHDR" and "OrthoFinder"

    • We have added genome-wide RIP estimates to Supplementary Table S1 as requested by Reviewer 3:

    Lines 416-422: Please provide the data related to the genome-wide estimates of RIP.

    • We have added a note clarifying that differences in overall genome size between lineages are not fully explained by differences in gene copy-number (lines 406-408: 'We should note that the total length of HCN genes was not sufficiently large to account for the overall greater genome size of GtB compared to GtA (Supplemental Table S1).') in response to a comment from Reviewer 3:

    Line 396: The difference in duplicated genes raises the question of whether there are differences in overall genome size between lineages and, if so, whether they can be explained by the presence of genes.

    • We have made an alteration to the author order and added equal second-author contributions.

    Description of analyses that authors prefer not to carry out

    • In response to our analysis regarding the absence of TE-effector compartmentalisation in this system, Reviewer 1 requested additional analyses:

    While TE enrichment is typically associated with accessory compartments, it is not a defining feature. To bolster the authors' claim, it is essential to demonstrate that there is no bias in the ratio of conserved and non-conserved genes across the genomes.

    We believe that there are two slightly different compartmentalisation concepts being somewhat conflated here – (1) the idea of compartments where TEs and virulence proteins such as effectors are significantly colocalised in comparison with the rest of the genome, and (2) the idea of compartments containing gene content that is not shared in all strains (i.e. accessory). The two may overlap – as Reviewer 1 states, accessory compartments may also be enriched with TEs – but not necessarily. We specifically address the first concept in our text, and we appreciate Reviewer 3’s response on this subject:

    There is a clear answer for the compartmentalisation question. The authors favour the idea of "one-compartment" with compelling analyses.

    We believe that the second concept of accessory compartments is shown to be irrelevant in this case from our GENESPACE results (see Fig. 2), which demonstrate that gene content is conserved, broadly syntenic even, across strains, with no clear evidence of accessory compartments or chromosomes regarding gene content. We have already acknowledged that other mechanisms of compartmentalisation beyond TE-effector colocalisation may be at play (as seen from our exploration of effector distributions biased towards telomeres, see section from line 156: ‘Although CSEPs were not broadly colocalised with TEs, we did observe that they appeared to be non-randomly distributed in some pseudochromosomes (Fig. 3a)…’).

    • Reviewer 1 questioned the statement that higher level of genome-wide RIP is consistent with lower levels of gene duplication:

    L422: Is the highest RIP rate in GtA consistent with its low levels of gene duplication? Does this suggest that duplicated sequences in GtA are no longer recognizable due to RIP mutations? This seems counterintuitive, as RIP is primarily triggered by gene duplication.

    Our understanding is that, while RIP can directly mutate coding regions, it predominantly acts on duplicated sequences within repetitive regions such as TEs (https://genomebiology.biomedcentral.com/articles/10.1186/s13059-020-02060-w), which has a knock-on effect of reducing TE-mediated gene duplication. In Neurospora crassa, where RIP was first discovered and thus the model species for much of our understanding of the process, a low number of gene duplicates has been linked to the activity of RIP (https://www.nature.com/articles/nature01554). We therefore believe the current text is reasonable.

    • Reviewer 2 stated that experimental validation of gene function is required to make clear links to lifestyle or pathogenicity:

    In my eyes, the study has two main limitations. First of all, the research only concerns genomics analyses, and therefore is rather descriptive and observational, and as such does not provide further mechanistic details into the pathogen biology and/or into pathogenesis. This is further enhanced by the lack of clear observations that discriminate particular species/lineages or life styles from others in the study. Some observations are made with respect to variations in candidate secreted effector proteins and biosynthetic gene clusters, but clear links to life style or pathogenicity are missing. To further substantiate such links, lab-based experimental work would be required.

    We agree that, in an ideal world, supporting wet-lab experimental evidence of gene function would be included. Unfortunately, transformation has not yet been successfully developed in this system (see lines 33-35: ‘There have also been considerable difficulties in producing a reliable transformation system for Gt, preventing gene disruption experiments to elucidate function (Freeman and Ward 2004).’). This is not for lack of trying: after 18 months of effort using all available transformation techniques and selectable markers, neither Gt nor Gh was transformable. Undertaking that challenge has proven to be far beyond the scope of this paper, the purpose of which was to generate and analyse high-quality genomic data, a major task in itself. We again appreciate Reviewer 3’s response to this point, agreeing that it is out of scope for this work:

    I just want to respectfully disagree with reviewer #2 about the need for more experimental laboratory work, as in my opinion it clearly goes beyond the intention and scope of the submitted work. This could be a limitation that would depend on the chosen journal and its specific format and requirements. Finally, I think it would suffice for the authors to discuss on the lack of in-depth experimental work as part of the limitations of their overall approach.

    As per the suggestion by Reviewer 3, we will add text to address the absence of in-depth experimental work within the scope of this study.

    • Reviewer 3 suggested we might 'consider including formal population differentiation estimators'; however, as they themselves highlighted, our sample sizes are too small to produce reliable population-level statistics.

    • Reviewer 3 raised the disparity in the appearance of branches at the root of phylogenetic trees in various figures:

    Figure 4a (and Figs S5, S13): The depicted tree has a trichotomy at the basal node. Please correct it so Magnaporthiopsis poae is resolved as an outgroup, as in Fig. S17.

    All the trees were rooted with M. poae as the outgroup. Although it may seem counterintuitive, a trifurcation at the root is the correct outcome when rerooting a bifurcating tree; please see this discussion, which includes the developers of both leading phylogeny visualisation tools, ggtree and phytools (https://www.biostars.org/p/332030/). Although it is possible to force a bifurcating tree after rooting by positioning the root along an edge, the resulting branch lengths can be misleading, so in cases where we wanted to include meaningful branch lengths in the figure (i.e. estimated from DNA substitution rates, in Figures 4a, S5 and S13) we have not circumvented the trifurcation. In Fig. S17 meaningful branch lengths have not been included and the tree only represents the topology, resulting in the appearance of a bifurcation at the root.
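    The point about the basal trifurcation can be illustrated with a toy tree. The sketch below is a minimal, hypothetical example: the tip names reuse abbreviations from the text, but the topology and branch lengths are invented and it does not reproduce any particular tool's rooting routine. It shows that rooting at the node where the outgroup attaches leaves three descendants and alters no branch lengths, whereas forcing a bifurcation requires an arbitrary choice of where to split the outgroup branch.

```python
# Unrooted bifurcating tree as an adjacency map: node -> {neighbour: branch length}.
# Topology and lengths are hypothetical and chosen only for illustration.
unrooted = {
    "M_poae": {"n1": 0.40},
    "Gh": {"n1": 0.10},
    "Gt": {"n2": 0.05},
    "Ga": {"n2": 0.06},
    "n1": {"M_poae": 0.40, "Gh": 0.10, "n2": 0.08},
    "n2": {"Gt": 0.05, "Ga": 0.06, "n1": 0.08},
}


def root_at_outgroup_node(tree, outgroup):
    """Root at the internal node the outgroup attaches to; no lengths change."""
    (attachment,) = tree[outgroup]  # a tip has exactly one neighbour
    return attachment, dict(tree[attachment])


def root_on_outgroup_edge(tree, outgroup, fraction):
    """Place a new root along the outgroup branch, splitting its length."""
    (attachment,) = tree[outgroup]
    length = tree[outgroup][attachment]
    return {outgroup: fraction * length, attachment: (1 - fraction) * length}


# Rooting at the attachment node: three children, i.e. a basal trifurcation.
root, children = root_at_outgroup_node(unrooted, "M_poae")
print(root, "has", len(children), "children:", children)

# Rooting on the edge: two children, but the split point is not given by the data.
split_a = root_on_outgroup_edge(unrooted, "M_poae", 0.5)
split_b = root_on_outgroup_edge(unrooted, "M_poae", 0.9)
print("midpoint split:", split_a)
print("alternative split:", split_b)
```

    Both edge-rooted versions are bifurcating and preserve the total length of the outgroup branch, yet they draw the outgroup at different distances from the root, which is why the displayed lengths around a forced root can mislead.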

    • Reviewer 3 suggested that the discussion on giant Starship TEs resembled more of a review:

    Lines 434-451: This section resembles more a review than a discussion of the results of the present work. This also highlights the lack of analysis on the genetic composition and putative function of the identified starship-like elements.

    The reviewer has a valid point. However, Starships are a recently discovered and thus underexplored genetic feature that readers from the wider mycology/plant pathology community may not yet be aware of. We believe it is warranted to include some additional exposition to give context for why their discovery here is novel, interesting and unexpected. We are naturally keen to investigate the make-up of the elements we have found in this lineage; however, that will require a substantial amount of further work. Analysis of Starships is not trivial; for example, the starfish tool is still under development and a limited number of species have been used to train it. How best to compare elements is also an active area of investigation, as they are dynamic in their structure and may include genes originating from the host genome or a previous host. For this reason, we believe it is out of scope to interrogate them alongside the other foundational genomic data presented in this paper.

  2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


    Referee #3

    Evidence, reproducibility and clarity

    Summary:

    The manuscript "Evolutionary genomics reveals variation in structure and genetic content implicated in virulence and lifestyle in the genus Gaeumannomyces" by Rowena Hill and collaborators is a thorough, well-planned and designed work. They have described 9 almost complete new assemblages, from their most general characteristics to their genetic content and implications. I am very pleased with the quality and completeness of this work and agree that it provides a very useful resource and framework for further research on this important organism.

    The three main motivations of the present study were:

    i) Are there genomic signatures distinguishing Gt A/B virulence lineages?

    ii) How do gene repertoires differ between pathogenic Gt and non-pathogenic Gh?

    iii) Is there evidence of genome compartmentalisation in Gaeumannomyces?

    a) The authors themselves recognise the low number of samples in their work (Lines 453-454) and this limitation hampers the establishment of a clear association between lineage-specific virulence and genomic signatures. I would argue that the present work needs to be reframed factoring this main limitation.

    b) There need to be an explicit conclusion about the differences between pathogenic Gt and non-pathogenic Gh. Somehow, this is not entirely clear and is probably only a matter of rephrasing.

    c) There is a clear answer for the compartmentalisation question. The authors favour the idea of "one-compartment" with compelling analyses.

    Major comments:

    The authors have not published the genomic data. Therefore, it is impossible to audit the quality of the assemblies and impedes its reproducibility. It is also bad practice by current scientific standards.

    I strongly believe that the manuscript could greatly benefit from merging the result and discussion sections. It would be easier for the reader to follow their entire logic. This is of course something optional and contingent on the journal format.

    Minor and specific comments:

    RESULTS

    • Lines 140-144: LOESS smoothing functions are based on local regressions and usually find correlations when there are very weak associations. The authors have to justify the use of this model versus a simpler and more straightforward linear regression. My suspicion is that the latter would fail to find an association. Also, there is no significance of Kendall's Tau estimate (p-value).

    • Lines 157-163: Was there any other feature associated with the CSEP enrichment? GC content? Repetitive content? Centromere likely localisation?

    • Line 164 (and whole section): I would invite the authors to cautiously revisit the use of the terms "core", "soft core". The sample size is very low, as they themselves acknowledge, and probably not representative of the diversity of Gaeumannomyces.

    • Figure 4a (and Figs S5, S13): The depicted tree has a trichotomy at the basal node. Please correct it so Magnaporthiopsis poae is resolved as an outgroup, as in Fig. S17.

    • Line 186: Why not to test the variation content of TEs as a factor for the PERMANOVA?

    • Figure S9: Please use a more appropriate colour palette. It is difficult to know the copy number based on the colour gradient.

    • Figure 5: Consider changing panel B for a similar version of Fig S12. I think it gives a cleaner and more general perspective of the presence of starship elements.

    DISCUSSION

    • Line 267: "Multiple strains" can be misleading about the magnitude.

    • Lines 305-307: The fact that there is significant copy number variation between the two GtA strains suggests that the variation in the GtA lineage has not been fully captured and that there may be an unsampled substructure. Although the authors acknowledge the need for pangenomic references, they should recognize this limitation in the sample size of their own study, especially when expressing its size as "multiple strains" (line 267).

    • Lines 309-314: The message seems a bit out of context in the paragraph.

    • Lines 314-317: Again, the sample size is still very small and likely not representative. It suggests UNSAMPLED substructure even for the UK populations.

    • Lines 392-395: The idea that crop pathogenic fungi are under pressure that favours heterothallism does not take into account the multiple cases of successful pathogenic clonal lineages in which sexual reproduction is absent. This paragraph seems very speculative to me. Please rephrase it.

    • Line 396: The difference in duplicated genes raises the question of whether there are differences in overall genome size between lineages and, if so, whether they can be explained by the presence of genes.

    • Line 399: You mean, Fig 4C?

    • Lines 416-422: Please provide the data related to the genome-wide estimates of RIP.

    • Lines 434-451: This section resembles more a review than a discussion of the results of the present work. This also highlights the lack of analysis on the genetic composition and putative function of the identified starship-like elements.

    • Lines 463-464: Please refer to the analyses when discussing the genetic divergence. Consider including formal population differentiation estimators.

    METHODS

    • Line 722: You missed "trimAI"

    • Lines 723-727: Missing citations for "AMAS" and RAxML-NG, "AHDR" and "OrthoFinder"

    • Line 743: Q-Q plots are not a formal statistical test for normality.

    Referees cross-commenting

    I agree with my peer reviewers and appreciate that we have shared common concerns and suggestions. I also agree with their comments.

    I just want to respectfully disagree with reviewer #2 about the need for more experimental laboratory work, as in my opinion it clearly goes beyond the intention and scope of the submitted work. This could be a limitation that would depend on the chosen journal and its specific format and requirements. Finally, I think it would suffice for the authors to discuss on the lack of in-depth experimental work as part of the limitations of their overall approach.

    Significance

    The work presented by Hill and co-workers contributes to the understanding of the genetic basis of host-pathogen interactions and evolutionary dynamics in the important fungus responsible for wheat "take-all-disease", Gaeumannomyces tritici. By providing 9 new near-complete assemblages, this work will provide a valuable resource for research on this agronomically important organism. This work sets the stage for developing a global pangenome of G. tritici that can expand analyses of its population structure and specific genetic elements that are associated with its virulence.

  3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


    Referee #2

    Evidence, reproducibility and clarity

    In this study, the authors present genome assemblies for nine strains of the genus Gaeumannomyces, including 5 strains that belong to two different virulence lineages of the wheat take-all decline pathogen G. tritici, 2 strains of the antagonist G. hyphopodioides and 2 of the oat take-all decline pathogen G. avenae. The authors assess gene catalogs, CAZyme repertoires, effector catalogs, TE abundance, compartmentalisation and the occurrence of Starship giant transposable elements. Overall, there are no striking differences that discriminate the genomes and that can be linked to differential life styles. Weak correlations were found for some of the different lineages, but no functional analyses have been performed to further solidify such differences.

    Significance

    • Overall, the study fills a knowledge gap, given that few, if any, high-quality genomes for the soil-borne fungi of the Gaeumannomyces genus are available. The genome assemblies are of high quality, and the work that is presented is mainly solid and robust. The analyses are well performed, sound and informative.

    • In my eyes, the study has two main limitations. First of all, the research only concerns genomics analyses, and therefore is rather descriptive and observational, and as such does not provide further mechanistic details into the pathogen biology and/or into pathogenesis. This is further enhanced by the lack of clear observations that discriminate particular species/lineages or life styles from others in the study. Some observations are made with respect to variations in candidate secreted effector proteins and biosynthetic gene clusters, but clear links to life style or pathogenicity are missing. To further substantiate such links, lab-based experimental work would be required.

  4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


    Referee #1

    Evidence, reproducibility and clarity

    The manuscript by Hill et al. presents nearly complete genomes of nine Gaeumannomyces strains, including both phytopathogenic and non-pathogenic (symbiotic) fungi. The manuscript is well-written, and the data it presents are of high quality, offering implications for understanding the evolution and diversification of Magnaporthales fungi, which encompass economically important phytopathogenic species such as Gaeumannomyces graminis and Pyricularia oryzae. I believe that the determination of these nearly complete genomes alone justifies publication. However, I have some concerns as described below.

    Major concern:

    One potential criticism pertains to whether the authors' assertion that Gaeumannomyces taxa have one-compartment genomes is fully supported by the data. The authors demonstrate in this manuscript that transposable elements (TE) and putative effector genes (CSEPs) are not co-localized in the Gaeumannomyces genomes. However, this evidence may not be robust enough to substantiate their claim. The concept of two- or multi-speed genomes suggests that fungal genomes consist of compartments that differ in the rate of evolution but not necessarily in TE content. While TE enrichment is typically associated with accessory compartments, it is not a defining feature. To bolster the authors' claim, it is essential to demonstrate that there is no bias in the ratio of conserved and non-conserved genes across the genomes.

    Minor concern:

    L422: Is the highest RIP rate in GtA consistent with its low levels of gene duplication? Does this suggest that duplicated sequences in GtA are no longer recognizable due to RIP mutations? This seems counterintuitive, as RIP is primarily triggered by gene duplication.

    In my opinion, the analysis of the genomic differences facilitating parasitic and symbiotic lifestyles seems somewhat weak.

    Significance

    This manuscript offers new genomic insights into economically important phytopathogenic fungal species, and sheds light on the diversification of parasitic and symbiotic fungi during evolution. While the analyses conducted are mostly appropriate and reasonable, they do not yield particularly surprising findings.