Telomere-to-telomere African wild rice (Oryza longistaminata) reference genome reveals segmental and structural variation

Abstract

Rice (Oryza sativa) is one of the most important staple food crops worldwide, and its wild relatives serve as an important gene pool for its breeding. Compared with cultivated rice species, African wild rice (Oryza longistaminata) has several advantageous traits, such as resistance to biotic stresses, clonal propagation via rhizomes, and increased biomass production. However, previous O. longistaminata genome assemblies have been hampered by gaps and incompleteness, restricting detailed investigation of the genome. To streamline breeding endeavors and facilitate functional genomics studies, we generated a 343-Mb telomere-to-telomere (T2T) genome assembly for this species, covering all telomeres and centromeres across the 12 chromosomes. This newly assembled genome is markedly improved over previous versions. Comparative analysis revealed a high degree of synteny with previously published genomes. A large number of structural variations were identified between O. longistaminata and O. sativa. A total of 2,466 segmentally duplicated genes were identified, and these were enriched in cellular amino acid metabolic processes. We detected a slight expansion of some subfamilies of resistance genes and transcription factors. This newly assembled T2T genome of O. longistaminata provides a valuable resource for the exploration and exploitation of beneficial alleles present in wild relatives of cultivated rice.

Article activity feed

  1. This work has been peer reviewed in GigaScience (see https://doi.org/10.1093/gigascience/giaf074), which carries out open, named peer-review. These reviews are published under a CC-BY 4.0 license and were as follows:

    Reviewer 1: Francois Sabot

    The manuscript from Guang et al. deals with a T2T assembly for the wild perennial African rice Oryza longistaminata. Using the latest technologies and approaches, the authors provide a high-quality assembly for this wild species, making it a valuable resource for understanding rice evolution. While the assembly itself is of high quality, the interpretation of some biological results, in particular those concerning the NBS-LRR genes, is quite odd in my opinion and needs to be refined. That is why I think the manuscript should be published, but only after major corrections.

    In detail:

    • Introduction: I am not sure that highlighting the exceptional biomass of O. longistaminata is a good idea, as this plant has a very high silicon content, which makes its biomass difficult to use.

    • Methods: Most of the command options and command lines are not available. Please provide them, at least as a text file in the supplementary data. In addition, some of the references for tools are missing. Finally, please provide the accession number of the assembled plant.
    • Assembly itself: O. longistaminata is an outcrossing, heterozygous organism. Did you obtain the two haplotypes?
    • Comparison with the previous O. longistaminata genome: is the inversion in the middle of Chr6 specific, or is it due to an error in the previous assembly?
    • Table 1: what do you mean by "Total size of assembled genomes (bp) 331,045,917"? What is the residual percentage of N?
    • Figure 1 and others: please present the legends differently; as they stand, they may be confused with the main text. In addition, check the legends for spelling and the figure size (e.g., 3b) for legibility.
    • SyRI/MUMmer analysis: did you set the minimum size at 1 kb? What was the order of query versus reference? Could a BED file with the positions be provided?
    • SD: is there a statistical link between chromosome size and the number of SDs? It could explain why the first four chromosomes have more SDs. In general, the data lack statistics.
    • GO in SD: is there any statistical validation?
    • Genome comparison: please provide the accession numbers of the genomes you used for comparison.
    • NBS-LRR: the O. longistaminata genome has 215 genes, versus 116 to 289 for other Oryza species, so I cannot see any contraction or expansion. In addition, the text here is odd: it starts by speaking of contraction and then moves on to expansion.
    • TF analysis: the African assemblies are quite poor, I think, which explains the discrepancy. For O. glaberrima, did you check the one from Tranchant-Dubreuil et al., 2023?
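The statistical checks requested in the SD and GO points above could take roughly the following form. This is only an illustrative sketch, not the authors' or the reviewer's method: the function names are hypothetical, and the choice of a Spearman rank correlation (chromosome size versus SD count) and a one-sided hypergeometric test (GO term enrichment among SD genes) is an assumption about what "statistical validation" might mean here. No values from the manuscript are used.

```python
from math import comb, sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def hypergeom_upper_tail(k, K, n, N):
    """P(X >= k) when n genes are drawn from N, of which K carry the term.

    k: SD genes annotated with the GO term
    K: all genes annotated with the GO term
    n: all SD genes
    N: all annotated genes in the genome
    """
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1))
    return hits / total
```

In use, `spearman_rho` would take the 12 chromosome lengths and the 12 per-chromosome SD counts, and `hypergeom_upper_tail` would be evaluated once per GO term. With many terms tested, the resulting p-values would still need a multiple-testing correction, and the correlation coefficient alone does not give a significance level.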
  2. Reviewer 2: Chengzhi Liang

    The authors generated a 343-Mb telomere-to-telomere (T2T) genome assembly for an African wild rice (Oryza longistaminata), covering all telomeres and centromeres across the 12 chromosomes, and performed genome annotation and analyses on structural variations and NLR genes. While the manuscript has provided a valuable genome sequence, several problems should be addressed before the manuscript can be published.

    Major issues

    1. The authors estimated the genome heterozygosity at 1.27%, which is quite high. I therefore wonder how large the assembly is when built from HiFi data alone, since that size, particularly when compared with the final size of the 12 chromosomes, could reflect the actual heterozygosity rate of the genome. If the initial hifiasm assembly contained only one gap (13 contigs in total), it is unlikely that the genome is so highly heterozygous. In Table 1, the total size of the assembled genome is 331,045,917 bp. If this is the summed size of the 12 chromosomes, it should be used as the final genome size in the main text. Please clarify. Also, what is the base accuracy of the ultra-long CycloneSEQ data? This would be useful to readers, as it is a new sequencing technology.
    2. For SV detection, given that the genome assembled in the manuscript (does it have an accession ID or name?) is that of an African wild rice, it is rather strange that the authors compared it with an O. sativa genome rather than an O. glaberrima genome. The names of the genomes should also be given, since each species has many different genomes, with different SVs between them.
    3. The conclusion that "This distribution suggests that chromosomes 1, 4, 3, and 2 might have contributed to the evolution of rice in previously unrecognized ways (Table S8)" is purely speculative, and thus should be removed from the manuscript, or the authors should provide more evidence to support it.
    4. The authors claimed that "Compared with other Oryza species, O. longistaminata has many fewer NBS-lRR domain genes, which reflects a contraction of resistance genes in this species." Please give specific gene numbers for each species. Moreover, the conclusion does not seem right, since O. longistaminata appears to have more NBS-LRR genes than the other species.

    Minor issues

    1. What are "quartets"?
    2. The authors used "11 Oryza species", which included O. indica; please clarify what this species is.
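Both reports ask how the Table 1 figure of 331,045,917 bp relates to the assembly: Reviewer 1 asks for the residual percentage of N, and Reviewer 2 asks whether the figure is the summed size of the 12 chromosomes. A check of this kind could be run directly on the assembly FASTA. The sketch below is a minimal illustration under stated assumptions: the function names are hypothetical, the input is assumed to be a plain (uncompressed) FASTA read line by line, and nothing here reflects how the authors actually computed their table.

```python
def fasta_stats(lines):
    """Return {sequence_name: (length, n_count)} from an iterable of FASTA lines."""
    stats = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first word of the header as the sequence name.
            parts = line[1:].split()
            name = parts[0] if parts else f"seq{len(stats) + 1}"
            stats[name] = (0, 0)
        elif name is not None:
            length, n_count = stats[name]
            seq = line.upper()
            stats[name] = (length + len(seq), n_count + seq.count("N"))
    return stats


def summarize(stats):
    """Return (total_bp, total_n, percent_n) over all sequences."""
    total_bp = sum(length for length, _ in stats.values())
    total_n = sum(n for _, n in stats.values())
    percent_n = 100.0 * total_n / total_bp if total_bp else 0.0
    return total_bp, total_n, percent_n
```

Passing an open file handle to `fasta_stats` and the result to `summarize` gives the total assembled length and the share of N bases. Restricting the dictionary to the 12 chromosome records before summarizing would show whether their summed length matches the Table 1 figure.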