The genome of Tripterygium wilfordii and characterization of the celastrol biosynthesis pathway

Abstract

Tripterygium wilfordii is a vine from the Celastraceae family that is used in traditional Chinese medicine (TCM). The active ingredient, celastrol, is a friedelane-type pentacyclic triterpenoid with putative roles as an antitumor, immunosuppressive, and anti-obesity agent. Here, we report a reference genome assembly of T. wilfordii with high-quality annotation using a hybrid sequencing strategy. The total genome size obtained is 340.12 Mb, with a contig N50 value of 3.09 Mb. We successfully anchored 91.02% of sequences into 23 pseudochromosomes using high-throughput chromosome conformation capture (Hi–C) technology. The super-scaffold N50 value was 13.03 Mb. We also annotated 31,593 structural genes, with a repeat percentage of 44.31%. These data demonstrate that T. wilfordii diverged from Malpighiales species approximately 102.4 million years ago. By integrating genome, transcriptome and metabolite analyses, as well as in vivo and in vitro enzyme assays of two cytochrome P450 (CYP450) genes, TwCYP712K1 and TwCYP712K2, it is possible to investigate the second biosynthesis step of celastrol and demonstrate that this was derived from a common ancestor. These data provide insights and resources for further investigation of pathways related to celastrol, and valuable information to aid the conservation of resources, as well as understand the evolution of Celastrales.
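The contig and super-scaffold N50 values quoted in the abstract follow the usual definition: the length of the sequence at which the running total, taken from longest to shortest, first reaches half of the assembly size. The sketch below illustrates that definition on made-up lengths; the function name `n50` and the toy numbers are illustrative assumptions, not the authors' pipeline or data.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    cover at least half of the total assembly length.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)   # longest first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical contig lengths (arbitrary units), not taken from the paper.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))  # prints 70: 80 + 70 = 150, half of the 300 total
```

Because N50 depends only on the length distribution, it says nothing about base-level accuracy, which is why the reviewers below also ask for read-mapping metrics.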

Article activity feed

  1. Now published in Gigabyte doi: 10.46471/gigabyte.14

    Tianlin Pei (1, 2), Mengxiao Yan (1), Jie Liu (1), Mengying Cui (1), Yumin Fang (1), Binjie Ge (1), Jun Yang (1, 2)

    (1) Shanghai Key Laboratory of Plant Functional Genomics and Resources, Shanghai Chenshan Botanical Garden, Shanghai Chenshan Plant Science Research Center, Chinese Academy of Sciences, Shanghai, 201602, China
    (2) State Key Laboratory of Plant Molecular Genetics, CAS Center for Excellence in Molecular Plant Sciences, Shanghai Institute of Plant Physiology and Ecology, Chinese Academy of Sciences, Shanghai, 200032, China

    For correspondence: yangjun@csnbgsh.cn, zhaoqing@cemps.ac.cn

    **Reviewer 1. C Robin Buell**

    Is the language of sufficient quality?
    No. The manuscript could be improved with a round of editing for grammar.

    Is there sufficient detail in the methods and data-processing steps to allow reproduction?
    No. The sequencing, assembly and annotation methods need more details.

    Any Additional Overall Comments to the Author:
    This manuscript describes the sequencing, assembly, annotation, and analysis of the Tripterygium wilfordii genome. T. wilfordii is a medicinal plant that has long been used in traditional medicine due to its production of alkaloids and triterpenoids; the focus of this study was to identify cytochrome P450s involved in biosynthesis of the triterpenoid celastrol.

    Based on the genome assembly metrics, the authors generated a robust representation of the genome sequence. Improvements in the analyses of the genome and in the manuscript would greatly strengthen confidence in the assembly. The authors should provide these metrics and additional information to the manuscript:

    More details are needed on the error correction of the assembly. Based on the methods, both nanopore and Illumina WGS reads were used; however, this is not stated explicitly, nor are any metrics of the error correction provided.

    Specifically, it is not discussed how the nanopore reads were assembled. A company is cited for the genome assembly. The assembly software that was used must be named.

    Every software program used, its version, and the parameters used should be provided in the methods. This is often missing.

    The quality of the genome should be confirmed using alignment of both the whole genome shotgun reads and the mRNAseq data. Specific metrics to provide include the total number and percentage of reads that mapped, and the read pairs that mapped in the correct orientation.

    No details on read quality assessment or trimming are provided.

    The CEGMA results should be omitted; this program has been deprecated.

    Line 337: The DNA was sheared, not interrupted, into fragments.

    Line 343: More details are needed on the library preparation and sequencing for the nanopore reads.

    Do the authors know the genome size of the species based on flow cytometry? Do you know the number of chromosomes that this species has? This should be stated and discussed in the context of the assembly size and number of pseudochromosomes.

    The genome wide identification of the CYP450 candidates was difficult to follow. This section should be revised so that it is clear how the authors identified their candidate genes. Potentially adding a supplemental figure would be helpful. I found the coexpression pattern extremely difficult to follow. I would not call coexpression patterns coexpression profiles. Specifically I did not understand the sentence on line 202 “However, no….”. Essentially this is just sub-functionalization at the expression level, not that there are two independent pathways.

    The evolution section should be expanded. How divergent is T. wilfordii from P. trichocarpa and R. communis?

    Table 1: "Index" should be replaced with "metric".

    Figure S1: What k-mer size was used in the analysis?

    Figure S5: It is unclear what is on the x or y axis. Expand the figure legend.

    The manuscript should be proofed for grammar as there are numerous sentences that need editing.

    Recommendation: Major Revision
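The read-mapping metrics requested in Reviewer 1's report (total and percentage of reads mapped, and pairs mapped in the expected orientation) can be derived from the FLAG field of a SAM/BAM alignment. The following is a minimal sketch under stated assumptions: the function name `mapping_summary` and the example flags are hypothetical, and the "properly paired" bit (0x2) is whatever the aligner chose to set, which normally reflects expected orientation and insert size. It is not the procedure used by the authors.

```python
# Bits of the SAM FLAG field used below (from the SAM specification).
PAIRED = 0x1          # read is part of a pair
PROPER_PAIR = 0x2     # aligner judged the pair properly aligned
UNMAPPED = 0x4        # read itself is unmapped
SECONDARY = 0x100     # secondary alignment
SUPPLEMENTARY = 0x800 # supplementary (chimeric) alignment


def mapping_summary(flags):
    """Summarise mapping rates from an iterable of integer SAM flags.

    Secondary and supplementary records are skipped so that each read
    is counted once.
    """
    total = mapped = paired = proper = 0
    for flag in flags:
        if flag & (SECONDARY | SUPPLEMENTARY):
            continue
        total += 1
        if not flag & UNMAPPED:
            mapped += 1
        if flag & PAIRED:
            paired += 1
            if flag & PROPER_PAIR and not flag & UNMAPPED:
                proper += 1

    def pct(part, whole):
        return 100.0 * part / whole if whole else 0.0

    return {
        "total_reads": total,
        "mapped": mapped,
        "pct_mapped": pct(mapped, total),
        "properly_paired": proper,
        "pct_properly_paired": pct(proper, paired),
    }


# Hypothetical flags: one proper pair, one unmapped pair, one improper
# pair, and one supplementary record that is ignored.
example_flags = [99, 147, 77, 141, 65, 129, 2064]
summary = mapping_summary(example_flags)
print(summary)
```

In practice the flags would be read from the alignment file produced by mapping the whole genome shotgun or mRNAseq reads back to the assembly; only the counting logic is shown here.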

  2. Tripterygium wilfordii

    **Reviewer 2. Xupo Ding**

    Is the language of sufficient quality?
    The language of about one third of the paragraphs is of sufficient quality.

    Comments
    This manuscript provides a reference genome assembly of T. wilfordii using a combined sequencing strategy (Nanopore, Bionano, Illumina, HiSeq, and Pacbio), and the functions of two CYP450 genes were identified with enzyme assays in vivo and in vitro. This research also provides valuable information to aid the conservation of resources and helps us reveal the evolution of Celastrales and the key genes involved in celastrol biosynthesis. However, the text needs substantial improvement.

    1. I suggest removing the comma in the title.

    2. Nothing in biology makes sense except in the light of evolution (T. Dobzhansky). The abstract does not present vital results from the manuscript, such as gene numbers, repeat percentage, and comparative evolutionary analysis. The contribution of the T. wilfordii genome is not limited to celastrol biosynthesis, as stated in lines 38-39; it also provides valuable information to aid the conservation of resources and helps us reveal the evolution of Celastrales and the key genes involved in celastrol biosynthesis.

    3. Nanopore is not an appropriate keyword; the equivalent platforms, Illumina, Bionano, Pacbio and Hi-C, are also presented in the manuscript.

    4. The legendary tales mentioned in lines 59-61 might be controversial in a scientific paper.

    5. Lines 61-63 are written colloquially. Please consider replacing them with: "Extracts of T. wilfordii bark have been used as a pesticide in China since ancient times, as first recorded in the Illustrated Catalogues of Plants published in 1848."

    6. Lines 103-104 are not coherent with the preceding sentence.

    7. Line 112: is the proportion of N bases really 0%?

    8. Lines 117-118: "Both results indicated that the presented genome is relative complete." This claim is unusual and open to debate. Even if it is credible, the sentence belongs in the Discussion section.

    9. Line 145: the full name should be given at first mention.

    10. Lines 150-155: Copia and Gypsy are missing.

    11. Was the gene family containing TwCYP712K1 and TwCYP712K2 expanded or contracted in the CAFE analysis?

    12. WGCNA might provide much more reliable evidence for the candidates TwCYP712K1 and TwCYP712K2, even though Pearson's correlation coefficient is a simplified version of WGCNA.

    13. The full peaks should be shown in Figures 5A and 5B. Uploading the NMR and MS data as an additional file would enhance the credibility of the enzyme function.

    14. Lines 269-272: the evolutionary analysis in Figure 2B indicates that T. wilfordii originated earlier than P. trichocarpa and R. communis. Does this suggest that the functions of TwCYP712K1 and TwCYP712K2 were fused during the evolution of Malpighiales and Celastrales in Figure 6? If the authors insist that these two P450s came from a common ancestor, a syntenic analysis of TwCYP712K1 and TwCYP712K2 within T. wilfordii and A. trichopoda, O. sativa or V. vinifera might make this credible.

    15. Latin names in all figures should give the complete species name; for example, T.wil should be replaced with T. wilfordii.

    16. Line 322: "transcriptom" should be "transcriptome".

    17. Line 330: please add the longitude and latitude.

    18. Please revise the English throughout, except for lines 327-509 and 526-599. Lines 327-509 might come from the final report of the sequencing project.

    19. Line 606: "LAST" might be "BLAST"?

    20. I noticed that a genome of T. wilfordii was published in Nature Communications in Feb. 2020. I suggest adding some comparison with that assembly or with triptolide synthesis, and citing the paper. Mentioning this will look fair and will also highlight the celastrol synthesis that you present here.

    Major Revision
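Comment 12 of Reviewer 2 contrasts WGCNA with ranking candidates by Pearson's correlation coefficient against a known pathway gene. As a rough illustration of the simpler approach, the sketch below ranks hypothetical candidates by their correlation with a bait gene's expression across samples. The gene names, expression values, and the helper `rank_by_coexpression` are all invented for illustration; this is not the authors' analysis and does not reproduce WGCNA.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / sqrt(var_x * var_y)


def rank_by_coexpression(bait_profile, candidate_profiles):
    """Sort candidates by correlation with the bait, highest first."""
    scores = {
        name: pearson(bait_profile, profile)
        for name, profile in candidate_profiles.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical expression values across five tissues or treatments.
bait = [1, 2, 3, 4, 5]
candidates = {
    "candidate_A": [2, 4, 6, 8, 10],  # rises with the bait
    "candidate_B": [5, 4, 3, 2, 1],   # falls as the bait rises
    "candidate_C": [1, 3, 2, 5, 4],   # partially correlated
}
for name, r in rank_by_coexpression(bait, candidates):
    print(f"{name}\t{r:.2f}")
```

A pairwise ranking like this treats each candidate independently, whereas WGCNA builds a weighted network and groups genes into modules, which is the additional evidence the reviewer is asking for.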