The assembly and annotation of two teinturier grapevine varieties, Dakapo and Rubired

Curation statements for this article:
  • Curated by GigaByte

    Editors Assessment:

    Teinturier grapes produce berries with pigmented skin and flesh and are used in red wine blends because they provide a deeper colour. This paper presents the genomes of two popular teinturier varieties, Dakapo and Rubired, which were sequenced, assembled, and annotated to provide additional resources for their use in breeding. For Dakapo, Nanopore and Illumina sequencing were combined with scaffolding to the existing grapevine assembly to generate a final assembly of 508.5 Mbp with 36,940 gene annotations. For Rubired, PacBio HiFi reads were assembled, scaffolded, and phased to generate a diploid assembly with two haplotypes 474.7-476.0 Mbp long and 56,681 annotated genes. Peer review has helped validate the high quality of these genomes, which will hopefully enable more insight into the genetics of grapevine berry colour and other traits such as frost and mildew resistance.

    This evaluation refers to version 1 of the preprint

This article has been Reviewed by the following groups

Abstract

Teinturier grapevines, known for berries whose flesh is pigmented by anthocyanin production, are valuable for enhancing the pigmentation of wine, for potential health benefits, and for investigating anthocyanin production in plants. Here, we assembled and annotated the Dakapo and Rubired genomes, two teinturier varieties. For Dakapo, we combined Nanopore sequencing, Illumina sequencing, and scaffolding to the existing grapevine assembly to generate a final assembly of 508.5 Mbp. Combining de novo annotation and lifting over annotations from the existing grapevine reference produced 36,940 gene annotations for Dakapo. For Rubired, PacBio HiFi reads were assembled, scaffolded, and phased to generate a diploid assembly with two haplotypes 474.7–476.0 Mbp long. De novo annotation of the diploid Rubired genome yielded annotations for 56,681 genes. Both genomes are highly contiguous and complete. The Dakapo and Rubired genome assemblies provide genetic resources for investigations into berry flesh pigmentation and other traits of interest in grapevine.

Article activity feed

  1. Editors Assessment:

    Teinturier grapes produce berries with pigmented skin and flesh and are used in red wine blends because they provide a deeper colour. This paper presents the genomes of two popular teinturier varieties, Dakapo and Rubired, which were sequenced, assembled, and annotated to provide additional resources for their use in breeding. For Dakapo, Nanopore and Illumina sequencing were combined with scaffolding to the existing grapevine assembly to generate a final assembly of 508.5 Mbp with 36,940 gene annotations. For Rubired, PacBio HiFi reads were assembled, scaffolded, and phased to generate a diploid assembly with two haplotypes 474.7-476.0 Mbp long and 56,681 annotated genes. Peer review has helped validate the high quality of these genomes, which will hopefully enable more insight into the genetics of grapevine berry colour and other traits such as frost and mildew resistance.

    This evaluation refers to version 1 of the preprint

  2. ABSTRACT

    Background: Teinturier grapevine varieties were first described in the 16th century and have persisted due to their deep pigmentation. Unlike most other grapevine varieties, teinturier varieties produce berries with pigmented flesh due to anthocyanin production within the flesh. As a result, teinturier varieties are of interest not only for their ability to enhance the pigmentation of wine blends but also for their health benefits. Here, we assembled and annotated the Dakapo and Rubired genomes, two teinturier varieties.

    Findings: For Dakapo, we used a combination of Nanopore sequencing, Illumina sequencing, and scaffolding to the existing grapevine genome assembly to generate a final assembly of 508.5 Mbp with an N50 scaffold length of 25.6 Mbp and a BUSCO score of 98.0%. A combination approach of de novo annotation and lifting over annotations from the existing grapevine reference genome resulted in the annotation of 36,940 genes in the Dakapo assembly. For Rubired, PacBio HiFi reads were assembled, scaffolded, and phased to generate a diploid assembly with two haplotypes 474.7-476.0 Mbp long. The diploid genome has an N50 scaffold length of 24.9 Mbp and a BUSCO score of 98.7%, and both haplotype-specific genomes are of similar quality. De novo annotation of the diploid Rubired genome yielded annotations for 56,681 genes.

    Conclusions: The Dakapo and Rubired genome assemblies and annotations will provide genetic resources for future investigations into berry flesh pigmentation and other traits of interest in grapevine.
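
    For readers unfamiliar with the N50 statistic quoted in the abstract: it is the length L such that sequences of length L or longer together cover at least half of the total assembly. The snippet below is a minimal sketch of that definition only. The function name and the toy lengths are illustrative assumptions and are not taken from the Dakapo or Rubired assemblies, and the authors' own tooling for this statistic is not stated here.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length of the sequence at which the running total,
    taken from longest to shortest, first reaches half the assembly size.
    """
    if not lengths:
        raise ValueError("at least one sequence length is required")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy scaffold lengths (arbitrary units), for illustration only.
toy_lengths = [40, 30, 20, 10]
print(n50(toy_lengths))  # 30: 40 + 30 = 70 is the first running total >= 50
```

    The same function applies to contig or scaffold lengths; only the input list changes.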

    This work has been published in GigaByte Journal under a CC-BY 4.0 license (https://doi.org/10.46471/gigabyte.149). These reviews (including a protocol review) are as follows.

    Reviewer 1. Camille Rustenholz

    Is there sufficient detail in the methods and data-processing steps to allow reproduction? No. Overall, the authors give enough details except for the haplotypes of Chardonnay, Pinot noir, Cabernet sauvignon and Cabernet franc that were used for Figure 3.

    Is the validation suitable for this type of data? No. Overall, the authors provide accurate validation for this type of data, except for the inversion that was identified on chromosome 10 of the Dakapo assembly. In my opinion, more evidence needs to be provided, as the Dakapo contigs were anchored using the PN40024 12X.v2 assembly version. There is indeed a heterozygous region at the beginning of chromosome 10 in the PN40024 genome, which makes its assembly and scaffolding quality quite doubtful at that exact location, especially for this assembly version. I would suggest checking it using the latest PN40024 T2T version (Shi et al., Hort Res 2023) and showing some Dakapo short-read alignments against its own assembly to validate the borders of this inversion, even though some wet lab validation would be even more convincing.
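
    One way to screen for a candidate inversion before any wet lab work is to align the assembly to a reference and list the large reverse-strand blocks on the chromosome of interest. The snippet below is a minimal sketch under stated assumptions: the input is in PAF format (as written by aligners such as minimap2), and the sequence names (`chr10`, `ctgA`), coordinates, and the 100 kb threshold are hypothetical. It is not the method used by the authors or proposed by the reviewer. A reverse-strand block is only a candidate, since it can also arise when a contig has been placed in the wrong orientation during scaffolding.

```python
def reverse_strand_blocks(paf_lines, target, min_len=100_000):
    """List reverse-strand alignment blocks on one target sequence.

    Each PAF line has 12 mandatory tab-separated columns; column 5 is the
    relative strand and columns 6, 8, 9 are target name, start, and end.
    """
    blocks = []
    for line in paf_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 12:  # skip malformed or truncated lines
            continue
        query, strand, tname = fields[0], fields[4], fields[5]
        tstart, tend = int(fields[7]), int(fields[8])
        if tname == target and strand == "-" and tend - tstart >= min_len:
            blocks.append((tstart, tend, query))
    return sorted(blocks)


# Hypothetical PAF records, for illustration only.
toy_paf = [
    "\t".join(["ctgA", "900000", "0", "400000", "-", "chr10",
               "2000000", "100000", "500000", "390000", "400000", "60"]),
    "\t".join(["ctgA", "900000", "400000", "900000", "+", "chr10",
               "2000000", "500000", "1000000", "495000", "500000", "60"]),
    "\t".join(["ctgB", "300000", "0", "300000", "-", "chr11",
               "1500000", "0", "300000", "295000", "300000", "60"]),
]
for tstart, tend, query in reverse_strand_blocks(toy_paf, "chr10"):
    print(query, tstart, tend)
```

    Because the block in this toy example sits inside a single query contig that also aligns in the forward direction, its breakpoints fall within one contig, which is the situation the reviewers ask the authors to confirm.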

    Additional Comments: The authors provided the assemblies and gene annotations of the genomes of two teinturier varieties, Dakapo and Rubired. Dakapo was assembled using a combination of Nanopore and Illumina reads whereas Rubired was assembled using PacBio HiFi reads. Even though both assemblies are of high quality, quality metrics are better for Rubired assembly than for Dakapo assembly, in terms of contiguity and of phasing. I would have liked the authors to comment and explain these differences more extensively maybe in a dedicated paragraph in the Discussion section:

    • Why could the Dakapo assembly not be phased?
    • Are these differences in quality due to the sequencing technologies used (Nanopore versus PacBio HiFi)? Or to a different year of dataset acquisition? Or to the assembly methods?

    Both assemblies were also annotated: 36,940 genes in the Dakapo assembly and 56,681 genes in the diploid Rubired. I assume that 56,681 is the sum of the number of genes annotated on haplotype 1 and haplotype 2 of Rubired. If so, it needs to be clearly stated at line 328; otherwise it can be confusing for the reader, who will think that Rubired has 20,000 more genes than Dakapo. Also, the authors used two different annotation pipelines, which complicates the gene content comparison and the synteny analysis later on. I would have liked the authors to comment on and explain this:

    • Is it due to the difference in the quality of the assemblies? If so, the authors need to highlight the limits of their annotation pipeline regarding assembly quality.
    • Any other explanation?

    Some minor suggestions:

    • Line 74: please use the word “clone” in the sentence for the sake of clarity.
    • Line 292-293: the PN40024.v4 assembly is not the most recent; the PN40024 T2T is (Shi et al., Hort Res, 2023).

    The quality of the assemblies and annotations is very good, and the resources of the paper will be very valuable for the grapevine community, especially for studying anthocyanin production in grapevine.
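
    The per-haplotype breakdown of the Rubired gene count raised in this review could be reported by tallying `gene` features in the annotation file by haplotype. The snippet below is a minimal sketch assuming a GFF3 annotation in which the haplotype is encoded as a suffix of the sequence ID (for example `chr01_hap1`). That naming scheme, the function names, and the toy records are hypothetical, and the actual naming in the Rubired annotation may differ.

```python
from collections import Counter


def suffix_haplotype(seqid):
    """Hypothetical naming rule: 'chr01_hap1' -> 'hap1'."""
    return seqid.rsplit("_", 1)[-1]


def genes_per_haplotype(gff_lines, haplotype_of=suffix_haplotype):
    """Count GFF3 'gene' features per haplotype label."""
    counts = Counter()
    for line in gff_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip directives, comments, and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 9:  # GFF3 feature lines have exactly 9 columns
            continue
        if fields[2] == "gene":  # column 3 is the feature type
            counts[haplotype_of(fields[0])] += 1
    return counts


# Hypothetical GFF3 records, for illustration only.
toy_gff = [
    "##gff-version 3",
    "chr01_hap1\tsrc\tgene\t1\t100\t.\t+\t.\tID=g1",
    "chr01_hap1\tsrc\tmRNA\t1\t100\t.\t+\t.\tID=t1;Parent=g1",
    "chr01_hap2\tsrc\tgene\t1\t90\t.\t+\t.\tID=g2",
]
counts = genes_per_haplotype(toy_gff)
print(dict(counts), "total:", sum(counts.values()))
```

    Reporting the two per-haplotype counts alongside their sum would make clear whether a diploid total is directly comparable with a count from a collapsed assembly.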

    Reviewer 2. Andrea Gschwend

    Are all data available and do they match the descriptions in the paper? No. The supplementary files were not made available to me for review.

    Is there sufficient detail in the methods and data-processing steps to allow reproduction?

    I recommend including additional details for the programs used for the Rubired genome assembly and annotation in this manuscript, though.

    Is there sufficient data validation and statistical analyses of data quality? No. It is unclear from the manuscript if the large Dakapo inversion was validated experimentally. See additional comments in the uploaded Word document: https://gigabyte-review.rivervalleytechnologies.com/download-api-file?ZmlsZV9wYXRoPXVwbG9hZHMvZ3gvRFIvNTQ1L1JpdHRlcl9ldF9hbC5fMjAyNF9HaWdhYnl0ZV9yZXZpZXdlcl9jb21tZW50c184LTIzLTI0LmRvY3g=

    Reviewer 3. Yongfeng Zhou and Kekun Zhang

    Are all data available and do they match the descriptions in the paper? No.

    Is there sufficient data validation and statistical analyses of data quality? No.

    Is there sufficient information for others to reuse this dataset or integrate it with other data? No.

    Additional Comments: My main concerns:

    1. Please explain why different sequencing methods were chosen for the genome assembly of Dakapo and Rubired, given that HiFi sequencing is currently mainstream and provides more accurate assembly?
    2. Recently, T2T-level genomes of many grape cultivars have been assembled, including the reference genome PN_T2T and the teinturier grape Yan73. Please align with the latest complete reference genome PN_T2T in Line 172, and add the genome information about PN_T2T and Yan73 to Table 1 (DOI: 10.1093/hr/uhad061, DOI: 10.1093/hr/uhad205).
    3. Line 387-389: How did you verify the correctness of this inversion? Is it contained within a single contig without orientation or assembly errors in the Dakapo genome? Have you identified any other genomes with this inversion?
    4. Line 255: Can you explain why the contig N50 is so low?
    5. Line 328: Is 56,681 the total number of annotated genes across the two Rubired haplotypes? It would be more appropriate to describe them separately.
    6. The phenotypes of these two grapes should be included, not just in the pattern diagram.
    7. The sequence difference in Figure 2 should be verified using other methods, such as PCR results and Sanger sequencing.