The Genome of the Mustard Hill Coral, Porites astreoides
Abstract
Coral reefs are threatened both locally and globally by anthropogenic impacts, which to date have contributed to substantial declines in coral cover worldwide. However, some corals are more resilient to these environmental changes; they have therefore increased in relative abundance on local scales and may become prominent members of future reef communities. Here, we provide the first draft reference genome for one such reef-building coral, the mustard hill coral, Porites astreoides. The genome was generated from a sample collected in Bermuda, with DNA sequenced using Pacific Biosciences HiFi long-read technology. Assembly of the PacBio reads with FALCON-Unzip resulted in a 678 Mbp assembly of 3,051 contigs with an N50 of 412,256 bp. BUSCO analysis of the genome assembly recovered 90.9% of the metazoan gene set. An ab initio transcriptome was also produced, with 64,636 gene models and a transcriptome BUSCO completeness of 77.5% against the metazoan gene set. Functional annotation was obtained through a hierarchical search of the SwissProt, TrEMBL, and NCBI nr databases, which annotated 86.6% of proteins. Through ab initio gene prediction for structural annotation and a functional annotation of the P. astreoides draft genome assembly, we provide valuable resources for improving biological knowledge. These resources can facilitate comparative genomic analyses of corals and enhance our capacity to test for the molecular underpinnings of adaptation and acclimatization, in support of evidence-based restoration and human-assisted evolution of corals.
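For readers unfamiliar with the contiguity statistic quoted above, the following is a minimal sketch of how an N50 is conventionally computed from a list of contig lengths. The function name `n50` and the toy lengths are illustrative assumptions; this is not the authors' pipeline and does not reproduce their data.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths (in bp).

    N50 is the length L such that contigs of length >= L together
    cover at least half of the total assembly length.
    """
    if not lengths:
        raise ValueError("no contig lengths given")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy example (hypothetical lengths, not from the P. astreoides assembly).
toy_contigs = [100, 80, 60, 40, 20]
print(n50(toy_contigs))
```

With the toy lengths, the total is 300 bp, and the two longest contigs (100 + 80) are the first to reach half of that, so the N50 is the length of the second contig.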
Classifications
Genetics and Genomics; Animal Genetics; Marine Biology
Article activity feed
This work has been published in GigaByte Journal under a CC-BY 4.0 license (https://doi.org/10.46471/gigabyte.65), and has published the reviews under the same license. These are as follows.
**Reviewer 1. Takeshi Takeuchi**
Is there sufficient detail in the methods and data-processing steps to allow reproduction?
No. 1) How many high-quality reads/nucleotides were retained after filtering and passed to the FALCON assembler? The authors also need to describe the parameters used for FALCON. 2) How did the authors handle duplicated contigs from different haplotypes in the assembly? 3) In Table 2, stats for "scaffolds" are shown, but there is no description of the scaffolding process.
Is there sufficient data validation and statistical analyses of data quality?
No. 4) In Table 3, the authors should not compare transcriptomes (refs 32, 55, and 34) with gene models (this study). Did the authors in fact produce a transcriptome assembly from the RNA-seq data? If so, please describe the method used for the transcriptome assembly. 5) The results of BLAST2GO and InterProScan were not described.
Additional Comments: The number of gene models (64,636) is much higher than those of other Porites species (30,000-40,000), and the number of exons per gene is considerably lower. These results indicate that the gene models are fragmented, possibly due to insufficient gene model prediction. This issue needs to be discussed. In the Abstract, the genome size "667 Gbp" should be "667 Mbp." In Table 2 and the main text, the assembly size is 678 Mbp. Which is correct?
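The exons-per-gene check the reviewer describes can be sketched as follows. This is a minimal illustration, assuming the annotation is in GFF3 format with `exon` features that carry a `Parent` attribute naming their transcript. The function names and the toy records are hypothetical and are not taken from the published annotation.

```python
from collections import Counter


def exons_per_transcript(gff3_lines):
    """Count exon features per parent transcript in GFF3 text lines."""
    counts = Counter()
    for line in gff3_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, directives and comments
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 9 or fields[2] != "exon":
            continue  # only well-formed exon records are counted
        attrs = dict(
            item.split("=", 1) for item in fields[8].split(";") if "=" in item
        )
        # An exon may list several parents, separated by commas.
        for parent in attrs.get("Parent", "").split(","):
            if parent:
                counts[parent] += 1
    return counts


def mean_exons(counts):
    """Mean number of exons per transcript (0.0 if there are none)."""
    return sum(counts.values()) / len(counts) if counts else 0.0


# Toy annotation (hypothetical records for illustration only).
toy = [
    "##gff-version 3",
    "ctg1\ttoy\tgene\t1\t900\t.\t+\t.\tID=g1",
    "ctg1\ttoy\tmRNA\t1\t900\t.\t+\t.\tID=t1;Parent=g1",
    "ctg1\ttoy\texon\t1\t200\t.\t+\t.\tID=e1;Parent=t1",
    "ctg1\ttoy\texon\t400\t900\t.\t+\t.\tID=e2;Parent=t1",
    "ctg2\ttoy\tgene\t1\t300\t.\t-\t.\tID=g2",
    "ctg2\ttoy\tmRNA\t1\t300\t.\t-\t.\tID=t2;Parent=g2",
    "ctg2\ttoy\texon\t1\t300\t.\t-\t.\tID=e3;Parent=t2",
]
counts = exons_per_transcript(toy)
print(mean_exons(counts))
```

A mean that is markedly lower than in related species, combined with an inflated gene count, is the pattern the reviewer associates with fragmented gene models.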
Re-review: I appreciate the authors’ effort to address all referee comments. I believe the data will be valuable for the research community.
**Reviewer 2. Jong Bhak**
Additional Comments: Porites astreoides is an important coral species, and this reviewer thinks all the major reference construction parameters indicate a high-quality assembly. The predicted gene number, 64,636, is a bit too high. This needs to be checked and improved. (This number has been fluctuating. Not critical, though.)