Genome of a Giant (Trevally): Caranx ignobilis
Abstract
Caranx ignobilis, commonly known as the kingfish or giant trevally, is a large, reef-associated apex predator. It is a prized sportfish, targeted heavily throughout its tropical and subtropical range in the Indian and Pacific Oceans, and it has drawn significant interest in aquaculture due to an unusual tolerance for freshwater. In this study, we present a high-quality nuclear genome assembly of a C. ignobilis individual from Hawaiian waters, which have recently been shown to host a genetically distinct population. The assembly has a contig NG50 of 7.3 Mbp and a scaffold NG50 of 46.3 Mbp. Twenty-five of the 203 scaffolds contain 90% of the genome. We also present the raw Pacific Biosciences continuous long reads from which the assembly was created. A Hi-C dataset (Dovetail Genomics Omni-C) and Illumina-based RNA-seq data from eight tissues are also presented; the latter can be particularly useful for annotation and studies of freshwater tolerance. Overall, this genome assembly and its supporting data are a valuable tool for ecological and comparative genomics studies of kingfish and other carangoid fishes.
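For readers unfamiliar with the contiguity statistics quoted in the abstract, the following is a minimal sketch of how an NG50 and a "number of scaffolds covering 90%" figure are commonly computed. It is not the authors' pipeline: the function names and the toy lengths are hypothetical, and the choice of denominator for the 90% figure (estimated genome size versus total assembly length) is an assumption, since the abstract does not say which was used.

```python
def ng50(lengths, genome_size):
    """Return the NG50: the length of the sequence at which the running
    total of lengths (sorted longest first) first reaches half of the
    estimated genome size. Returns None if the assembly never gets there."""
    half = genome_size / 2
    total = 0
    for length in sorted(lengths, reverse=True):
        total += length
        if total >= half:
            return length
    return None


def sequences_to_cover(lengths, fraction, denominator):
    """Return how many of the longest sequences are needed to cover
    `fraction` of `denominator` (an estimated genome size or a total
    assembly length; which one applies is an assumption of the caller)."""
    target = fraction * denominator
    total = 0
    for count, length in enumerate(sorted(lengths, reverse=True), start=1):
        total += length
        if total >= target:
            return count
    return None


# Toy scaffold lengths (hypothetical, not taken from the assembly).
toy_lengths = [50, 40, 30, 20, 10]
toy_ng50 = ng50(toy_lengths, genome_size=200)
toy_count = sequences_to_cover(toy_lengths, 0.9, denominator=sum(toy_lengths))
```

Unlike N50, which uses the total assembly length as its reference, NG50 uses the estimated genome size, so it does not reward an assembly for simply being shorter.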
Article activity feed
This work has been published in GigaByte Journal under a CC-BY 4.0 license (https://doi.org/10.46471/gigabyte.67), and the reviews have been published under the same license. They are as follows.
**Reviewer 1. Alison Gould**
Is there sufficient data validation and statistical analyses of data quality? Not my area of expertise.
Other comments:
This was a very clear and well-written manuscript presenting a whole genome assembly for the giant trevally. This will serve as an important resource for future researchers interested in this and other closely related species of fish. I only have a few minor suggestions, but overall I found the paper to be of high quality.
- It would be helpful to include the estimated genome size and BUSCO score in the abstract.
- Include the species name on the x-axis of each column in Fig. 5.
- Several of the tables (Table 2 and Table 6, for example) don't seem necessary in the main text, as they are not really discussed in the paper, and could be included as supporting material.
**Reviewer 2. Yue Song**
Are all data available and do they match the descriptions in the paper?
Yes. The description of the data in the article is generally correct, but there are some inconsistencies. For example, in line 186, the authors used single-copy orthologs from the Actinopterygii set of OrthoDB (v10) to assess assembly completeness, but used the vertebrate set for the comparison with other fish genomes. All the other species are also fish, so why not use the same database (e.g. Actinopterygii)?
Are the data and metadata consistent with relevant minimum information or reporting standards?
Yes. It would be best to provide relevant information about protein-coding genes, I think.
Is there sufficient detail in the methods and data-processing steps to allow reproduction?
No. (1) It would be better to provide detailed software parameters, and the description of how contigs were assembled into scaffolds is not clear enough. (2) The method used to identify single-copy orthologs is not clearly described.
Is there sufficient data validation and statistical analyses of data quality?
No. I don't think it is enough to rely only on single-copy orthologs and/or synteny blocks to assess genome quality; it may be better to add other measures, e.g. read mapping.
Other comments:
(1) In line 223, the authors only state how many scaffolds there are in the final assembly version. How many chromosomes were assembled, and what proportion of the scaffolds or contigs has been placed on chromosomes? This information is not found in the manuscript. Note that a genome for this genus has already been published in NCBI, but only at the contig level; if the Hi-C data could provide a chromosome-level assembly, I think it would be more useful.
(2) In line 252, I noticed that gene sets are not mentioned, especially the protein-coding genes. How many coding genes are there in this genome?
(3) In Figure 4, many cross-linking intensities are not obvious, which may be related to the sequencing depth of the Hi-C data. I can't work out from this diagram how many chromosomes there are in the final assembly.
(4) A minor error: in the figure captions, Figure 6 refers to the ray-finned fish set. I think this is a mistake, because in line 186 the authors mention the Vertebrata set.
Re-review: The authors have responded to the corresponding questions. I recommend accepting the manuscript.