The genome of a giant (trevally): Caranx ignobilis

Abstract

Caranx ignobilis, commonly known as giant kingfish or giant trevally, is a large, reef-associated apex predator. It is a prized sportfish, targeted throughout its tropical and subtropical range in the Indian and Pacific Oceans. It has also gained significant interest in aquaculture due to its unusual freshwater tolerance. Here, we present a draft assembly of the estimated 625.92 Mbp nuclear genome of a C. ignobilis individual from Hawaiian waters, which host a genetically distinct population. Our 97.4% BUSCO-complete assembly has a contig NG50 of 7.3 Mbp and a scaffold NG50 of 46.3 Mbp. Twenty-five of the 203 scaffolds contain 90% of the genome. We also present noisy, long-read DNA, Hi-C, and RNA-seq datasets; the latter covers eight distinct tissues and can help with annotations and studies of freshwater tolerance. Our genome assembly and its supporting data are valuable tools for ecological and comparative genomics studies of kingfishes and other carangoid fishes.
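
For readers unfamiliar with the contiguity figures quoted in the abstract, the following minimal sketch shows how NG50 and an "N sequences contain 90%" count are conventionally computed. It is not the authors' pipeline: the function names `ngx` and `lx`, the toy scaffold lengths, and the toy genome size are all hypothetical. The abstract does not say whether the 90% figure is relative to the estimated genome size or to the assembly length, so the reference size is left as a parameter.

```python
def ngx(lengths, reference_size, fraction=0.5):
    """Return the length of the sequence at which the cumulative sum,
    taken from longest to shortest, first reaches `fraction` of
    `reference_size`. With the estimated genome size as reference and
    fraction=0.5 this is NG50; with the assembly length it is N50."""
    target = fraction * reference_size
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= target:
            return length
    return None  # the sequences never reach the target


def lx(lengths, reference_size, fraction=0.9):
    """Return how many of the longest sequences are needed to reach
    `fraction` of `reference_size` (e.g. an L90-style count)."""
    target = fraction * reference_size
    running = 0
    for count, length in enumerate(sorted(lengths, reverse=True), start=1):
        running += length
        if running >= target:
            return count
    return None  # the sequences never reach the target


# Hypothetical toy data, not taken from the assembly described above.
toy_scaffolds = [50, 40, 30, 20, 10]  # total length 150
toy_genome_size = 200                 # assumed "estimated genome size"

print(ngx(toy_scaffolds, toy_genome_size))         # NG50 -> 30
print(ngx(toy_scaffolds, sum(toy_scaffolds)))      # N50  -> 40
print(lx(toy_scaffolds, sum(toy_scaffolds), 0.9))  # L90  -> 4
```

Because NG50 is measured against the estimated genome size rather than the assembly length, it stays comparable between assemblies of the same species even when their total lengths differ.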

Article activity feed

  1. ABSTRACT

    This work has been published in GigaByte Journal under a CC-BY 4.0 license (https://doi.org/10.46471/gigabyte.67), and the reviews have been published under the same license. These are as follows.

    **Reviewer 1. Alison Gould**

    Is there sufficient data validation and statistical analyses of data quality? Not my area of expertise.

    Other comments:

    This was a very clear and well-written manuscript presenting a whole genome assembly for the giant trevally. This will serve as an important resource for future researchers interested in this and other closely related species of fish. I only have a few minor suggestions but overall found the paper to be of high quality.

    - It would be helpful to include the estimated genome size and BUSCO score in the abstract.
    - Include the species name on the x-axis of each column in Fig. 5.
    - Several of the tables (Table 2 and Table 6, for example) don't seem necessary in the main text, as they are not really discussed in the paper, and could be included as supporting material.

    **Reviewer 2. Yue Song**

    Are all data available and do they match the descriptions in the paper?

    Yes. The description of the data in the article is generally correct, but there are some inconsistencies. For example, in line 186, the authors used single-copy orthologs from the Actinopterygii set of OrthoDB (v10) to assess assembly completeness, but used the vertebrate set for the comparison with other fish genomes. All the other species are also fish, so why not use the same database (e.g. Actinopterygii)?

    Are the data and metadata consistent with relevant minimum information or reporting standards?

    Yes. It would be best to provide relevant information about protein-coding genes, I think.

    Is there sufficient detail in the methods and data-processing steps to allow reproduction?

    No. (1) It would be better to provide detailed software parameters, and the description of how contigs were assembled into scaffolds is not clear enough. (2) The method used to identify single-copy orthologs is not clearly described.

    Is there sufficient data validation and statistical analyses of data quality?

    No. I don't think it is enough to rely only on single-copy orthologs and/or synteny blocks to assess genome quality; it might be better to add other measures, e.g. read mapping.

    Other comments:

    (1) In line 223, the authors report only the number of scaffolds in the final assembly, but how many chromosomes were assembled, and what proportion of the scaffolds or contigs was placed onto chromosomes? This information is not found in the manuscript. Note that a genome for this genus has already been published in NCBI, but only at the contig level; if the Hi-C data could provide a chromosome-level assembly, I think it would be more useful. (2) In line 252, I noticed that gene sets are not mentioned, especially protein-coding genes. How many coding genes are there in this genome? (3) In Figure 4, many cross-linking intensities are not obvious, which may be related to the sequencing depth of the Hi-C data; I cannot tell from this diagram how many chromosomes there are in the final assembly. (4) A minor issue: in the caption of Figure 6, the authors used the ray-finned fish set, right? I think this is a mistake, because in line 186 the authors mention the vertebrata set.

    Re-review: The authors have responded to the corresponding questions; I recommend accepting the manuscript.