A genome assembly of the Atlantic chub mackerel (Scomber colias): a valuable teleost fishing resource

The Atlantic chub mackerel, Scomber colias Gmelin, 1789, is a medium-sized pelagic fish with substantial importance in the fisheries of the Atlantic Ocean and the Mediterranean Sea. Over the past decade, this species has gained special relevance as one of the main targets of pelagic fisheries in the NE Atlantic. Here, we sequenced and annotated the first high-quality draft genome assembly of S. colias, produced with PacBio HiFi long reads and Illumina paired-end short reads. The estimated genome size is 814 Mb distributed into 2,028 scaffolds and 2,093 contigs with N50 lengths of 4.19 and 3.34 Mb, respectively. We annotated 27,675 protein-coding genes and the BUSCO analyses indicated high completeness, with 97.3 % of the single-copy orthologs in the Actinopterygii library profile. The present genome assembly represents a valuable resource to address the biology and management of this relevant fishery. Finally, this is the fourth high-quality genome assembly within the Order Scombriformes and the first in the genus Scomber.
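
For readers unfamiliar with the N50 statistic quoted in the abstract, the following is a minimal sketch of how it is commonly computed and how the same value reads in bp, kb, and Mb. It is not the authors' pipeline: the function name `n50` and the toy contig lengths are assumptions made purely for illustration.

```python
def n50(lengths):
    """Return the N50 of a list of sequence lengths (in bp).

    N50 is the length L such that sequences of length >= L together
    cover at least half of the total assembly length.
    """
    if not lengths:
        raise ValueError("no sequence lengths given")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence downwards until half the total is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical toy contig lengths in bp (not data from the manuscript).
toy_contigs_bp = [80, 70, 50, 40, 30, 20, 10]
toy_n50_bp = n50(toy_contigs_bp)

# Unit conversion for an N50 expressed in bp.
example_bp = 4_190_000
example_kb = example_bp / 1e3
example_mb = example_bp / 1e6
```

Because the same number can be reported in bp, kb, or Mb, a table header has to state the unit that matches the values it contains.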

    A version of this preprint has been published in the journal GigaByte under a CC-BY 4.0 license (see https://doi.org/10.46471/gigabyte.40)

    **Reviewer 1. Jianbo Jian**

    This submission describes a reference genome for the Atlantic chub mackerel (Scomber colias) built from a combination of PacBio HiFi long reads and Illumina short reads. The sequencing data processing, genome assembly, and related bioinformatics are comprehensive and adequate. The reported reference genome is the first for this species and has good continuity. It is a pity that the assembly is not at chromosome level, owing to the lack of Hi-C or genetic map data. However, the associated analyses and results make sense. In my opinion, as the first reference genome in the genus Scomber, this assembly is a valuable genomic resource for population genetics, ecology, physiology, and other future research. I have some concerns that should be addressed before publication in GigaByte.

    1. In the project design, two individuals were used for genomic DNA extraction for the genome assembly. Why not use a single individual, to avoid assembly errors caused by genetic differences between individuals?
    2. Lines 186-196: I am somewhat confused about the contamination-removal step. Was there contamination in your sample? In general, most genome projects do not contain contamination. This step is effective for removing contamination only in specific samples.
    3. In the phylogenomic analysis, estimating divergence times is recommended; the figure should then be updated so that it is more informative.
    4. Supp. Table 6 is blank.
    5. None of the supplementary tables is shown in the manuscript.
    6. The genome assembly from the Illumina sequencing is useless compared with the one from the HiFi data.
    7. In Supplementary Table 5, N50 (Kb) should be N50 (bp).

    Recommendation: Minor Revision

    **Reviewer 2. Rong Huang**

    Is there sufficient information for others to reuse this dataset or integrate it with other data? No. It is suggested that the authors add a simple table comparing the assembly quality of genomes within the Order Scombriformes.

    Additional Comments: Scomber colias is a valuable marine resource, with a high impact on the fisheries of several countries on the west coast of the Atlantic Ocean and/or the Mediterranean Sea. This study reports the first genome assembly of the Atlantic chub mackerel. The genome is timely and the assembly process is clearly described, which contributes to the effective conservation, management, and sustainable exploitation of S. colias in the Anthropocene. I still have the following questions.

    The assembly quality of the genome does not seem to be particularly good. For example, the scaffold N50 is not long enough. What is the ploidy of this species? Do heterozygosity and repeat content affect the assembly quality?

    It is suggested that the authors add a simple table comparing the assembly quality of genomes within the Order Scombriformes. This would help relevant researchers make use of the genomic resources.

    Is the "Data validation" section followed by the results section? Also, there are no subheadings in the results part. Is this required for this type of article?

    Recommendation: Major Revision