Complete de novo assembly and re-annotation of the zebrafish genome

Abstract

The zebrafish (Danio rerio) is widely used in vertebrate research, but its reference genome assembly has contained extensive unresolved regions across both euchromatic and heterochromatic compartments. The previous reference genome assembly, GRCz11, consisted of 19,725 contigs assembled into 1,917 scaffolds. Recent advances in both long-read sequencing technologies and genome assembly algorithms have made “complete” genome assemblies possible for the first time. We used homozygous fish from two lab strains, “Tübingen” and “AB,” for de novo genome assemblies. The new assemblies incorporated 7% more genomic sequence than GRCz11 and an additional 130 million bases of previously unassembled sequence. RefSeq annotation incorporating newly generated Iso-Seq cDNA sequences has yielded notable increases in mRNAs (68%), lncRNAs (47%), and misc_RNAs (1099%). Two assemblies have been elevated to reference genome status (GRCz12tu and GRCz12ab). We generated an additional 40 draft haplotypes to create a zebrafish pangenome resource and demonstrate its utility for variant analysis.
