Transposable Elements Facilitate the De Novo Origin of Antifreeze Protein and the Diversification of Its Gene Family in Snailfishes
Abstract
Transposable elements (TEs) are increasingly recognized as important sources of genomic innovation, yet mechanistically resolved examples of how they help generate new functional genes in vertebrates remain rare. Type I antifreeze proteins (AFPI) in fishes are lifesaving adaptations shaped by strong freezing selection and provide an exceptional system for studying new gene evolution under extreme environmental pressure. We recently showed that AFPI in flounder, cunner, and sculpin evolved independently through distinct partial de novo routes, converging on a nearly identical alanine-rich antifreeze protein. Here, we elucidate the origin and evolution of AFPI in the last remaining unresolved lineage, snailfishes, using a chromosome-scale genome assembly for Liparis atlanticus together with multi-tissue Iso-Seq, tissue-specific RNA-seq, and comparative genomics across AFPI-bearing and AFPI-lacking snailfishes and teleost outgroups. We show that snailfish AFPI originated within Liparis and rapidly diversified as a young gene family with multiple isoforms and lineage- and population-specific copy-number change. Genome-wide homology searches support a de novo origin of the alanine-rich coding region from noncoding sequence rather than from a pre-existing protein-coding precursor. In contrast, the surrounding regulatory architecture was assembled through sequence recruitment: a hAT-derived fragment contributes promoter- and transcription-start-site-proximal sequence, and a conserved noncoding segment together with a Ty3/Gypsy-derived long terminal repeat (LTR) contributes the 3′ regulatory region. TE-rich locus structure also provides plausible mechanisms for subsequent locus expansion and translocation. Together, these results reveal a TE-facilitated, mosaic route to new gene evolution in vertebrates, demonstrating how noncoding DNA, repetitive sequence, and TE-derived regulatory fragments can be assembled into a strongly selected adaptive innovation.