DanioDecima: A DNA sequence-to-function model of zebrafish embryogenesis

Abstract

Deep learning DNA sequence-to-function models promise mechanistic insight into genome regulation, but their performance is often limited by data scarcity in the species of interest. We present DanioDecima, a zebrafish-specific model that uses transfer learning from human- and mouse-trained models to predict tissue- and cell-type-specific gene expression during zebrafish embryogenesis. Initializing DanioDecima with pretrained human and mouse Borzoi and Decima weights substantially raises the median pseudobulk Pearson r across cell types and improves gene-level correlations for test-set genes. An in silico directed-evolution loop guided by DanioDecima scoring generated synthetic promoters whose motif architectures cluster by the expected target lineage. These findings exemplify a cross-species transfer learning methodology for sequence-to-function models and position DanioDecima as a practical resource for zebrafish regulatory engineering.
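
To make the idea of a model-guided in silico directed-evolution loop concrete, the following is a minimal sketch of one simple greedy variant: propose single-base mutants, score them, and keep the best. The abstract does not specify the actual procedure, so everything here is an assumption for illustration. In particular, `toy_score` is a hypothetical stand-in for a model score (it is not DanioDecima and does not call any real library), and the motif, round count, and variant count are arbitrary.

```python
import random

BASES = "ACGT"


def toy_score(seq: str, motif: str = "TGACGT") -> float:
    """Hypothetical stand-in for a model score (NOT DanioDecima).

    Returns the best number of positions matching `motif` over all
    windows of the sequence, so partial matches are rewarded.
    """
    k = len(motif)
    return float(max(
        sum(a == b for a, b in zip(seq[i:i + k], motif))
        for i in range(len(seq) - k + 1)
    ))


def mutate(seq: str, rng: random.Random) -> str:
    """Return a copy of `seq` with one randomly chosen base substituted."""
    pos = rng.randrange(len(seq))
    new_base = rng.choice([b for b in BASES if b != seq[pos]])
    return seq[:pos] + new_base + seq[pos + 1:]


def directed_evolution(seed_seq, score_fn, rounds=30, n_variants=20, seed=0):
    """Greedy loop: each round, keep the best mutant only if it improves.

    Returns the final sequence and the best score after each round
    (with the starting score as the first entry).
    """
    rng = random.Random(seed)  # fixed seed for reproducibility
    best, best_score = seed_seq, score_fn(seed_seq)
    history = [best_score]
    for _ in range(rounds):
        variants = [mutate(best, rng) for _ in range(n_variants)]
        top = max(variants, key=score_fn)
        top_score = score_fn(top)
        if top_score > best_score:
            best, best_score = top, top_score
        history.append(best_score)
    return best, history


if __name__ == "__main__":
    final_seq, scores = directed_evolution("A" * 60, toy_score)
    print(final_seq)
    print(scores[0], "->", scores[-1])
```

In a real setting, `score_fn` would wrap a trained model's prediction for the target tissue or cell type; because the loop only accepts improvements, the recorded score never decreases from one round to the next.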
