Neural network-based cross-species chromatin annotation goes beyond sequence conservation
Abstract
Analogous to the Encyclopedia of DNA Elements (ENCODE) project, the Functional Annotation of ANimal Genomes (FAANG) consortium has produced chromatin annotations for domesticated animals, albeit in smaller amounts. Classical methods based on sequence conservation can be used to infer missing annotations, but are inappropriate for non-conserved sequences. Here, we demonstrate the ability of neural networks trained with human data to infer the missing chromatin annotations in livestock species. For this purpose, we comprehensively assessed predictions of transcription factors, chromatin accessibility, and histone marks in several species. Our results showed good predictions for various annotations in mammalian genomes, and surprisingly, also for bird genomes, despite the large phylogenetic distance from the human genome. Moreover, predictions were accurate even for non-conserved sequences, unlike conservation-based methods. Our results advocate the widespread use of neural networks in cross-species genome annotation, a key step in understanding the genetic architecture of complex traits.
