Highly accurate ab initio gene annotation with ANNEVO
Abstract
Accurate gene annotation is essential for deciphering the mapping from genomic sequences to their functional roles. However, current methods struggle to model complex gene transmission patterns and evolutionary variations in gene lengths. Here, we introduce ANNEVO, a mixture-of-experts (MoE)-based genomic language model that directly models distal sequence dependencies and joint evolutionary relationships from diverse genomes, enabling precise ab initio gene annotation. Through extensive benchmarking on 566 species, we demonstrate that ANNEVO significantly outperforms existing ab initio methods and achieves performance comparable to state-of-the-art annotation pipelines. Furthermore, because ANNEVO does not depend on external evidence, it delivers annotations that are more complete than the reference annotations for a broad range of species, while also correcting errors within those references. These advances will substantially improve genome sequence interpretation and provide a framework capable of integrating evolutionary insights.