Accurate ab initio gene prediction in eukaryotes with Tiberius in multiple clades
Abstract
Eukaryotic genome annotation is currently bottlenecked by limitations in the generality, scalability and accuracy of computational methods. Deep learning approaches have recently achieved large improvements in ab initio gene prediction accuracy. We extend the deep learning-based ab initio gene predictor Tiberius beyond mammals by training lineage-specific models for Mesangiospermae, Fungi, Vertebrata, Insecta, Chlorophyta and Bacillariophyta. Across a benchmark of 33 species, Tiberius consistently achieves higher accuracy than the other evaluated ab initio methods, Helixer and ANNEVO, while also having the fastest runtimes overall. Compared with BRAKER3, which incorporates RNA-Seq and protein evidence, Tiberius approaches state-of-the-art accuracy in Mesangiospermae, Fungi, Bacillariophyta and Chlorophyta, while being on average 80 times faster when using a GPU.
Availability and implementation
https://github.com/Gaius-Augustus/Tiberius