OMAnnotator: a novel approach to building an annotated consensus genome sequence

Sadé Bates
Christophe Dessimoz
Yannis Nevers


Abstract

Motivation

Advances in sequencing technologies have enabled researchers to sequence whole genomes rapidly and cheaply. However, despite improvements in genome assembly, genome annotation (i.e. the identification of protein-coding genes) remains challenging, particularly for eukaryotic genomes: it requires combining several approaches (typically ab initio prediction, transcriptomics, and homology search), each with its own strengths and weaknesses. Deciding which gene models to retain in a consensus is far from trivial, and automated approaches still tend to be less accurate than laborious manual curation.

Results

Here, we present OMAnnotator, a novel approach to building a consensus annotation. OMAnnotator repurposes the OMA algorithm, originally designed to elucidate evolutionary relationships among genes across species, to integrate predictions from different annotation sources, using evolutionary information as a tie-breaker. We validated OMAnnotator by reannotating the Drosophila melanogaster reference genome and comparing the result with the expert-curated reference annotation and with the output of the automated pipelines BRAKER2 and EVidenceModeler. OMAnnotator produced a consensus annotation that outperformed each individual input and surpassed both existing pipelines. Finally, when applied to three recently published genomes, OMAnnotator gave substantial improvements in two cases and mixed results in the third, which had already benefited from extensive expert curation.
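
To make the idea of an evolutionary tie-breaker concrete, the following is a minimal toy sketch in Python. It is not the OMAnnotator implementation and does not call the OMA algorithm. It assumes, purely for illustration, that each candidate gene model already carries a count of other species in which a homolog was found (the hypothetical field `homolog_hits`), and it resolves overlaps between models from different sources by keeping the model with the most such support. The class name, field names, source labels, and greedy strategy are all assumptions of this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeneModel:
    source: str        # hypothetical label, e.g. "ab_initio", "rnaseq", "homology"
    contig: str
    start: int         # 1-based, inclusive
    end: int           # inclusive
    homolog_hits: int  # hypothetical: number of other species with a detected homolog


def overlaps(a: GeneModel, b: GeneModel) -> bool:
    """True if both models lie on the same contig and share at least one base."""
    return a.contig == b.contig and a.start <= b.end and b.start <= a.end


def build_consensus(models: list[GeneModel]) -> list[GeneModel]:
    """Greedy toy consensus (an assumption of this sketch, not the published method).

    Models are visited from most to least evolutionary support; a model is kept
    only if it does not overlap a model that was already kept.
    """
    ranked = sorted(
        models,
        key=lambda m: (-m.homolog_hits, -(m.end - m.start), m.source),
    )
    kept: list[GeneModel] = []
    for model in ranked:
        if not any(overlaps(model, k) for k in kept):
            kept.append(model)
    # Return the consensus in genomic order.
    return sorted(kept, key=lambda m: (m.contig, m.start))


# Made-up coordinates and counts, for illustration only.
models = [
    GeneModel("ab_initio", "chr2L", 100, 900, 1),
    GeneModel("rnaseq", "chr2L", 150, 950, 4),
    GeneModel("homology", "chr2L", 2000, 2600, 3),
    GeneModel("ab_initio", "chr2L", 2500, 3100, 0),
]
consensus = build_consensus(models)
for model in consensus:
    print(model.source, model.contig, model.start, model.end)
```

In this toy input, the two ab initio models each overlap a better-supported model from another source and are dropped. A real pipeline would also need to handle strand, exon structure, and partial overlaps, none of which are modelled here.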

Conclusion

We introduce an original, flexible, and effective approach to annotating genomes by integrating multiple lines of evidence. The method’s robustness is underlined by its successful application to reannotating recently published genomes, opening up new avenues in eukaryotic genome annotation.

Version published to bioRxiv on Dec 8, 2024 (DOI: 10.1101/2024.12.04.626846)

Related articles

A Benchmarking Framework to Catalyze Individual Human Genome Projects

This article has 3 authors:
1. Manjushri Kalpande
2. Apoorva Ganesh
3. Subhashini Srinivasan
This article has no evaluations. Latest version: Dec 17, 2025

Rapid Phylogenomic Analysis of Thousands Outbreak‐Causing Viral Genomes Using Covary

This article has 1 author:
1. Marvin I. De los Santos
This article has no evaluations. Latest version: Dec 22, 2025

META-DIFF: a k-mer-based pipeline that detects differentially abundant sequences in metagenomics whole genome sequencing

This article has 8 authors:
1. Louis-Maël Guéguen
2. Alban Mathieu
3. Simon Pelletier
4. Anthony Woo
5. Namita Misra
6. Magali Moreau
7. Olivier Perin
8. Arnaud Droit
This article has no evaluations. Latest version: Jan 29, 2026
