Large-scale Evaluation of Prokaryotic Annotation Tools Across Thousands of Species

Abstract

Genome annotation is fundamental to prokaryotic genome sequencing, yet systematic evaluations to guide tool selection are lacking. We present the first large-scale investigation of four prominent open-source annotation tools (Prokka, Bakta, EggNOG-mapper, and PGAP) across 156,033 diverse genomes, including Escherichia coli strains for baseline performance, thousands of archaeal and bacterial genomes, frameshifted genomes, and metagenome-assembled genomes (MAGs). Bakta excelled at annotating high-quality bacterial genomes, while PGAP performed better on archaeal and challenging genomes, including bacterial metagenome-assembled, fragmented, or contaminated samples. For functional annotation with Gene Ontology terms, PGAP provided broader term coverage, whereas EggNOG-mapper assigned more terms per feature. Our findings highlight tool-specific strengths that are crucial for selecting the best tool based on genome quality, taxonomy, and origin (e.g., MAGs). This study provides an evidence-based guide for users and informs future tool development.
