Highly accurate prophage island detection with PIDE

Abstract

As important mobile elements in prokaryotes, prophages shape the genomic context of their hosts and regulate the structure of bacterial populations. However, identifying prophages precisely with computational methods remains challenging. Here, we introduce PIDE, a tool for identifying prophages in bacterial genomes and metagenome-assembled genomes. PIDE integrates a pre-trained protein language model with a gene density clustering algorithm to distinguish prophages. Benchmarking on bacterial genomes with experimentally annotated prophages demonstrates that PIDE pinpoints prophages with precise boundaries. Applying PIDE to 4,744 representative human gut genomes reveals 24,467 prophages with widespread functional capacity. PIDE is open source and available at https://github.com/chyghy/PIDE.
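
To make the idea of gene density clustering concrete, the following is a minimal sketch of one way such a step could work. It is not taken from the PIDE code base. It assumes each gene already carries a phage-likeness score (for example, from a protein-level classifier) and groups nearby high-scoring genes on one contig into candidate islands. The function name `cluster_islands`, the parameters `threshold`, `max_gap`, and `min_hits`, and all values in the example are hypothetical.

```python
from typing import List, Tuple

# (start, end, phage_score) for one gene; the score is a hypothetical
# per-protein probability, not an actual PIDE output.
Gene = Tuple[int, int, float]


def cluster_islands(
    genes: List[Gene],
    threshold: float = 0.5,  # assumed cutoff for calling a gene phage-like
    max_gap: int = 2,        # assumed max number of non-phage genes tolerated between hits
    min_hits: int = 3,       # assumed min number of phage-like genes per island
) -> List[Tuple[int, int, int]]:
    """Group phage-like genes into islands; genes must be sorted by position."""
    islands: List[Tuple[int, int, int]] = []
    current: List[int] = []  # indices of phage-like genes in the open cluster

    def close_cluster() -> None:
        # Keep the cluster only if it contains enough phage-like genes.
        if len(current) >= min_hits:
            start = genes[current[0]][0]
            end = genes[current[-1]][1]
            islands.append((start, end, len(current)))

    for i, (_, _, score) in enumerate(genes):
        if score < threshold:
            continue
        # Start a new cluster when too many non-phage genes separate two hits.
        if current and i - current[-1] - 1 > max_gap:
            close_cluster()
            current = []
        current.append(i)
    close_cluster()
    return islands


# Toy contig: twelve 1 kb genes with made-up scores.
scores = [0.1, 0.9, 0.8, 0.2, 0.95, 0.1, 0.1, 0.1, 0.1, 0.7, 0.1, 0.1]
genes = [(i * 1000 + 1, (i + 1) * 1000, s) for i, s in enumerate(scores)]
islands = cluster_islands(genes)
print(islands)  # [(1001, 5000, 3)]
```

In this toy example, the three high-scoring genes near the start of the contig form one island, while the isolated hit further along is discarded because it does not reach `min_hits`. In this sketch, island boundaries are simply the outermost phage-like genes in each cluster.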
