Discovery and analysis of an 841 kbp phage genome: the largest known to date



Abstract

Viruses are the most abundant biological entities on Earth, and bacteriophages (phages) are the viruses that specifically infect bacteria. Double-stranded DNA phages typically have genomes of around 50 kilobase pairs (kbp), although the largest previously known genome exceeds 735 kbp. This raises the question of how large a phage genome can become. In this study, we present the first phage genome over 800 kbp, named BF3, which was reconstructed from oil reservoir production water. BF3 encodes 1,164 protein-coding genes (genetic code 11) and 46 tRNA genes, and is classified within the Caudoviricetes, though its bacterial host remains unidentified. We used ColabFold to predict, with high confidence, the structures of the proteins encoded by 744 of these genes; 395 and 591 of them matched known structures in the Protein Data Bank and the Big Fantastic Virus Database (BFVD), respectively. Notably, the remaining 153 predicted structures showed no similarity to any entry in the BFVD, currently the most extensive viral protein structure database. This study expands our understanding of how large phage genomes can be and underscores the need for specialized analytical tools and pipelines to investigate exceptionally large phages.
