Unlocking the genomic repertoire of a cultivated megaphage

Abstract

Megaphages are bacteriophages (i.e., phages) with exceptionally large genomes. They are ecosystem cosmopolitans that infect various bacterial hosts and have been discovered in metagenomic datasets worldwide. To date, almost all megaphages have evaded cultivation; only phage G has been maintained in active culture, for over 50 years. We used multiomics to examine this five-decade cultivation history, which spans nine different laboratories and five different lab variants up to the modern era. In this work, we resolved five complete phage G genomes, the particle proteome, and the de novo methylome, and used artificial intelligence (AI) to annotate the genome of phage G. Phage G is one of the largest phages, with a size of >0.6 µm, about half the width of the host cell, and a 499 kbp, non-permuted, linear genome that has, uniquely among known phages, two pairs of ends. Its closest known relative is Moose phage W30-1, which was metagenomically assembled without cultivation from a moose rumen sample. Phage G has >650 protein-coding open reading frames (ORFs), of which >65% encode hypothetical proteins with no known function; the rest of the genome is geared towards nucleic acid replication (e.g., helicases, polymerases, endonucleases) and structural functions (e.g., capsid, tail, portal, terminase). The genome contains a 35 kbp stretch of 66 ORFs without any known functional homology, a cryptic genomic region that is roughly the size of phage lambda. Phage G has an expansive repertoire of auxiliary metabolic genes (AMGs) acquired from its bacterial host, including phoH, ftsZ, UvsX/RecA-like, gyrA, gyrB, and DHFR. Furthermore, phage G encodes AMGs that could manipulate host sporulation (sspD, RsfA, spoK) as well as antiviral escape genes (e.g., anti-CBASS nuclease and anti-Pycsar protein).
Phage proteomics detected the products of >15% of the protein-coding ORFs in either the wild-type or mutant variants of phage G, including genes involved in replication (e.g., UvsX/RecA-like), host sporulation, and structure (e.g., capsid, tail, portal). The methylome of phage G was localized to the cryptic region with limited functional homology. Supervised machine learning (i.e., HMMs) was unable to resolve this region, but protein structural AI resolved it. This cryptic region was a hot spot for methylation at 32%, and the functions of many of its ORFs are still unknown. Our study represents a doorway into the complexity of the genomic repertoire of the only cultivated megaphage, highlighting five decades of continuous cultivation for the first time.
