Genome-resolved expansion of Nucleocytoviricota and Mirusviricota reveals new diversity, functional potential, and biotechnological applications

Abstract

Nucleocytoplasmic large DNA viruses (NCLDV) of the phylum Nucleocytoviricota and viruses of the newly proposed Duplodnaviria phylum Mirusviricota exhibit a taxonomic richness that continues to expand through metagenomic sequencing of Earth's biomes. Giant viruses have complex genomes encoding genes of both viral and cellular origin, representing a reservoir of unexplored biological functions with potential implications for ecology, evolution, and biotechnology. Here, we present the largest curated database of giant virus metagenome-assembled genomes (GVMAGs V2), comprising 8,508 species-level clusters inferred from 18,727 genomes originating from marine, freshwater, anthropogenic, and terrestrial environments, a six-fold increase over previous giant virus phylogenetic frameworks. Phylogenomics and relative evolutionary distance analysis revealed 712 novel genera, 13 previously unknown viral families, and a newly proposed order, tentatively named Mycodnavirales. We improved gene calling for 12% of giant virus genomes by accounting for alternative and custom genetic codes, enabling more accurate identification of protein-coding genes. Database mining uncovered endogenous viral elements in a broad spectrum of eukaryotes, spanning algae, fungi, and parasitic protists, highlighting that giant virus integration is both widespread and evolutionarily persistent. Orthologous clustering of 2.5 million proteins identified 135,998 orthogroups representing comprehensive metabolic capabilities, including an enrichment in Algavirales genomes of genes involved in aromatic compound degradation (commonly associated with bioremediation). Furthermore, we detected widespread biosynthetic gene clusters underpinning antimicrobial activity and antibiotic resistance, suggesting roles for giant viruses in host defense and in the dissemination of antibiotic resistance genes. Nevertheless, 67% of orthogroups have unknown functions, underscoring a substantial unexplored potential.
This comprehensive, publicly available database provides a critical resource for the giant virus research community and a foundation for uncovering virus-host interactions, exploring viral evolution, and identifying reservoirs of novel enzymes with the potential to advance biotechnological applications.
