Exploring the double-stranded DNA viral landscape in eukaryotic genomes

Abstract

Genomic sequences of viral origin, especially those derived from retroviruses, constitute a large fraction of eukaryotic genomes, yet the role of double-stranded (ds) DNA viruses in shaping eukaryotic genomes remains underexplored. Here, we present a computational framework to identify dsDNA viral regions (VRs) in eukaryotic genomes. By systematically screening 37,254 eukaryotic genome assemblies, we identified 781,111 VRs in 7,103 genome assemblies. These VRs accounted for up to 16% of individual genomes and represented 343 associations between viral and eukaryotic taxa at the class level, of which 305 (89%) lack supporting evidence from isolation studies. Some VRs are phylogenetically nested within clades containing known viruses, while others form deep-branching clades composed solely of VR-derived sequences. Our study greatly expands the known dsDNA endogenous virosphere, and the resulting VR catalog offers opportunities for wider exploration of virus–host coevolutionary processes.