A genome-resolved view of the wastewater RNA virome

Abstract

Sequencing-based wastewater surveillance is emerging as an important tool in pathogen-agnostic threat detection, potentially enabling identification of threats before they are captured by clinical surveillance systems. However, sequences from viruses that infect humans are typically low in abundance in wastewater, and much of the data is unclassifiable at the read level. This presents a challenge: genomes of novel pathogens of interest may not assemble well, yet read-based methods cannot currently distinguish novel unclassified sequences from previously seen ones. Using ultra-deep untargeted sequencing of the wastewater RNA virome performed by the CASPER consortium (321 samples), we constructed a wastewater virus genome database (“WVDB”) with the goal of expanding the set of available high-quality, non-redundant reference genomes. The first version of this database contains 21,015 near-complete viral genomes, the majority of which (79%) are ssRNA bacteriophages. We additionally recovered genomes for putative plant- and vertebrate-infecting viruses, human enteric viruses, and viruses whose host could not be predicted. Fewer than 4,000 genomes had matches in previously published virus genome databases, and WVDB captured around one fifth of the reads that could not be classified by Kraken2. Further expansion of WVDB will provide a comprehensive resource of RNA virus genomes for characterizing viral diversity and dynamics in wastewater across space and time.
