Accurate Assembly of Full-length Consensus for Viral Quasispecies

Abstract

Viruses can inhabit their hosts as an ensemble of mutant strains. Reconstructing a robust consensus representation of these diverse strains is essential for recognizing the genetic variation among them and for studying virulence, pathogenesis, and therapy selection. Virus genomes are typically small, often only a few thousand to several hundred thousand nucleotides long. Although constructing a high-quality consensus of virus strains might therefore seem feasible, most current assemblers generate only fragmented contigs. Assembling a single full-length consensus contig matters because it is vital for identifying genetic diversity and for accurately estimating strain abundance. In this paper, we present FC-Virus, a de novo genome assembly strategy that specifically targets highly diverse viral populations. FC-Virus first identifies the homologous k-mers present in the majority of viral strains, and then uses these k-mers as a backbone to build a full-length consensus sequence covering the entire genome. We benchmark FC-Virus against state-of-the-art genome assemblers. Experimental results confirm FC-Virus's ability to construct a single, accurate, full-length consensus, whereas the other assemblers generate only fragmented contigs. FC-Virus is freely available at https://github.com/TheyuGao/FC-Virus.
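To make the backbone idea concrete, the following is a minimal illustrative sketch, not the FC-Virus implementation. It assumes that "present in the majority of strains" can be approximated by a k-mer abundance threshold relative to the most frequent k-mer, and that the backbone can be extended greedily one base at a time. The function names (`count_kmers`, `greedy_consensus`), the parameter `min_fraction`, and the toy reads are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every k-mer occurring in the reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def greedy_consensus(reads, k=5, min_fraction=0.5):
    """Build one contig from abundant k-mers (illustrative assumption only).

    A k-mer joins the backbone if its count is at least `min_fraction`
    of the highest k-mer count; this stands in for "shared by most strains".
    """
    counts = count_kmers(reads, k)
    if not counts:
        return ""
    top = max(counts.values())
    backbone = {km: c for km, c in counts.items() if c >= min_fraction * top}

    # Seed with the most abundant backbone k-mer (ties broken lexicographically).
    seed = max(backbone, key=lambda km: (backbone[km], km))
    contig, used = seed, {seed}

    # Extend to the right while an unused backbone k-mer overlaps by k-1 bases.
    while True:
        suffix = contig[-(k - 1):]
        options = [(backbone[suffix + b], b) for b in "ACGT"
                   if suffix + b in backbone and suffix + b not in used]
        if not options:
            break
        _, base = max(options)
        used.add(suffix + base)
        contig += base

    # Extend to the left in the same way.
    while True:
        prefix = contig[:k - 1]
        options = [(backbone[b + prefix], b) for b in "ACGT"
                   if b + prefix in backbone and b + prefix not in used]
        if not options:
            break
        _, base = max(options)
        used.add(base + prefix)
        contig = base + contig

    return contig


# Toy data: three reads from a majority strain, one from a variant with one SNP.
majority = "ATGCGTACCTGAAGTCCA"
minority = "ATGCGTACCAGAAGTCCA"
reads = [majority] * 3 + [minority]
print(greedy_consensus(reads, k=5, min_fraction=0.5))  # prints the majority sequence
```

In this toy case the k-mers unique to the minority variant fall below the threshold, so the extension follows a single unbranched path and yields one full-length contig. Real data would also require handling of reverse complements, sequencing errors, and repeats, none of which this sketch attempts.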
