LazypipeX: Customizable Virome Analysis Pipeline Enabling Fast and Sensitive Virus Discovery from NGS data
Abstract
Metagenomic next-generation sequencing (mNGS) is pivotal for detecting known and novel viruses in diverse samples; however, its efficacy depends on robust bioinformatics pipelines. We present LazypipeX, which offers multiple advanced features, including flexible annotation strategies that can be adjusted to match different datasets and applications. Using a synthetic benchmark, we show that a simple two-round annotation strategy, designed for the speed and low false-positive rate required in virus diagnostics, can reduce execution time to a fraction of that of a BLASTN search without any loss in accuracy. Additionally, using real data, we show that annotation strategies based on combinations of fast-to-accurate homology searches reduce execution time 5- to 20-fold compared to the BLASTN/BLASTP baseline. Using one of these “annotation chains”, we characterized multiple novel complete viral genomes that were missed by the Lazypipe v1/v2 and CZ ID analyses. LazypipeX is a highly efficient and versatile tool for virome analysis, offering customizable and transparent workflows that can facilitate virus discovery and identification in diverse mNGS applications.
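The idea of chaining fast-to-accurate searches can be illustrated with a minimal sketch. This is not LazypipeX code and does not reflect its configuration or interfaces. It assumes only the general principle that each round annotates what it can and passes the remaining sequences to the next, slower round. The names run_annotation_chain, make_exact_search and make_kmer_search are hypothetical, the two toy searches merely stand in for real homology search tools, and the sequences and labels are invented for illustration.

```python
from typing import Callable, Dict, List, Tuple

# A search maps {contig_id: sequence} to {contig_id: label} for the contigs it can label.
Search = Callable[[Dict[str, str]], Dict[str, str]]


def run_annotation_chain(
    contigs: Dict[str, str], chain: List[Tuple[str, Search]]
) -> Tuple[Dict[str, Tuple[str, str]], List[str]]:
    """Run searches in order; each round only sees contigs still unannotated."""
    annotations: Dict[str, Tuple[str, str]] = {}
    remaining = dict(contigs)
    for round_name, search in chain:
        if not remaining:
            break  # everything is labelled, so later (slower) rounds are skipped
        hits = search(remaining)
        for contig_id, label in hits.items():
            if contig_id in remaining:
                annotations[contig_id] = (round_name, label)
                del remaining[contig_id]
    return annotations, sorted(remaining)


def make_exact_search(reference: Dict[str, str]) -> Search:
    """Toy 'fast' search (hypothetical): exact sequence lookup only."""
    def search(queries: Dict[str, str]) -> Dict[str, str]:
        return {cid: reference[seq] for cid, seq in queries.items() if seq in reference}
    return search


def make_kmer_search(reference: Dict[str, str], k: int = 4) -> Search:
    """Toy 'sensitive' search (hypothetical): any shared k-mer counts as a hit."""
    def search(queries: Dict[str, str]) -> Dict[str, str]:
        hits: Dict[str, str] = {}
        for cid, seq in queries.items():
            kmers = {seq[i:i + k] for i in range(len(seq) - k + 1)}
            for ref_seq, label in reference.items():
                if any(kmer in ref_seq for kmer in kmers):
                    hits[cid] = label
                    break
        return hits
    return search


# Invented toy data: two reference sequences and three query contigs.
reference = {"ACGTACGT": "virus_A", "TTTTGGGG": "virus_B"}
contigs = {"c1": "ACGTACGT", "c2": "CCTTTTGGCC", "c3": "AAAAAAAAAA"}

chain = [
    ("fast", make_exact_search(reference)),
    ("sensitive", make_kmer_search(reference)),
]
annotations, unannotated = run_annotation_chain(contigs, chain)
print(annotations)
print(unannotated)
```

In this toy run the exact-match round labels c1, the k-mer round sees only c2 and c3 and labels c2, and c3 stays unannotated. The saving in a chain of this kind comes from the slower round processing fewer queries, which is the general intuition behind ordering searches from fast to accurate.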