vartracker: an end-to-end tool for pathogen longitudinal variant analysis and visualisation

Abstract

Longitudinal sequencing can reveal fine-grained pathogen evolution during acute and chronic infections and inform public health responses. However, integrating ordered pathogen genomic data into a coherent evolutionary and clinical framework can be tedious and error-prone. We present vartracker, an open-source tool for longitudinal pathogen variant analysis and visualisation. Given an ordered sample manifest, vartracker supports three entry points: raw sequence reads, reference-aligned BAM files, or user-supplied VCF and coverage inputs. Raw-read and BAM inputs are processed through an integrated Snakemake workflow, whereas VCF mode starts from precomputed files. Variants are normalised and annotated relative to a reference genome, tracked across timepoints, and classified as original or newly emerging and as transient or persistent. Inferred amino acid changes are reported, and for SARS-CoV-2 analyses, relevant published literature for key mutations can be automatically linked through a functional database. vartracker outputs a schema-documented results table, provenance metadata for reproducibility, publication-quality static figures, and an interactive heatmap for data exploration. Although packaged with SARS-CoV-2 reference assets and initially developed for SARS-CoV-2 datasets, vartracker is pathogen-agnostic when appropriate reference data are supplied. We demonstrate its utility using SARS-CoV-2 and respiratory syncytial virus A (RSV-A) datasets. vartracker is freely available through GitHub, PyPI and Bioconda.
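
The two-axis classification described in the abstract can be illustrated with a minimal sketch. The definitions here are assumptions made for illustration, not vartracker's documented rules: a variant is treated as "original" if it is present in the first sample of the ordered manifest and "new" otherwise, and as "persistent" if it is present in the last sample and "transient" otherwise. The function names and variant labels are hypothetical and are not part of vartracker's API or any real dataset.

```python
from typing import Dict, List, Tuple


def classify_variant(presence: List[bool]) -> Tuple[str, str]:
    """Classify one variant from its presence/absence across ordered samples.

    Assumed definitions (illustrative only, not taken from vartracker):
      - "original" if present in the first sample, otherwise "new"
      - "persistent" if present in the last sample, otherwise "transient"
    """
    if not any(presence):
        raise ValueError("variant is absent from every sample")
    origin = "original" if presence[0] else "new"
    persistence = "persistent" if presence[-1] else "transient"
    return origin, persistence


def classify_all(calls: Dict[str, List[bool]]) -> Dict[str, Tuple[str, str]]:
    """Apply classify_variant to every variant in a presence matrix."""
    return {name: classify_variant(flags) for name, flags in calls.items()}


# Hypothetical presence/absence calls across three ordered timepoints.
example = {
    "variant_a": [True, True, True],     # seen from the start, still present
    "variant_b": [True, False, False],   # seen at the start, then lost
    "variant_c": [False, True, True],    # emerges later and remains
    "variant_d": [False, True, False],   # emerges later and disappears
}

labels = classify_all(example)
for name, (origin, persistence) in labels.items():
    print(f"{name}: {origin}, {persistence}")
```

A real analysis would also need to separate true absence from insufficient coverage at a site, which is presumably why coverage inputs accompany the VCF files; this sketch ignores that distinction.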
