Variant evolution graph: Can we infer how SARS-CoV-2 variants are evolving?

Badhan Das
Lenwood S. Heath

Read the full article

Discuss this preprint

Start a discussion What are Sciety discussions?

Listed in

This article is not in any list yet, why not save it to one of your lists.

Abstract

The SARS-CoV-2 virus has undergone extensive mutations over time, resulting in considerable genetic diversity among circulating strains. This diversity directly affects important viral characteristics, such as transmissibility and disease severity. During a viral outbreak, the rapid mutation rate produces a large cloud of variants, referred to as a viral quasispecies. However, many variants are lost due to the bottleneck of transmission and survival. Advances in next-generation sequencing have enabled continuous and cost-effective monitoring of viral genomes, but constructing reliable phylogenetic trees from the vast collection of sequences in GISAID (the Global Initiative on Sharing All Influenza Data) presents significant challenges.

We introduce a novel graph-based framework inspired by quasispecies theory, the Variant Evolution Graph (VEG), to model viral evolution. Unlike traditional phylogenetic trees, VEG accommodates multiple ancestors for each variant and maps all possible evolutionary pathways. The strongly connected subgraphs in the VEG reveal critical evolutionary patterns, including recombination events, mutation hotspots, and intra-host viral evolution, providing deeper insights into viral adaptation and spread. We also derive the Disease Transmission Network (DTN) from the VEG, which supports the inference of transmission pathways and super-spreaders among hosts.

We have applied our method to genomic data sets from five arbitrarily selected countries — Somalia, Bhutan, Hungary, Iran, and Nepal. Our study compares three methods for computing mutational distances to build the VEG, sourmash, pyani, and edit distance, with the phylogenetic approach using Maximum Likelihood (ML). Among these, ML is the most computationally intensive, requiring multiple sequence alignment and probabilistic inference, making it the slowest. In contrast, sourmash is the fastest, followed by the edit distance approach, while pyani takes more time due to its BLAST-based computations. This comparison highlights the computational efficiency of VEG, making it a scalable alternative for analyzing large viral data sets.

Version published to 10.1371/journal.pone.0323970
Jun 9, 2025
Version published to 10.1101/2024.09.13.612805 on bioRxiv
Sep 15, 2024

Dengue Virus Type 2: Global Epidemiology, Molecular Evolution, and Immune Response Insights

This article has 5 authors:
1. Qun Chen
2. Peipei Ye
3. Mengye Ma
4. Zhu Chen
5. Liming Jiang
This article has no evaluationsLatest version Jan 30, 2026
Rapid Phylogenomic Analysis of Thousands Outbreak‐Causing Viral Genomes Using Covary

This article has 1 author:
1. Marvin I. De los Santos
This article has no evaluationsLatest version Dec 22, 2025
Genomic characterization of SARS-CoV-2 variants circulating in the population of Bangui, Central African Republic (CAR) in 2022.

This article has 15 authors:
1. Pulchérie Pelembi
2. Philippe Colson
3. Alain Farra
4. Ornella Anne Sibiro-Demi
5. Christian Noël Malaka
6. Aurélia Kwasiborski
7. Véronique Hourdel
8. Gilles Landry Ngaya
9. Romaric Nzoumbou-Boko
10. Jean-Claude Manuguerra
11. Emmanuel Ryvalin Nakoune-Yandoko
12. Guy VERNET
13. Bernard La Scola
14. Valérie Caro
15. Alexandre Manirakiza
This article has no evaluationsLatest version Jan 19, 2026

Discuss this preprint

Listed in

Abstract

Article activity feed

Related articles

Dengue Virus Type 2: Global Epidemiology, Molecular Evolution, and Immune Response Insights

Rapid Phylogenomic Analysis of Thousands Outbreak‐Causing Viral Genomes Using Covary

Genomic characterization of SARS-CoV-2 variants circulating in the population of Bangui, Central African Republic (CAR) in 2022.