SaVanache: indexing and visualizing pangenome variation graphs
Abstract
With the rapid increase in genome sequencing and the growing availability of genomic resources, genomics is shifting toward pangenome representations that capture intra- and inter-specific diversity by integrating multiple genomes into a single entity. These pangenomes are increasingly modeled as graphs, encoding complex genomic variations in structures such as de Bruijn or variation graphs. However, while genome browsers provide standard and effective solutions for visualizing single or limited numbers of genomes, equivalent interactive tools for graph-based pangenomes remain limited, particularly for variation graph models.
We developed SaVanache, a multi-resolution visualization interface designed to explore pangenome variation graphs at various depths. SaVanache enables the exploration of both global diversity and structural variations (SVs) across genomes relative to a user-defined linear pivot genome. Unlike synteny viewers, SaVanache emphasizes variations by representing SV types through a dedicated set of glyphs, facilitating intuitive one-to-many comparisons. To support smooth exploration, SaVanache preprocesses a Graphical Fragment Assembly (GFA) pangenome file into optimized index and data structures, enabling fast, real-time queries on large pangenome graphs.
By combining advanced visualization techniques with efficient data handling, SaVanache provides a robust tool for scientists to analyze and visualize genetic variation within genomes and pangenomes, facilitating the identification of genetic determinants associated with phenotypes of interest and fully exploiting current genomic resources.
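The preprocessing step described above, turning a GFA file into lookup structures before any query is served, can be illustrated with a minimal sketch. This is not SaVanache's actual index format, which the abstract does not specify. It assumes GFA 1.0 input with explicit sequences on `S` lines and genomes encoded as `P` (path) lines, and the function name `index_gfa` and the toy graph are hypothetical.

```python
from collections import defaultdict


def index_gfa(lines):
    """Build simple lookup tables from GFA 1.0 text lines (illustrative only)."""
    segments = {}                    # segment id -> sequence length
    paths = {}                       # path name -> list of (segment id, orientation)
    seg_to_paths = defaultdict(set)  # segment id -> names of paths traversing it
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if fields[0] == "S":
            # S <name> <sequence>; assumes the sequence is given, not "*"
            segments[fields[1]] = len(fields[2])
        elif fields[0] == "P":
            # P <path name> <seg+,seg-,...> <overlaps>
            steps = [(step[:-1], step[-1]) for step in fields[2].split(",")]
            paths[fields[1]] = steps
            for seg_id, _ in steps:
                seg_to_paths[seg_id].add(fields[1])
    return segments, paths, seg_to_paths


# Toy graph: genomeB skips segment 2 (hypothetical data)
toy_gfa = [
    "H\tVN:Z:1.0",
    "S\t1\tACGT",
    "S\t2\tGG",
    "S\t3\tTTA",
    "P\tgenomeA\t1+,2+,3+\t*",
    "P\tgenomeB\t1+,3+\t*",
]
segments, paths, seg_to_paths = index_gfa(toy_gfa)
```

With tables like these built once, a question such as "which genomes traverse segment 2?" becomes a dictionary lookup rather than a rescan of the file, which is the general idea behind indexing for real-time queries.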
Author summary
We introduce SaVanache, a tool for exploring genomic resources through pangenome variation graphs (PVGs), which capture genomic diversity by integrating structural variants (SVs) and single nucleotide polymorphisms (SNPs) across multiple genomes. Unlike traditional genome browsers, which are limited to a few genomes, SaVanache offers a multi-level, user-friendly interface that lets users navigate from whole pangenomes down to individual structural variants. Using a linear pivot genome as a visual reference, SaVanache simplifies complex PVG structures into intuitive comparisons. It handles large datasets efficiently and speeds up data retrieval by parsing the input graph into internal data structures. The front-end, built with modern JavaScript frameworks, provides interactive and responsive visualization, while the Python/Django backend supports real-time data updates. Users can detect and classify SVs by comparing syntenic segments between genomes, visualized through a novel glyph-based system that uses shapes and colors to represent complex rearrangements. SaVanache supports seamless zooming from chromosome-wide to nucleotide-level views, interactive diversity scatterplots, dynamic pivot genome switching, and grouping of genomes by metadata to explore genotype-phenotype links. In addition, export functions connect the visualization to downstream bioinformatics analyses. Developed with user feedback, SaVanache balances biological relevance and computational efficiency, making the complexity of PVGs tractable and giving users detailed insight into genomic diversity and SVs.
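The pivot-based comparison mentioned above can be sketched in a simplified form. The summary does not describe SaVanache's SV classification rules, so the following is only an assumed presence/absence comparison: it walks a pivot path, tracks coordinates along the pivot, and reports segments that one genome has and the other lacks. The function name `compare_to_pivot`, the event labels, and the toy data are all hypothetical.

```python
def compare_to_pivot(pivot_steps, other_steps, seg_len):
    """Report segments present in only one of two paths, in pivot coordinates.

    pivot_steps / other_steps: lists of (segment id, orientation) tuples.
    seg_len: dict mapping segment id -> length in bases.
    """
    pivot_ids = {seg for seg, _ in pivot_steps}
    other_ids = {seg for seg, _ in other_steps}
    events = []
    offset = 0  # running position along the pivot genome
    for seg, _ in pivot_steps:
        end = offset + seg_len[seg]
        if seg not in other_ids:
            # Pivot sequence missing from the other genome (deletion-like)
            events.append(("absent_in_other", seg, offset, end))
        offset = end
    for seg, _ in other_steps:
        if seg not in pivot_ids:
            # Sequence with no pivot coordinate of its own (insertion-like)
            events.append(("absent_in_pivot", seg, None, None))
    return events


# Toy example: the other genome replaces segment 2 with segment 4
seg_len = {"1": 4, "2": 2, "3": 3, "4": 5}
pivot = [("1", "+"), ("2", "+"), ("3", "+")]
other = [("1", "+"), ("4", "+"), ("3", "+")]
events = compare_to_pivot(pivot, other, seg_len)
```

A real classifier would also need orientation changes and segment order to distinguish inversions, translocations, and duplications; this sketch ignores both and only shows how a linear pivot supplies the coordinate system for one-to-many comparison.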