Investigating the topological motifs of inversions in pangenome graphs
Abstract
Recent technological advances have accelerated the production of high-quality genome assemblies within species, driving the growing use of pangenome graphs in genetic diversity analyses. These graphs reduce reference bias in read mapping and improve variant discovery and genotyping, from SNPs to structural variants (SVs). In pangenome graphs, variants appear as bubbles, which can be detected by dedicated bubble-calling tools. Although these tools report essential information on the variant bubbles, such as their position and allele walks in the graph, they do not annotate the type of the detected variants. While simple SNPs, insertions, and deletions are easily distinguished by allele size, large balanced variants such as inversions are harder to pick out among the large number of unannotated bubbles. As a result, inversions and other types of large variants remain underexplored in pangenome graph benchmarks and analyses.
In this work, we focused on inversions, which have drawn renewed attention in evolutionary genomics in recent years, and aimed to assess how this type of variant is handled by state-of-the-art pangenome graph pipelines. We identified two distinct topological motifs for inversion bubbles, one path-explicit and one alignment-rescued, and developed a tool to annotate them from bubble-caller outputs. We constructed pangenome graphs from both simulated and real data using four state-of-the-art pipelines, and assessed the impact of inversion size, point-mutation divergence between genomes, and number of haplotypes on inversion representation and accuracy.
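To make the idea of annotating inversion bubbles from allele walks more concrete, the following is a minimal illustrative sketch, not the tool described above. It assumes that allele walks are given as GFA-style strings (`>` for forward and `<` for reverse traversal of a node), and that a path-explicit inversion appears as an alternative walk that shares the flanking nodes of the reference walk but traverses the interior nodes in reverse order and opposite orientation. The function names are hypothetical, and the alignment-rescued motif is not covered.

```python
import re

# Assumption: walks look like ">1>2<3", i.e. an orientation sign followed by a node ID.
STEP_PATTERN = re.compile(r"([><])([^><]+)")


def parse_walk(walk):
    """Turn a walk string into a list of (node_id, orientation) steps."""
    return [(node, "+" if sign == ">" else "-")
            for sign, node in STEP_PATTERN.findall(walk)]


def reverse_complement(steps):
    """Reverse the step order and flip each orientation."""
    flip = {"+": "-", "-": "+"}
    return [(node, flip[orientation]) for node, orientation in reversed(steps)]


def looks_like_explicit_inversion(ref_walk, alt_walk):
    """Hypothetical check: same flanking nodes, interior reverse-complemented."""
    ref = parse_walk(ref_walk)
    alt = parse_walk(alt_walk)
    # A bubble needs a source, a sink, and at least one interior node.
    if len(ref) < 3 or len(ref) != len(alt):
        return False
    # Both alleles must enter and leave the bubble through the same oriented nodes.
    if ref[0] != alt[0] or ref[-1] != alt[-1]:
        return False
    return alt[1:-1] == reverse_complement(ref[1:-1])
```

For example, under these assumptions `looks_like_explicit_inversion(">1>2>3>4", ">1<3<2>4")` flags the bubble as an inversion, whereas a walk pair such as `">1>2>4"` and `">1>4"` (a deletion) is not flagged. Real bubble-caller output is likely messier, for instance when SNPs or indels are nested inside the inverted segment, so an exact match like this one would miss such cases.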
Our results reveal substantial differences between pipelines, with many inversions either misrepresented or lost. Most notably, recovery rates remain strikingly low even with the simplest simulated genome sets, highlighting major challenges in analyzing inversions with pangenomic approaches.