Assessing genome conservation on pangenome graphs with PanSel

Read the full article See related articles

Discuss this preprint

Start a discussion What are Sciety discussions?

Listed in

This article is not in any list yet, why not save it to one of your lists.
Log in to save this article

Abstract

Motivation

With more and more telomere-to-telomere genomes assembled, pangenomes make it possible to capture the genomic diversity of a species. Because they introduce less biases, pangenomes, represented as graphs, tend to supplant the usual linear representation of a reference genome, augmented with variations. However, this major change requires new tools adapted to this data structure. Among the numerous questions that can be addressed to a pangenome graph is the search for conserved or divergent genes.

Results

In this article, we present a new tool, named PanSel, which computes a conservation score for each segment of the genome, and finds genomic regions that are significantly conserved, or divergent. PanSel can be used on prokaryotes and eukaryotes, with a sequence identity not less than 98%.

Availability and implementation

PanSel, written in C++11 with no dependency, is available at https://github.com/mzytnicki/pansel.

Article activity feed