skalo: using SKA split k-mers with coloured de Bruijn graphs to genotype indels

Abstract

Insertions and deletions (indels) are important contributors to the genetic diversity and evolution of pathogens such as Mycobacterium tuberculosis. However, identifying them accurately from genomic data remains challenging with current variant calling methods. We present skalo, a graph-based algorithm that complements the popular split k-mer approach implemented in the SKA software. skalo is designed for alignment-free inference of indels between closely related haploid genomes, a class of variants that SKA ignores. The graph traversal implemented in skalo enables rapid detection of indels and complex variants while retaining the speed and alignment-free advantages of SKA. Through benchmarking on simulated and real Mycobacterium tuberculosis data, we demonstrate its ability to identify indels and complex variants with high precision, and explore their utility as phylogenetic markers for resolving relationships among isolates. By providing an efficient and easy-to-use method to extract additional variants from genomic data, skalo can enhance our understanding of pathogen evolution and transmission, with potential applications across diverse pathogen species. skalo is written in Rust and is freely available at https://github.com/rderelle/skalo.
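
To make the split k-mer idea concrete, the following Python sketch may help. It is an illustration only, not code from SKA or skalo: the function names `split_kmers` and `middle_base_differences`, the flank length of 3, and the toy sequences are all assumptions chosen for readability. A split k-mer is treated here as a pair of flanking sequences around one middle base; two genomes are compared by looking for identical flank pairs whose middle bases differ. The toy deletion shows one plausible reason a flank-matching comparison overlooks indels: removing a base changes every flank pair that spans it, so no pair is left to compare.

```python
from collections import defaultdict


def split_kmers(seq, flank=3):
    """Map each (left flank, right flank) pair to the set of middle bases seen between them."""
    table = defaultdict(set)
    width = 2 * flank + 1  # left flank + middle base + right flank
    for i in range(len(seq) - width + 1):
        window = seq[i:i + width]
        left, middle, right = window[:flank], window[flank], window[flank + 1:]
        table[(left, right)].add(middle)
    return table


def middle_base_differences(seq_a, seq_b, flank=3):
    """Return flank pairs shared by both sequences whose single middle base differs."""
    a = split_kmers(seq_a, flank)
    b = split_kmers(seq_b, flank)
    differences = {}
    for key in a.keys() & b.keys():
        # Only unambiguous flank pairs (one middle base per sequence) are compared.
        if len(a[key]) == 1 and len(b[key]) == 1 and a[key] != b[key]:
            differences[key] = (next(iter(a[key])), next(iter(b[key])))
    return differences


# Hypothetical toy sequences (not real data).
reference = "ACGTTGCATGGA"
with_snp = "ACGTTGAATGGA"       # C -> A substitution at index 6
with_deletion = "ACGTTCATGGA"   # G at index 5 deleted

print(middle_base_differences(reference, with_snp))
print(middle_base_differences(reference, with_deletion))
```

In this toy case the substitution is recovered from the shared flank pair `("TTG", "ATG")`, whereas the deletion yields no shared flank pair at all. Recovering such variants requires a different strategy, which in skalo is a traversal of a coloured de Bruijn graph; that step is not shown here.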
