Singletrack: An Algorithm for Improving Memory Consumption and Performance of Gap-Affine Sequence Alignment
Discuss this preprint
Start a discussion What are Sciety discussions?Listed in
This article is not in any list yet, why not save it to one of your lists.Abstract
Motivation
Advances in DNA sequencing have outpaced advances in computation, making sequence alignment a major bottleneck in genome data analyses. Classical dynamic programming (DP) algorithms are particularly memory-intensive, especially when computing gap-affine and dual gap-affine alignments. Existing strategies to reduce memory consumption often sacrifice either speed or alignment accuracy.
Results
We present Singletrack, an efficient algorithm for backtrace gap-affine and dual gap-affine alignments that requires only storing a single DP matrix. Compared to classical DP algorithms, Singletrack removes the need to store additional matrices (i.e., 2 for gap-affine and 4 for dual gap-affine), significantly reducing memory consumption and, in turn, reducing pressure on the memory hierarchy and improving overall performance. Most importantly, Singletrack is a general backtrace method compatible with state-of-the-art DP-based algorithms and heuristics, such as the Suzuki-Kasahara (SK) and the Wavefront Alignment (WFA) algorithms. Our results demonstrate that Singletrack accelerates the SK implementation of KSW2, used within Minimap2, by up to 1.4×. Similarly, Singletrack enhances the performance of the WFA implementation in WFA2-lib by 1.2–2.1× while reducing memory usage by 3× for gap-affine and 5× for dual gap-affine. Compared to the efficient linear-memory BiWFA algorithm, the Singletrack-accelerated version of WFA trades a practical increase in memory usage for up to 5.2× higher performance.
Availability
All the implementations of the Singletrack algorithm presented in this work are available at https://github.com/LorienLV/singletrack .