A*PA2: up to 20 times faster exact global alignment

Read the full article See related articles

Listed in

This article is not in any list yet, why not save it to one of your lists.
Log in to save this article

Abstract

Methods

We introduce A*PA2, an exact global pairwise aligner with respect to edit distance. The goal of A*PA2 is to unify the near-linear runtime of A*PA on similar sequences with the efficiency of dynamic programming (DP) based methods. Like E dlib , A*PA2 uses Ukkonen’s band doubling in combination with Myers’ bitpacking. A*PA2 1) extends this with SIMD (single instruction, multiple data), 2) uses large block sizes inspired by B lock A ligner , 3) avoids recomputation of states where possible as suggested before by Fickett, 4) introduces a new optimistic technique for traceback based on diagonal transition, and 5) applies the heuristics developed in A*PA and improves them using pre-pruning .

Results

The average runtime of A*PA2 is 19 × faster than the exact aligners B i WFA and E dlib on > 500 kbp long ONT reads of a human genome having 6% divergence on average. On shorter ONT reads of 11% average divergence the speedup is 5.6 × (avg. length 11 kbp) and 0.81 × (avg. length 800 bp). On all tested datasets, A*PA2 is competitive with or faster than approximate methods.

Availability

github.com/RagnarGrootKoerkamp/astar-pairwise-aligner

Contact

ragnar.grootkoerkamp@inf.ethz.ch

Article activity feed