linearPOA: A parallel, memory-efficient framework for Partial Order Alignment with linear space complexity

Yanming Wei
Zhaoyang Huang
Pinglu Zhang
Qinzhong Tian
Yan Li
Quan Zou
Liang Yu

Read the full article

Discuss this preprint

Start a discussion What are Sciety discussions?

Listed in

This article is not in any list yet, why not save it to one of your lists.

Abstract

Multiple sequence alignment (MSA) is a fundamental problem in computational bioinformatics, playing a critical role in genome biology, especially in long read sequencing and assembly. One solution for representing and solving MSA is Partial Order Alignment (POA), which employs Directed Acyclic Graphs (DAGs) to represent sequence relationships. However, when facing the ultra-long, error-prone reads (e.g., >100 kbps), existing POA algorithms with quadratic space complexity become impractical due to excessive memory consumption. This paper introduces the linearPOA, which based on divide-and-conquer strategy to solve the POA, aimed at saving memory compared to quadratic space complexity algorithms like SPOA, abPOA and TSTA. Particularly notable is its capability to save up to 102.74 times memory usage when aligning sequences with 100 kbp reads, compared to the abPOA method using non-heuristic methods. The algorithm was implemented within the linearPOA library, providing functionality for POA and foundational support for sequencing analysis, like error correction for reads. The linearPOA algorithm provides memory-efficient algorithms for long-read sequencing, especially in directly assembling long reads like 100 kbp reads.

Availability

The linearPOA library is freely available at https://github.com/malabz/linearPOA , and the data underlying this article are available in Zenodo, at https://doi.org/10.5281/zenodo.15637837 .

Supplementary information

Supplementary information are available at BioRxiv online.

Version published to 10.64898/2026.04.27.720899 on bioRxiv
Apr 30, 2026

Rapid-PFP: Accelerating Prefix-Free Parsing with GPU Parallelism

This article has 5 authors:
1. Eddie Ferro
2. Tyler Pencinger
3. Oded Green
4. Mahsa Lotfollahi
5. Christina Boucher
This article has no evaluationsLatest version May 1, 2026
Why phylogenies compress so well: combinatorial guarantees under the Infinite Sites Model

This article has 2 authors:
1. Veronika Hendrychová
2. Karel Břinda
This article has no evaluationsLatest version Mar 27, 2026
A run-length-compressed skiplist data structure for dynamic GBWTs supports time and space efficient pangenome operations over syncmers

This article has 1 author:
1. Richard Durbin
This article has no evaluationsLatest version Mar 29, 2026

Discuss this preprint

Listed in

Abstract

Availability

Supplementary information

Article activity feed

Related articles

Rapid-PFP: Accelerating Prefix-Free Parsing with GPU Parallelism

Why phylogenies compress so well: combinatorial guarantees under the Infinite Sites Model

A run-length-compressed skiplist data structure for dynamic GBWTs supports time and space efficient pangenome operations over syncmers