opjMap: A Sensitive Mapper for Repetitive Structural Variations in Long Noisy Reads Based on Orthogonal Projection

Abstract

Background: Continued advances in single-molecule sequencing (SMS) technologies, including PacBio Single Molecule Real-Time and Oxford Nanopore Technologies (ONT), have significantly increased read lengths, opening up a wide range of genomic applications. However, these long reads have higher sequencing error rates and often contain repetitive segments, which most existing alignment tools struggle to map effectively. Given the crucial role that repetitive variations play in biological evolution, we introduce opjMap, an alignment tool based on orthogonal projection localization. It is specifically designed to align long, noisy SMS reads to a reference sequence while accommodating repetitive structural variations (SVs).

Results: In extensive benchmark experiments on both simulated and real SMS datasets, opjMap is more sensitive than other mainstream alignment tools such as minimap2, NGMLR, and Winnowmap2, aligning more reads and more bases to the reference genome. opjMap also produces more alignments under the challenging conditions of high error rates and short repetitive segments.

Conclusions: opjMap provides a robust and highly sensitive solution for mapping noisy long reads that contain repetitive structural variations. It supports multi-threaded alignment. The source code is publicly available at https://github.com/FanXingGuo/opjMap.
