TARPON - a Telomere Analysis and Research Pipeline Optimized for Nanopore
Abstract
Background
Long-read sequencing has transformed many areas of biology and holds significant promise for telomere research by enabling chromosome arm–specific telomere length analysis at nucleotide-level resolution in both model organisms and humans. However, the adoption of new technologies, particularly in clinical or diagnostic contexts, requires careful validation to identify potential technical and computational limitations.
Results
We present TARPON (Telomere Analysis and Research Pipeline Optimized for Nanopore), a best-practices Nextflow pipeline designed for the analysis of telomeres sequenced on the Oxford Nanopore Technologies (ONT) platform. TARPON can be executed via the command line or integrated into ONT’s EPI2ME agent, providing a user-friendly graphical interface for those without computational training. Nextflow’s container-based architecture eliminates dependency conflicts, thereby streamlining deployment across platforms.
TARPON isolates telomeric repeat–containing reads, assigns strand specificity, and identifies enrichment probes that can be used both for demultiplexing and for confirming capture-based library preparation. To ensure that the analysis is restricted to full-length telomeres, reads lacking a capture probe or non-telomeric sequence on the opposite end are excluded. A sliding-window approach defines the subtelomere-to-telomere boundary, followed by quality filtering to remove low-quality or subtelomeric reads that passed earlier steps. The pipeline generates customizable statistics, text-based summaries, and publication-ready visualizations (HTML, PNG, PDF). While default settings are optimized for diagnostic workflows, all parameters are easily adjustable via the GUI or command line to support diverse applications. These include telomere analyses in variant-rich samples (e.g., ALT-positive tumors) and organisms with non-canonical telomeric repeats such as some insects (GTTAG) and certain plants (GGTTTAG).
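The sliding-window boundary step described above can be illustrated with a minimal Python sketch. This is not TARPON's implementation: the function names, the window size, step, and density threshold are all hypothetical values chosen for illustration, the read is assumed to be oriented subtelomere-to-telomere, and only exact copies of the repeat are counted (variant repeats would need a more tolerant matcher).

```python
import re
from typing import List, Optional


def repeat_coverage(seq: str, repeat: str) -> List[bool]:
    """Mark every base that lies inside an exact copy of `repeat`."""
    covered = [False] * len(seq)
    # A zero-width lookahead finds overlapping and back-to-back matches.
    for m in re.finditer(f"(?={re.escape(repeat)})", seq):
        for i in range(m.start(), m.start() + len(repeat)):
            covered[i] = True
    return covered


def find_boundary(
    seq: str,
    repeat: str = "TTAGGG",     # canonical vertebrate repeat; configurable
    window: int = 60,           # hypothetical window size (bp)
    step: int = 6,              # hypothetical step size (bp)
    min_density: float = 0.8,   # hypothetical repeat-density threshold
) -> Optional[int]:
    """Return the index where the telomeric tract starts, or None.

    Assumes the telomere sits at the 3' end of `seq`. Windows are slid
    from the telomeric end toward the subtelomere until the fraction of
    repeat-covered bases drops below `min_density`.
    """
    if len(seq) < window:
        return None
    covered = repeat_coverage(seq, repeat)
    boundary = None
    start = len(seq) - window
    while start >= 0:
        density = sum(covered[start:start + window]) / window
        if density < min_density:
            break
        boundary = start
        start -= step
    if boundary is None:
        return None
    # Snap to the first repeat-covered base inside the last passing window.
    return next(i for i in range(boundary, boundary + window) if covered[i])


# Synthetic read: 120 bp of non-telomeric sequence, then 50 repeat copies.
read = "ACGTACGTAC" * 12 + "TTAGGG" * 50
start_of_telomere = find_boundary(read)
telomere_length = len(read) - start_of_telomere
```

Passing a different `repeat` (for example a non-canonical motif) reuses the same logic, which mirrors the abstract's point that the repeat unit is a tunable parameter rather than a fixed assumption.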
Conclusions
TARPON is the first complete and experimentally validated pipeline for Nanopore-based telomere analysis requiring no data pre-processing or prior bioinformatics expertise, while offering flexibility for advanced users.