Systematic benchmarking of small variant calling pipelines for long-read RNA sequencing data
Abstract
Background
Long-read RNA sequencing (lrRNA-seq) enables transcript-resolved variant detection, but systematic and neutral evaluations of small-variant calling pipelines remain limited. How existing tools perform across sequencing technologies, alignment strategies, variant callers, genomic contexts and downstream haplotype phasing is not fully understood.
Results
Here, we systematically benchmark four lrRNA-seq variant callers (Clair3-RNA, DeepVariant, longcallR, and longcallR-nn), along with a widely used short-read RNA-seq variant caller (GATK HaplotypeCaller) as a baseline, using Genome in a Bottle (GIAB) datasets comprising three cell lines sequenced with four Oxford Nanopore Technologies (ONT) and two PacBio library preparation protocols. We further evaluate the impact of upstream alignment strategies, including aligner choice and alignment transformation, on variant-calling performance. Accuracy is assessed across sequencing depths and genomic contexts. Additionally, we compare haplotype phasing tools (WhatsHap, LongPhase, HapCUT2, HiPhase and longcallR) using variant calls generated by different callers to identify optimal pipeline combinations. Finally, we extend our evaluation of variant-calling performance to more recent LongBench datasets.
Conclusions
Our benchmark shows that sequencing quality is the primary determinant of lrRNA-seq variant-calling performance, followed by variant caller and alignment strategy, with additional effects from genomic context. In GIAB datasets, all lrRNA-seq-specific callers performed reasonably well, with Clair3-RNA (across both ONT and PacBio) and DeepVariant (for PacBio) ranking among the top-performing methods. In more recent LongBench datasets of cancer cell lines, DeepVariant and longcallR showed higher sensitivity, whereas Clair3-RNA and longcallR-nn were more conservative, yielding fewer variant calls. For downstream haplotype phasing, we recommend WhatsHap (for its high phasing coverage) or HapCUT2 (for its high phasing accuracy) for most libraries, whereas longcallR performs better on both metrics for ONT dRNA004 datasets.