A ratio-based framework using Quartet reference materials for integrating long- and short-read RNA-seq

Qingwang Chen
Xiaorou Guo
Duo Wang
Jiaxin Zhao
Yang Xu
Yupei You
Yuanbang Mai
Shumeng Duan
Yaqing Liu
Yutong Zhang
Xiaojing Li
Hu Chen
Wanwan Hou
Ying Yu
Lianhua Dong
Jinming Li
Matthew E. Ritchie
Rui Zhang
Leming Shi
Yuanting Zheng

Abstract

Long-read RNA sequencing (lrRNA-seq) enables full-length transcript profiling but is confounded by technical batch effects that compromise quantification and prevent data integration across platforms, protocols, and laboratories. The lack of a transcriptome-wide biological ground truth has hindered objective benchmarking. To address both challenges, we leveraged certified Quartet reference materials to generate one of the largest multi-center lrRNA-seq resources to date: over one billion long reads from 144 libraries across four PacBio and Nanopore protocols in four independent laboratories. We first established that ratio-based quantification against built-in reference samples effectively removes technical noise, revealing underlying biological signals. We then constructed the first ratio-based reference datasets for full-length transcripts, comprising 10,218 isoforms and 6,032 alternative splicing (AS) events, and orthogonally validated them with RT–qPCR. Finally, a comprehensive benchmark using these ground truths revealed that a hybrid strategy integrating long- and short-read data (hybrid-seq) achieves the highest quantification accuracy for both isoforms and AS events. Our work provides a foundational framework and resource for evaluating lrRNA-seq technologies and accelerating the standardization of full-length transcriptomics for research and clinical applications.
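
The idea behind ratio-based quantification can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `log2_ratios`, the sample labels, the pseudocount, and the toy expression values are all assumptions made for illustration. Each sample's expression is expressed as a log2 ratio to a reference sample profiled in the same batch, so a multiplicative technical shift shared by the whole batch cancels out.

```python
import math


def log2_ratios(batch, reference, pseudocount=1.0):
    """Convert one batch's expression values into log2 ratios against
    the reference sample profiled in that same batch (hypothetical helper)."""
    ref = batch[reference]
    return {
        sample: {
            feature: math.log2((value + pseudocount) / (ref[feature] + pseudocount))
            for feature, value in profile.items()
        }
        for sample, profile in batch.items()
        if sample != reference
    }


# Toy data (invented): the same two samples measured in two batches.
# batch_b carries a 4x multiplicative technical shift on every value.
batch_a = {
    "ref": {"iso1": 10.0, "iso2": 40.0},
    "s1": {"iso1": 20.0, "iso2": 20.0},
}
batch_b = {
    name: {feature: 4 * value for feature, value in profile.items()}
    for name, profile in batch_a.items()
}

# A pseudocount of 0 keeps the toy arithmetic exact; real data with zeros
# would need a positive pseudocount.
ratios_a = log2_ratios(batch_a, "ref", pseudocount=0.0)
ratios_b = log2_ratios(batch_b, "ref", pseudocount=0.0)

print(ratios_a)
print(ratios_b)
```

In this toy case the absolute values differ fourfold between batches, yet the ratio profiles agree, which is the property that makes ratio-scaled data comparable across batches. Real batch effects are rarely a single uniform scale factor, so this only conveys the intuition.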

Version published to 10.1101/2025.09.15.676287 on bioRxiv
Sep 17, 2025
