Comprehensive benchmarking with guidelines for analyzing transposable element-derived RNA expression

Jianqi She
Jiadong Wang
Ence Yang

Abstract

Transposable element-derived RNAs (teRNAs) are increasingly recognized for their fundamental or pathogenic roles, especially in humans. Despite the rapid development of computational methods, best practices for accurate identification and quantification of teRNAs are currently lacking, owing to the difficulty of evaluation. Here we benchmark 16 representative tools on 120 simulated datasets and 60 real-world paired datasets (comprising both long- and short-read data), evaluating the performance of teRNA identification or quantification at the family, unit, exon, and transcript levels. Our findings show not only that the exon level offers a trade-off between accuracy and resolution for teRNA analysis, but also that the evaluated methods have level-dependent strengths and weaknesses. To distill our benchmarking results, we present decision-tree-style guidelines and develop an integrated best-practice pipeline, which together serve as a basis for future functional research. In addition, our evaluation framework provides a gold standard for developing and benchmarking better computational tools in the field.

Version published to 10.1101/2025.09.30.679421 on bioRxiv
Oct 1, 2025
