Benchmark of long non-coding RNA quantification for RNA sequencing of cancer samples

Hong Zheng
Kevin Brennan
Mikel Hernaez
Olivier Gevaert

Listed in

Evaluated articles (GigaScience)

Abstract

Background

Long non-coding RNAs (lncRNAs) are emerging as important regulators of various biological processes. Many studies have exploited public resources such as RNA sequencing (RNA-Seq) data in The Cancer Genome Atlas to study lncRNAs in cancer; for such analyses, it is crucial to choose the optimal method for accurate expression quantification.

Results

In this study, we compared the performance of the pseudoalignment methods Kallisto and Salmon, the alignment-based transcript quantification method RSEM, and the alignment-based gene quantification methods HTSeq and featureCounts, in combination with the read aligners STAR, Subread, and HISAT2, for lncRNA quantification, applying them to both unstranded and stranded RNA-Seq datasets. Full transcriptome annotation, including protein-coding and non-coding RNAs, greatly improves the specificity of lncRNA expression quantification. Pseudoalignment methods and RSEM outperform HTSeq and featureCounts for lncRNA quantification in both sample-level and gene-level comparisons, regardless of RNA-Seq protocol type, choice of aligner, and transcriptome annotation. Pseudoalignment methods and RSEM detect more lncRNAs and correlate highly with simulated ground truth. In contrast, HTSeq and featureCounts often underestimate lncRNA expression. Antisense lncRNAs are poorly quantified by alignment-based gene quantification methods; their quantification can be improved by using stranded protocols and pseudoalignment methods.
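The sample-level and gene-level comparisons against simulated ground truth mentioned above can be illustrated with a minimal sketch. It assumes Spearman rank correlation as the agreement measure, which the abstract does not specify, and it uses made-up toy values. The gene names (lncA to lncD) and the function names are hypothetical and are not taken from the study.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; returns NaN if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / sqrt(sxx * syy)


def spearman(x, y):
    """Spearman correlation computed as Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def sample_level(truth, estimate):
    """One correlation per sample, computed across all genes."""
    genes = sorted(truth)
    n_samples = len(truth[genes[0]])
    return [
        spearman([truth[g][s] for g in genes], [estimate[g][s] for g in genes])
        for s in range(n_samples)
    ]


def gene_level(truth, estimate):
    """One correlation per gene, computed across all samples."""
    return {g: spearman(truth[g], estimate[g]) for g in truth}


# Hypothetical toy data: gene -> expression in three samples (made-up values).
truth = {
    "lncA": [0.0, 2.0, 5.0],
    "lncB": [1.0, 4.0, 3.0],
    "lncC": [10.0, 8.0, 12.0],
    "lncD": [3.0, 0.5, 7.0],
}
estimate = {
    "lncA": [0.0, 1.5, 4.0],
    "lncB": [0.5, 3.0, 3.5],
    "lncC": [9.0, 8.5, 11.0],
    "lncD": [2.0, 1.0, 6.0],
}

per_sample = sample_level(truth, estimate)
per_gene = gene_level(truth, estimate)
print(per_sample)
print(per_gene)
```

In this sketch a sample-level score asks whether a method ranks genes correctly within one sample, while a gene-level score asks whether it tracks one gene correctly across samples. A method can do well on the first and poorly on the second, which is one reason to report both.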

Conclusions

Considering both consistency with ground truth and computational resources, we recommend the pseudoalignment methods Kallisto or Salmon, in combination with full transcriptome annotation, for RNA-Seq analysis of lncRNAs.

GigaScience
Mar 14, 2022

A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giz145), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

These peer reviews were as follows:

Reviewer 1: http://dx.doi.org/10.5524/REVIEW.102945

Reviewer 2: http://dx.doi.org/10.5524/REVIEW.102946

Reviewer 3: http://dx.doi.org/10.5524/REVIEW.102947

Reviewer 4: http://dx.doi.org/10.5524/REVIEW.102948

Reviewer 5: http://dx.doi.org/10.5524/REVIEW.102949

Version published to 10.1093/gigascience/giz145
Dec 1, 2019
Version published to 10.1101/241869 on bioRxiv
Jan 2, 2018

