Integrating single-cell and single-nucleus datasets improves bulk RNA-seq deconvolution

Adriana Ivich
Casey S. Greene


Abstract

Bulk RNA-seq deconvolution typically uses single-cell RNA-sequencing (scRNA-seq) references, but some cell types are only detectable through single-nucleus RNA sequencing (snRNA-seq). Because snRNA-seq captures nuclear, but not cytoplasmic, transcripts, direct use as a reference could reduce deconvolution accuracy. Here, we systematically benchmark strategies to integrate both modalities, focusing on transformations and gene-filtering approaches that harmonize snRNA-seq with scRNA-seq references. Across four diverse tissues, we evaluated principal component–based shifts, conditional and non-conditional variational autoencoders (scVI), and the removal of cross-modality differentially expressed genes (DEGs). While all methods improved performance relative to untransformed snRNA-seq, filtering consistent cross-modality DEGs delivered the greatest gains, often matching or surpassing scRNA-only references. Conditional scVI performed comparably and was especially effective when matched scRNA–snRNA cell types were unavailable. In real adipose bulk samples without ground truth, DEG pruning and conditional scVI provided the most robust cell-fraction estimates across donors and transformations. Together, these results demonstrate that scRNA-seq should be prioritized as the reference when available, with snRNA-seq appended only after filtering cross-modality DEGs. For less-characterized systems where DEG information is limited, conditional scVI offers a practical alternative. Our findings provide clear guidelines for modality-aware integration, enabling near-scRNA-seq accuracy in bulk deconvolution workflows.
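The DEG-filtering strategy described in the abstract can be illustrated with a minimal sketch. The abstract does not say how cross-modality DEGs are called, so this example assumes a simple log2 fold-change threshold on per-cell-type mean expression, and treats a gene as "consistent" when it passes the threshold in the same direction in every cell type shared by both modalities. The function names, threshold, pseudocount, and toy gene names are hypothetical and are not taken from the preprint.

```python
import math


def consistent_cross_modality_degs(sc_means, sn_means,
                                   lfc_threshold=1.0, pseudocount=1.0):
    """Flag genes that differ between snRNA-seq and scRNA-seq consistently.

    sc_means / sn_means map cell type -> {gene: mean expression}.
    A gene is flagged when its sn-vs-sc log2 fold change passes the
    threshold, with the same sign, in every shared cell type.
    (Assumed stand-in for a proper differential expression test.)
    """
    shared_types = sorted(set(sc_means) & set(sn_means))
    if not shared_types:
        return set()

    up_everywhere = None
    down_everywhere = None
    for cell_type in shared_types:
        sc_profile = sc_means[cell_type]
        sn_profile = sn_means[cell_type]
        up, down = set(), set()
        for gene in set(sc_profile) & set(sn_profile):
            lfc = math.log2((sn_profile[gene] + pseudocount)
                            / (sc_profile[gene] + pseudocount))
            if lfc >= lfc_threshold:
                up.add(gene)
            elif lfc <= -lfc_threshold:
                down.add(gene)
        # Keep only genes flagged in the same direction in all cell types so far.
        up_everywhere = up if up_everywhere is None else up_everywhere & up
        down_everywhere = down if down_everywhere is None else down_everywhere & down

    return up_everywhere | down_everywhere


def filter_reference(reference, genes_to_drop):
    """Return a copy of a reference with the flagged genes removed."""
    return {
        cell_type: {g: v for g, v in profile.items() if g not in genes_to_drop}
        for cell_type, profile in reference.items()
    }


# Toy mean-expression profiles (invented values, for illustration only).
sc_means = {
    "adipocyte": {"GENE_A": 10.0, "GENE_B": 5.0, "GENE_C": 0.0},
    "macrophage": {"GENE_A": 8.0, "GENE_B": 1.0, "GENE_C": 0.0},
}
sn_means = {
    "adipocyte": {"GENE_A": 10.5, "GENE_B": 5.5, "GENE_C": 7.0},
    "macrophage": {"GENE_A": 7.0, "GENE_B": 9.0, "GENE_C": 3.0},
}

degs = consistent_cross_modality_degs(sc_means, sn_means)
sn_filtered = filter_reference(sn_means, degs)
print(sorted(degs))
```

In this toy data only GENE_C is elevated in snRNA-seq for both cell types, so it is the only gene removed; GENE_B differs in macrophages alone and is kept. A real workflow would replace the fold-change rule with a statistical DE test and then append the filtered snRNA-seq profiles to the scRNA-seq reference before deconvolution.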

Version published to 10.1101/2025.08.20.671333 on bioRxiv
Aug 24, 2025

DoseH-seq: A single-cell multiome platform to decode gene-dosage logic driving developmental reversion and cell fate reprogramming

This article has 25 authors:
1. Ying Yang
2. Ralph Patrick
3. Xiaoli Chen
4. Stacey Anderson
5. Jingyu Zhang
6. Yifei Huang
7. Mohammadhossein Esmaeili
8. Kanupriya Tiwari
9. Shivangi Wani
10. Monisha Ganesan
11. Hsin-Yi Chou
12. Dominique Power
13. Cassy M Spiller
14. Sas Loganathan
15. Solal Chauquet
16. Michael Piper
17. Majid Alhomrani
18. Walaa Alsanie
19. Sonia Shah
20. Josephine Bowles
21. Jessica C Mar
22. Shyuan T Ngo
23. Melanie D White
24. Marina Naval-Sanchez
25. Christian M Nefzger
This article has no evaluations. Latest version Dec 23, 2025

An integrated single-cell transcriptomic dataset for Mouse cortex

This article has 8 authors:
1. Xuefeng Shi
2. Zhihui Qi
3. Hong Huang
4. Zhiming Ye
5. YuMin Wu
6. Kahei Chan
7. Maojin Yao
8. Zhongxing Wang
This article has no evaluations. Latest version Dec 18, 2025

Comprehensive benchmarking of RNA velocity methods across single-cell datasets

This article has 6 authors:
1. Jin Liu
2. Yida Wu
3. Chuihan Kong
4. Xu Liao
5. Zhixiang Lin
6. Xiaobo Sun
This article has no evaluations. Latest version Feb 2, 2026
