Deconvolution of Sparse-count RNA Sequencing Data for Tumor Cells Using Embedded Negative Binomial Distributions
Abstract
Estimating tumor-specific transcript proportions from mixed bulk samples has the potential to reveal novel biology. However, the accuracy of existing methods on sparse-count data, such as microRNA-seq and spatial transcriptomics, has yet to be established. We generated a mixed small RNA benchmark dataset to demonstrate the analytical challenges that such data pose. To resolve them, we developed DeMixNB, a semi-reference-based deconvolution model that assumes the observed counts follow a sum of negative binomial distributions. Applications to miRNA-seq data from 856 patients with breast cancer and to 3,755 spatial spots from lung cancer generated clinical or mechanistic insights into tumor cell plasticity. These results support the utility of DeMixNB for investigating cancer RNomes.
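To make the "sum of negative binomial distributions" assumption concrete, the following is a minimal simulation sketch, not the DeMixNB implementation. It assumes, purely for illustration, that a mixed count is the sum of a tumor component and a non-tumor component, each negative binomial, with the component means scaled by a tumor proportion. The function names (`nb_draw`, `simulate_mixed_counts`), the parameter values, and the way the proportion enters the means are all hypothetical and do not come from the abstract.

```python
import numpy as np


def nb_draw(rng, mean, size_param, n_samples):
    """Draw negative binomial counts given a mean and a size (dispersion) parameter.

    NumPy parameterizes the distribution by (n, p) with mean n * (1 - p) / p,
    so p = size_param / (size_param + mean) yields the requested mean.
    """
    p = size_param / (size_param + mean)
    return rng.negative_binomial(size_param, p, size=n_samples)


def simulate_mixed_counts(rng, pi_tumor, mu_tumor, mu_normal,
                          size_tumor, size_normal, n_samples):
    """Simulate mixed counts as the sum of two negative binomial components.

    Assumption (illustrative only): the tumor proportion pi_tumor scales the
    tumor mean, and (1 - pi_tumor) scales the non-tumor mean.
    """
    tumor = nb_draw(rng, pi_tumor * mu_tumor, size_tumor, n_samples)
    normal = nb_draw(rng, (1.0 - pi_tumor) * mu_normal, size_normal, n_samples)
    return tumor + normal


rng = np.random.default_rng(0)
mixed = simulate_mixed_counts(
    rng, pi_tumor=0.6, mu_tumor=5.0, mu_normal=2.0,
    size_tumor=1.5, size_normal=1.5, n_samples=200_000,
)

# Under the assumptions above, the mean of the mixture is the weighted sum
# of the component means.
expected_mean = 0.6 * 5.0 + 0.4 * 2.0
print("empirical mean:", mixed.mean(), "expected mean:", expected_mean)
print("fraction of zero counts:", (mixed == 0).mean())
```

With low means and small size parameters, a noticeable share of the simulated counts is zero, which is the sparse-count regime the abstract refers to. Recovering the proportion from such data is the harder inverse problem and is not attempted in this sketch.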