Recovering undersampled single-cell transcriptomes with HyperCell

Abstract

Single-cell transcriptomic technology has matured to the point where mRNA transcripts from tens of thousands of genes can be quantified within a single cell. However, today's single-cell assays still capture and measure only a small fraction of these mRNAs. A typical human cell likely contains hundreds of thousands of mRNA copies, yet these assays miss the majority of the transcripts that are actually present. This undersampling introduces technical noise, in particular non-biological variability and excessive sparsity, which frustrates downstream analysis and can skew biological conclusions. To overcome these challenges, we develop HyperCell, a probabilistic deep learning approach that explicitly models this undersampling to estimate each cell's original transcript abundances across the whole transcriptome. We demonstrate that our framework offers benefits in various mRNA modeling settings by i) correctly differentiating between spurious sampling-induced zeros and real biological zeros, outperforming existing approaches, ii) estimating the total mRNA content of cells across states, iii) reducing contamination due to background transcripts, and iv) helping to counteract biases that may appear in typical differential gene expression analyses that rely on widespread normalization approaches. By correcting for the technical noise introduced by the single-cell experimental process, our approach brings us closer to studying biology starting from the true transcriptome of cells.
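To make the undersampling problem concrete, the following is a minimal sketch of how incomplete capture alone can turn expressed genes into observed zeros. It is not HyperCell's model: the independent per-transcript capture, the 5% capture rate, the gene count, and the function names `undersample` and `zero_fraction` are all assumptions chosen for illustration.

```python
import random


def undersample(true_counts, capture_rate, rng):
    """Keep each mRNA copy independently with probability capture_rate.

    This is simple binomial thinning, used here only as an illustrative
    stand-in for the capture step of a single-cell assay.
    """
    return [
        sum(rng.random() < capture_rate for _ in range(n))
        for n in true_counts
    ]


def zero_fraction(counts):
    """Fraction of genes with a count of exactly zero."""
    return sum(c == 0 for c in counts) / len(counts)


rng = random.Random(0)

# Hypothetical "true" cell: 2,000 genes, about 30% truly silent,
# the rest carrying between 1 and 50 mRNA copies.
true_counts = [
    0 if rng.random() < 0.3 else rng.randint(1, 50)
    for _ in range(2000)
]

# Assumed capture rate of 5%; real assays vary.
observed = undersample(true_counts, capture_rate=0.05, rng=rng)

# Biological zeros: the gene is not expressed at all.
biological_zeros = sum(t == 0 for t in true_counts)
# Sampling zeros: the gene is expressed, but no copy was captured.
sampling_zeros = sum(
    t > 0 and o == 0 for t, o in zip(true_counts, observed)
)

print(f"zero fraction, true cell:     {zero_fraction(true_counts):.2f}")
print(f"zero fraction, observed cell: {zero_fraction(observed):.2f}")
print(f"biological zeros: {biological_zeros}, sampling zeros: {sampling_zeros}")
```

In the observed counts the two kinds of zeros are indistinguishable without a model of the sampling process, which is the distinction point i) of the abstract refers to.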
