Extrinsic biological stochasticity and technical noise normalization of single-cell RNA sequencing data

Abstract

Because single-cell RNA sequencing (scRNA-seq) introduces technical noise, size factor normalization is commonly applied as a first step before data analysis. However, this scaling inherently affects the extrinsic (between-cell) variability of gene expression, which stems from both biological and technical factors. Building on previous models of biological and technical extrinsic noise, we propose a general extrinsic noise model for scRNA-seq that gives size factor normalization a theoretical basis and offers a framework for estimating both the biological and the technical components of extrinsic noise. We highlight the relationship between normalized gene expression covariance, extrinsic noise, and overdispersion, showing that extrinsic noise explains the baseline overdispersion commonly observed in scRNA-seq data. We validated the technical model by testing this relationship on data from pooled RNA. Interestingly, our model accurately describes mature mRNA counts but not nascent mRNA counts, suggesting that data derived from nascent transcripts need an alternative technical model. Using scRNA-seq data, we characterize biological and technical extrinsic noise, along with cell size factors estimated from Poisson-like genes. Overall, our model helps clarify common misconceptions and provides insight into the role of extrinsic noise and size factor normalization in scRNA-seq data.
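The link between extrinsic noise and baseline overdispersion can be illustrated with a small simulation. The sketch below is not the preprint's model or estimator; it rests on assumptions chosen only for illustration: a gamma-distributed per-cell scaling factor with mean 1, Poisson sampling around the scaled mean, size factors taken from total counts per cell (rather than from Poisson-like genes), and a moment-based overdispersion estimate `(variance - mean) / mean**2`. All function names are hypothetical.

```python
import numpy as np

def simulate_counts(n_cells=5000, n_genes=50, cv2_ext=0.25, seed=0):
    """Simulate a cells x genes count matrix with shared extrinsic noise.

    Assumption: each cell has one scaling factor drawn from a gamma
    distribution with mean 1 and squared coefficient of variation cv2_ext.
    """
    rng = np.random.default_rng(seed)
    shape = 1.0 / cv2_ext
    scale_per_cell = rng.gamma(shape, scale=1.0 / shape, size=n_cells)
    gene_means = rng.uniform(5.0, 50.0, size=n_genes)
    # Poisson sampling around the cell-scaled mean of each gene
    counts = rng.poisson(scale_per_cell[:, None] * gene_means[None, :])
    return counts, scale_per_cell

def total_count_size_factors(counts):
    """Size factors from total counts per cell, rescaled to mean 1."""
    totals = counts.sum(axis=1)
    return totals / totals.mean()

def overdispersion(values):
    """Per-gene moment estimate of (variance - mean) / mean^2."""
    mean = values.mean(axis=0)
    var = values.var(axis=0, ddof=1)
    return (var - mean) / mean**2

counts, true_scale = simulate_counts()
size_factors = total_count_size_factors(counts)
normalized = counts / size_factors[:, None]

print("median overdispersion, raw counts:       ",
      np.median(overdispersion(counts)))
print("median overdispersion, normalized counts:",
      np.median(overdispersion(normalized)))
```

Under these assumptions the raw counts follow a gamma-Poisson mixture, so the per-gene overdispersion sits near the squared coefficient of variation of the per-cell factor regardless of the gene's mean. Dividing by the size factors removes most of that shared component, although a small residual remains because scaling a Poisson count by a random factor does not return an exactly Poisson variable.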
