SoupX removes ambient RNA contamination from droplet-based single-cell RNA sequencing data

Abstract

Background

Droplet-based single-cell RNA sequence analyses assume that all acquired RNAs are endogenous to cells. However, any cell-free RNAs contained within the input solution are also captured by these assays. This sequencing of cell-free RNA constitutes a background contamination that confounds the biological interpretation of single-cell transcriptomic data.

Results

We demonstrate that contamination from this "soup" of cell-free RNAs is ubiquitous, with experiment-specific variations in composition and magnitude. We present a method, SoupX, for quantifying the extent of the contamination and estimating "background-corrected" cell expression profiles that integrate seamlessly with existing downstream analysis tools. Applying this method to several datasets generated with multiple droplet sequencing technologies, we show that it improves both the biological interpretation of otherwise misleading data and quality control metrics.
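
To make the idea of background correction concrete, the following is a minimal Python sketch, not SoupX's actual estimator. It assumes the ambient profile can be approximated by pooling counts from empty droplets and that the contamination fraction `rho` is already known; the function names `estimate_soup_profile` and `correct_counts`, the toy matrices, and the simple subtract-and-clip rule are all illustrative assumptions rather than part of the published method.

```python
import numpy as np


def estimate_soup_profile(empty_counts):
    """Pool counts over empty droplets and normalise to per-gene fractions.

    Hypothetical helper: assumes empty droplets contain only ambient RNA.
    """
    totals = empty_counts.sum(axis=0).astype(float)
    return totals / totals.sum()


def correct_counts(cell_counts, soup_profile, rho):
    """Subtract the expected ambient counts from each cell.

    Expected ambient counts per cell = rho * (total UMIs in cell) * soup fraction.
    Negative values are clipped to zero. This is a simplification chosen for
    illustration, not the procedure used by SoupX.
    """
    n_umis = cell_counts.sum(axis=1, keepdims=True)
    expected_soup = rho * n_umis * soup_profile
    return np.clip(cell_counts - expected_soup, 0.0, None)


# Toy data (made up): rows are droplets, columns are three genes.
empty = np.array([[8, 1, 1],
                  [12, 3, 0],
                  [20, 1, 4]])
cells = np.array([[10, 90, 0],
                  [20, 0, 80]])

soup = estimate_soup_profile(empty)             # per-gene ambient fractions
corrected = correct_counts(cells, soup, rho=0.1)  # rho assumed known here
print(soup)
print(corrected)
```

In this toy case the first gene dominates the ambient profile, so most of its counts in each cell are attributed to the background, while the genes that are truly expressed in each cell are barely changed.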

Conclusions

We present SoupX, a tool for removing ambient RNA contamination from droplet-based single-cell RNA sequencing experiments. This tool has broad applicability, and its application can improve the biological utility of existing and future datasets.

Article activity feed

  1. Now published in GigaScience doi: 10.1093/gigascience/giaa151

    Matthew D Young (1; for correspondence: my4@sanger.ac.uk), Sam Behjati (1, 2, 3)

    1 Wellcome Trust Sanger Institute, University of Cambridge
    2 Cambridge University Hospitals NHS Foundation Trust, University of Cambridge
    3 Department of Paediatrics, University of Cambridge

    A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giaa151), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

    These peer reviews were as follows:

    Reviewer 1: http://dx.doi.org/10.5524/REVIEW.102569
    Reviewer 2: http://dx.doi.org/10.5524/REVIEW.102570
    Reviewer 3: http://dx.doi.org/10.5524/REVIEW.102571