Evaluating stably expressed genes in single cells
This article has been reviewed by the following groups
Listed in
- Evaluated articles (GigaScience)
Abstract
Background
Single-cell RNA-seq (scRNA-seq) profiling has revealed remarkable variation in transcription, suggesting that expression of many genes at the single-cell level is intrinsically stochastic and noisy. Yet, at the cell population level, a subset of genes traditionally referred to as housekeeping genes (HKGs) are found to be stably expressed in different cell and tissue types. It is therefore critical to ask whether stably expressed genes (SEGs) can be identified at the single-cell level and, if so, how their expression stability can be assessed. We have previously proposed a computational framework for ranking expression stability of genes in single cells for scRNA-seq data normalization and integration. In this study, we perform detailed evaluation and characterization of SEGs derived from this framework.
Results
Here, we show that gene expression stability indices derived from the early human and mouse development scRNA-seq datasets and the "Mouse Atlas" dataset are reproducible and conserved across species. We demonstrate that SEGs identified from single cells based on their stability indices are considerably more stable than HKGs defined previously from cell populations across diverse biological systems. Our analyses indicate that SEGs are inherently more stable at the single-cell level and that their characteristics are reminiscent of HKGs, suggesting their potential role in sustaining essential functions in individual cells.
Conclusions
SEGs identified in this study have immediate utility both for understanding variation and stability of single-cell transcriptomes and for practical applications such as scRNA-seq data normalization. Our framework for calculating gene stability index, "scSEGIndex," is incorporated into the scMerge Bioconductor R package (https://sydneybiox.github.io/scMerge/reference/scSEGIndex.html) and can be used for identifying genes with stable expression in scRNA-seq datasets.
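To make the idea of ranking genes by a stability index concrete, here is a minimal toy sketch in Python. It is not the scSEGIndex algorithm and does not call the scMerge package. The two features used (fraction of zero counts and coefficient of variation), the equal weighting, and the names `rank_scores`, `toy_stability_index`, and the example genes are all assumptions made purely for illustration.

```python
from statistics import mean, pstdev


def rank_scores(values):
    """Map each value to a score in [0, 1]; the smallest value scores 1.0.

    Ties are broken by input order (a simplification for this sketch).
    """
    n = len(values)
    if n == 1:
        return [1.0]
    order = sorted(range(n), key=lambda i: values[i])
    scores = [0.0] * n
    for rank, i in enumerate(order):
        scores[i] = 1.0 - rank / (n - 1)
    return scores


def toy_stability_index(expr):
    """Return a toy stability index per gene (higher means more stable).

    `expr` maps a gene name to its expression values across cells.
    Assumed features: zero fraction and coefficient of variation.
    """
    genes = list(expr)
    zero_frac = [sum(v == 0 for v in expr[g]) / len(expr[g]) for g in genes]
    cv = []
    for g in genes:
        m = mean(expr[g])
        # A gene with no expression at all is treated as maximally unstable.
        cv.append(pstdev(expr[g]) / m if m > 0 else float("inf"))
    zero_scores = rank_scores(zero_frac)
    cv_scores = rank_scores(cv)
    # Equal-weight average of the two rank-based scores.
    return {g: (z + c) / 2 for g, z, c in zip(genes, zero_scores, cv_scores)}


# Hypothetical expression values for three genes across six cells.
cells = {
    "stable_gene": [5.0, 5.2, 4.9, 5.1, 5.0, 4.8],
    "noisy_gene": [0.0, 9.0, 0.0, 7.5, 0.0, 0.0],
    "marker_gene": [0.0, 0.0, 0.0, 6.0, 6.5, 6.2],
}
index = toy_stability_index(cells)
ranked = sorted(index, key=index.get, reverse=True)
print(ranked)
```

In this toy data the gene expressed at a near-constant level in every cell ranks first, while the gene that is mostly zero with occasional bursts ranks last. For real analyses, the scSEGIndex function referenced above should be used instead.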
Article activity feed
Now published in GigaScience doi: 10.1093/gigascience/giz106
Authors
- Yingxin Lin (1)
- Shila Ghazanfar (1, 2)
- Dario Strbenac (1)
- Andy Wang (1, 3)
- Ellis Patrick (1, 4)
- Dave Lin (5)
- Terence Speed (6, 7)
- Jean YH Yang (1, 2)
- Pengyi Yang (1, 2)
Affiliations
- (1) School of Mathematics and Statistics, University of Sydney, NSW 2006, Australia
- (2) Charles Perkins Centre, University of Sydney, NSW 2006, Australia
- (3) Sydney Medical School, University of Sydney, NSW 2006, Australia
- (4) Westmead Institute for Medical Research, University of Sydney, Westmead, NSW 2145, Australia
- (5) Department of Biomedical Sciences, Cornell University, Ithaca, NY, 14853, USA
- (6) Bioinformatics Division, Walter and Eliza Hall Institute of Medical Research, 1G Royal Parade, Parkville, VIC 3052, Australia
- (7) Department of Mathematics and Statistics, University of Melbourne, Melbourne, VIC 3010, Australia
A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giz106), where the paper and peer reviews are published openly under a CC-BY 4.0 license.
These peer reviews were as follows:
- Reviewer 1: http://dx.doi.org/10.5524/REVIEW.101910
- Reviewer 2: http://dx.doi.org/10.5524/REVIEW.101911