A knowledge-guided approach to recovering important rare signals from high-dimensional single-cell data
Abstract
Single-cell transcriptomic data are high-dimensional, with many genes profiled in each cell. Dimensionality reduction is routinely applied to improve interpretability, remove noise and redundancy, and enable visualization. Most existing methods aim to preserve the most prominent properties of the data, which can lead to the omission of rare but important signals. Here we propose a novel framework that uses knowledge-derived genes of interest to guide dimensionality reduction, which can help cluster rare cells and separate highly similar cell sub-populations. We demonstrate the utility of our framework in identifying endocrine cell subtypes in the pancreatic islet, highly similar hematopoietic sub-populations, and rare senescent cells.