Cell Type Weighted Dimensionality Reduction
Discuss this preprint
Start a discussion What are Sciety discussions?Listed in
This article is not in any list yet, why not save it to one of your lists.Abstract
High-dimensional single-cell technologies, such as flow cytometry and CITE-Seq, typically rely on established lineage markers to define cell identities. Additional markers are commonly analyzed within the context of these predefined cell types. Nonlinear projection methods such as t-SNE and UMAP provide a visual framework for this analysis by enabling the overlay of cell types and marker expression. However, these methods frequently produce projections where distinct cell types substantially overlap, hindering interpretation of marker expression patterns relative to known cell types.
In this study, we investigate the underlying causes of this phenomenon and demonstrate that such overlaps often stem from the inherent high-dimensional structure of the data rather than limitations in the dimensionality reduction algorithms themselves. To address this, we introduce Cell Type Weighted Dimensionality Reduction (CWDR), a novel approach that incorporates lineage-based information through a supervised weighting mechanism.
By integrating both cell identity and marker expression, CWDR preserves the visual separation between predefined cell types while maintaining the local variance necessary for downstream analysis. We validate our method across multiple high-dimensional flow cytometry and proteogenomic datasets. Our results show that CWDR significantly reduces inter-cluster overlap compared to traditional methods, providing a clearer framework for visualizing marker expression within the context of specific cell lineages.