Identifying similar populations across independent single cell studies without data integration

Oscar González-Velasco
Malte Simon
Rüstem Yilmaz
Rosanna Parlato
Jochen Weishaupt
Charles D Imbusch
Benedikt Brors

Read the full article

Discuss this preprint

Start a discussion What are Sciety discussions?

Listed in

This article is not in any list yet, why not save it to one of your lists.

Abstract

Supervised and unsupervised methods have emerged to address the complexity of single cell data analysis in the context of large pools of independent studies. Here, we present ClusterFoldSimilarity (CFS), a novel statistical method design to quantify the similarity between cell groups across any number of independent datasets, without the need for data correction or integration. By bypassing these processes, CFS avoids the introduction of artifacts and loss of information, offering a simple, efficient, and scalable solution. This method match groups of cells that exhibit conserved phenotypes across datasets, including different tissues and species, and in a multimodal scenario, including single-cell RNA-Seq, ATAC-Seq, single-cell proteomics, or, more broadly, data exhibiting differential abundance effects among groups of cells. Additionally, CFS performs feature selection, obtaining cross-dataset markers of the similar phenotypes observed, providing an inherent interpretability of relationships between cell populations. To showcase the effectiveness of our methodology, we generated single-nuclei RNA-Seq data from the motor cortex and spinal cord of adult mice. By using CFS, we identified three distinct sub-populations of astrocytes conserved on both tissues. CFS includes various visualization methods for the interpretation of the similarity scores and similar cell populations.

Version published to 10.1093/nargab/lqaf042
Mar 29, 2025
Version published to 10.1101/2024.09.27.615367 on bioRxiv
Sep 29, 2024

An integrated single-cell transcriptomic dataset for Mouse cortex

This article has 8 authors:
1. Xuefeng Shi
2. Zhihui Qi
3. Hong Huang
4. Zhiming Ye
5. YuMin Wu
6. Kahei Chan
7. Maojin Yao
8. Zhongxing Wang
This article has no evaluationsLatest version Dec 18, 2025
Discovering cell types and states from reference atlases with heterogeneous single-cell ATAC-seq features

This article has 2 authors:
1. Xiuwei Zhang
2. Yuqi Cheng
This article has no evaluationsLatest version Dec 10, 2025
Accurate, scalable, and unified single-cell atlas integration with scBIOT

This article has 2 authors:
1. Haihui Zhang
2. Peiwu Qin
This article has no evaluationsLatest version Jan 19, 2026

Discuss this preprint

Listed in

Abstract

Article activity feed

Related articles

An integrated single-cell transcriptomic dataset for Mouse cortex

Discovering cell types and states from reference atlases with heterogeneous single-cell ATAC-seq features

Accurate, scalable, and unified single-cell atlas integration with scBIOT