GraphHDBSCAN*: Graph-based Hierarchical Clustering on High Dimensional Single-cell RNA Sequencing Data


Abstract

Single-cell RNA sequencing (scRNA-seq) is widely used to resolve cellular heterogeneity across thousands to millions of cells. A major challenge is to identify biologically meaningful cell populations while preserving their hierarchical organization, because broad cell types frequently split into more specialized subtypes. However, state-of-the-art approaches mostly focus on flat partitions and ignore the hierarchical structure of single-cell data. Here we introduce GraphHDBSCAN*, a graph-based, hyperparameter-free extension of HDBSCAN* that performs hierarchical density-based clustering on a graph representation of the data, enabling robust recovery of both single-level and hierarchical relationships in high-dimensional and sparse datasets. We evaluate GraphHDBSCAN* across multiple scRNA-seq datasets and show that it recovers biologically meaningful hierarchies that reveal fine-grained structure in complex data, including monocyte subpopulations. In addition, the method yields high-quality flat partitions that outperform widely used community-detection methods.
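The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea of hierarchical density-based clustering restricted to a graph, not the authors' implementation. It assumes a k-nearest-neighbour graph with Euclidean distances, HDBSCAN*-style mutual reachability weights on the graph edges, and a single-linkage hierarchy read off a minimum spanning forest. All function names (`knn_graph`, `mutual_reachability_edges`, `single_linkage_merges`, `flat_clusters`), the parameter `k`, and the cut `threshold` are hypothetical and chosen for illustration; the toy 2-D points stand in for cells in a reduced expression space.

```python
import math

def knn_graph(points, k):
    """Map each point index to its k nearest neighbours as (distance, index) pairs."""
    neighbours = {}
    for i, p in enumerate(points):
        dists = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)
        neighbours[i] = dists[:k]
    return neighbours

def mutual_reachability_edges(neighbours):
    """Weight every graph edge by max(core(a), core(b), d(a, b)).

    The core distance of a point is taken to be the distance to its
    k-th neighbour (an assumption borrowed from HDBSCAN*).
    """
    core = {i: nbrs[-1][0] for i, nbrs in neighbours.items()}
    edges = set()
    for i, nbrs in neighbours.items():
        for d, j in nbrs:
            a, b = min(i, j), max(i, j)
            edges.add((max(core[a], core[b], d), a, b))
    return sorted(edges)

def _find(parent, x):
    """Union-find root lookup with path halving."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x

def single_linkage_merges(n, edges):
    """Kruskal's algorithm: the accepted edges, in increasing weight order,
    are the merge events of a single-linkage hierarchy over the graph."""
    parent = list(range(n))
    merges = []
    for w, a, b in edges:
        ra, rb = _find(parent, a), _find(parent, b)
        if ra != rb:
            parent[rb] = ra
            merges.append((w, a, b))
    return merges

def flat_clusters(n, merges, threshold):
    """Cut the hierarchy: apply only merges with weight <= threshold."""
    parent = list(range(n))
    for w, a, b in merges:
        if w <= threshold:
            parent[_find(parent, b)] = _find(parent, a)
    return [_find(parent, i) for i in range(n)]

# Toy data: two well-separated groups of four points each.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11)]
edges = mutual_reachability_edges(knn_graph(points, k=3))
merges = single_linkage_merges(len(points), edges)
labels = flat_clusters(len(points), merges, threshold=2.0)
```

With `k=3` every neighbour in this toy set lies inside its own group, so the graph has two connected components and the cut yields two flat clusters. A lower threshold applies fewer merges and returns a finer partition, which is how a single hierarchy can expose both broad groups and subgroups. A full HDBSCAN*-style method would additionally condense the hierarchy and select clusters by stability, which this sketch omits.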