Cellohood: multi-granular discovery of cellular neighborhoods with a permutation-invariant set transformer auto-encoder
Abstract
Discovering cellular neighborhoods and their roles in disease requires computational methods that consider the full breadth of data and offer multi-level granularity and interpretability. Here, we introduce Cellohood, a permutation-invariant set transformer auto-encoder equipped with a clinical association pipeline. Cellohood encodes full readouts for bags of spatially co-localized cells and supports multi-level analyses, providing interpretability by mapping latent dimensions to spatial tissue features. Our model surpasses current methods in accuracy of cellular neighborhood detection for spatial transcriptomics of the human cortex and CODEX spleen data measuring lupus progression. Applied to cancer data across three granularity levels, our method recovers neighborhoods along the expected immune-cold to immune-hot spectrum and further refines them into biologically and clinically meaningful subclasses, revealing spatial patterns linked to prognosis, histology, stage, and tumor mutational burden, and uncovering subgroups that transcend standard classifications. Overall, Cellohood enables in-depth analysis of complex tissues, revealing clinically informative spatial neighborhoods.
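To illustrate what permutation invariance means for a bag of spatially co-localized cells, the following minimal sketch pools a bag into a single latent vector using attention with one seed query. It is not the Cellohood implementation: the function `attention_pool`, the single-head design, the dimensions, and the random, untrained weights are all assumptions chosen for illustration only.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - x.max())
    return e / e.sum()


def attention_pool(cells, seed, w_k, w_v):
    """Pool a bag of cells, shape (n_cells, n_features), into one vector.

    Hypothetical single-head attention pooling: each cell receives a weight
    from its similarity to a seed query, and the output is the weighted sum
    of the projected cells. Because the result is a sum over cells, the
    order of the rows in `cells` does not matter.
    """
    keys = cells @ w_k                              # (n_cells, d_latent)
    values = cells @ w_v                            # (n_cells, d_latent)
    scores = keys @ seed / np.sqrt(seed.shape[0])   # (n_cells,)
    weights = softmax(scores)                       # sums to 1 over the bag
    return weights @ values                         # (d_latent,)


# Toy bag: 12 cells with 30 readout features each (illustrative sizes only).
rng = np.random.default_rng(0)
n_cells, n_features, d_latent = 12, 30, 8
bag = rng.normal(size=(n_cells, n_features))

# Random stand-ins for parameters that a real model would learn.
seed = rng.normal(size=d_latent)
w_k = rng.normal(size=(n_features, d_latent))
w_v = rng.normal(size=(n_features, d_latent))

z = attention_pool(bag, seed, w_k, w_v)
z_shuffled = attention_pool(bag[rng.permutation(n_cells)], seed, w_k, w_v)

# The embedding is the same (up to floating-point error) for any cell order.
invariant = np.allclose(z, z_shuffled)
```

In this sketch, `invariant` evaluates to true because shuffling the cells only reorders the terms of a weighted sum. An auto-encoder built on such a pooling step would add a decoder that reconstructs the bag from the pooled vector; that part is omitted here.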