Interpretable deep generative ensemble learning for single-cell omics with Hydra
Abstract
Single-cell omics enable the dissection of cellular heterogeneity, yet the high dimensionality, inherent noise, and sparsity of the data present significant analytical challenges. These challenges are amplified for rare cell populations, which are often difficult to annotate reliably but can be central to development and disease. As single-cell assays increasingly capture multiple molecular layers, the integrative analysis of such multimodal data adds further complexity. Here, we propose Hydra, a deep generative framework based on an ensemble of variational autoencoders for effective learning from unimodal and multimodal single-cell omics data. Hydra implements interpretable modules that capture cell type-specific molecular signatures. The ensemble of these interpretable modules enables reproducible feature selection and robust cell type annotation, and is particularly effective for rare populations. We benchmarked Hydra on 21 datasets, including unimodal and multimodal single-cell omics data. Our results demonstrate that Hydra performs comparably to, or better than, several state-of-the-art methods. Finally, we highlight the utility of Hydra in robustly annotating brain cellular subtypes and preserving disease-relevant signatures using our previously published Alzheimer’s disease dataset.