Scalable integration and prediction of unpaired single-cell and spatial multi-omics via regularized disentanglement
Abstract
Understanding cellular states urgently requires methods capable of integrating large-scale, heterogeneous single-cell and spatial omics data. However, these data are often completely unpaired because the assays are destructive, and they suffer from technical noise, variable feature coverage, and immense scale. We present scMRDR, a scalable computational framework that uses regularized disentangled representation learning to integrate multiple, completely unpaired single-cell omics datasets with heterogeneous resolutions and coverages. scMRDR overcomes common data-pairing requirements and computational bottlenecks by learning a unified, structure-preserving latent embedding that scales efficiently to large multi-omics datasets. This integrated representation further enables robust cross-modal translation, such as predicting chromatin accessibility from gene expression, and, critically, allows spatial coordinates to be imputed onto non-spatial single-cell modalities using a reference atlas. This spatial mapping provides the input required by spatially aware statistical models, enabling the identification of novel spatially variable genes and the dissection of epigenetic regulatory programs within their native tissue context.