Partially characterized topology guides reliable anchor-free scRNA-integration
Abstract
Single-cell RNA sequencing (scRNA-seq) is an important technique for obtaining biological insights at cellular resolution, and batch integration is a key step before downstream statistical analysis. Despite the plethora of proposed methods, achieving reliable batch correction while preserving the heterogeneous biological signals that define cell types remains a challenge, and the performance of existing methods varies significantly across scenarios and datasets. To address this, we propose scCRAFT, an autoencoder model designed to separate cell-type-related biological signals from batch effects for reliable multi-batch scRNA-seq integration. scCRAFT combines three loss components: a reconstruction loss that targets reconstruction of the observations, a multi-domain adaptation loss that aims to eliminate batch effects, and a novel dual-resolution triplet loss that preserves topology within each batch. The triplet loss is introduced to counteract the over-correction caused by the domain adaptation loss when cell distributions are heterogeneous across batches. We show that scCRAFT effectively handles unbalanced batches, rare cell types, and batch-specific cell phenotypes in simulations, and surpasses state-of-the-art methods on a diverse set of real datasets.
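
To make the idea of a within-batch triplet loss concrete, the following is a minimal sketch and not the scCRAFT implementation. It assumes, for illustration only, that each cell carries a batch label plus two hypothetical cluster labels at different resolutions (`fine` and `coarse`), that positives share the anchor's fine cluster, and that negatives lie in a different coarse cluster of the same batch. The function names, the Euclidean distance, and the margin value are all assumptions that do not come from the abstract.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard hinge-style triplet margin loss for a single triple."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)

def within_batch_triplets(batch, fine, coarse):
    """Enumerate (anchor, positive, negative) index triples inside one batch.

    Assumed rule (hypothetical): the positive shares the anchor's fine
    cluster, the negative sits in a different coarse cluster, and all
    three cells come from the same batch.
    """
    triples = []
    n = len(batch)
    for a in range(n):
        for p in range(n):
            if p == a or batch[p] != batch[a] or fine[p] != fine[a]:
                continue
            for q in range(n):
                if batch[q] == batch[a] and coarse[q] != coarse[a]:
                    triples.append((a, p, q))
    return triples

def mean_triplet_loss(z, batch, fine, coarse, margin=1.0):
    """Average the triplet loss over all enumerated within-batch triples."""
    triples = within_batch_triplets(batch, fine, coarse)
    if not triples:
        return 0.0
    losses = [triplet_loss(z[a], z[p], z[q], margin) for a, p, q in triples]
    return sum(losses) / len(losses)

# Toy embeddings: three cells in batch 0, one cell in batch 1.
z = [(0.0, 0.0), (0.0, 1.0), (5.0, 0.0), (0.0, 0.5)]
batch = [0, 0, 0, 1]
fine = [0, 0, 1, 0]
coarse = [0, 0, 1, 0]
triples = within_batch_triplets(batch, fine, coarse)
```

Because triples never cross batches, a loss of this kind constrains only the relative geometry inside each batch, which is consistent with the abstract's description of it as a counterweight to a domain adaptation term that pulls batches together. In a real model the triples would typically be sampled rather than enumerated, since exhaustive enumeration grows cubically with the number of cells.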