ECLARE: multi-teacher contrastive learning via ensemble distillation for diagonal integration of single-cell multi-omic data
Abstract
Integrating multimodal single-cell data, such as scRNA-seq and scATAC-seq, is key for decoding gene regulatory networks but remains challenging due to issues such as feature harmonization and the limited quantity of paired data. To address these challenges, we introduce ECLARE, a novel framework combining multi-teacher ensemble knowledge distillation with contrastive learning for diagonal integration of single-cell multi-omic data. ECLARE trains teacher models on paired datasets to guide a student model for unpaired data, leveraging a refined contrastive objective and a transport-based loss for precise cross-modality alignment. Experiments demonstrate ECLARE's competitive performance in cell pairing accuracy, multimodal integration, and biological structure preservation, indicating that multi-teacher knowledge distillation provides an effective means to improve a diagonal integration model beyond its zero-shot capabilities. Additionally, we validate ECLARE's applicability through a case study on major depressive disorder (MDD) data, illustrating its capability to reveal gene regulatory insights from unpaired nuclei. While current results highlight the potential of ensemble distillation in multi-omic analyses, future work will focus on optimizing model complexity, improving dataset scalability, and exploring applications in diverse multi-omic contexts. ECLARE establishes a robust foundation for biologically informed single-cell data integration, facilitating advanced downstream analyses and scaling multi-omic data for training machine learning models.
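To make the training scheme described above more concrete, the sketch below illustrates the general idea of multi-teacher ensemble distillation with contrastive cross-modality alignment in PyTorch. It is not the authors' implementation: the `ModalityEncoder` architecture, latent dimensions, temperatures, and the plain averaging of teacher similarities are illustrative assumptions, and ECLARE's refined contrastive objective and transport-based loss are not reproduced here.

```python
# Minimal sketch (assumed, not ECLARE's actual code): teachers trained on paired
# RNA/ATAC data provide soft cross-modal similarity targets that a student model
# matches on unpaired data.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ModalityEncoder(nn.Module):
    """Projects one modality (e.g. gene or peak counts) into a shared latent space."""
    def __init__(self, n_features: int, latent_dim: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, 512), nn.ReLU(),
            nn.Linear(512, latent_dim),
        )

    def forward(self, x):
        return F.normalize(self.net(x), dim=-1)  # unit-norm embeddings

def contrastive_loss(z_rna, z_atac, temperature=0.1):
    """Symmetric InfoNCE on a *paired* batch: matching cells are positives.
    Used here to stand in for the contrastive objective when training teachers."""
    logits = z_rna @ z_atac.T / temperature
    targets = torch.arange(z_rna.size(0), device=z_rna.device)
    return 0.5 * (F.cross_entropy(logits, targets)
                  + F.cross_entropy(logits.T, targets))

def ensemble_distillation_loss(student_logits, teacher_logits, temperature=0.1):
    """KL divergence between the student's cross-modal similarity distribution
    and the ensemble-averaged teacher distribution (soft targets, unpaired batch)."""
    teacher_probs = torch.stack(
        [F.softmax(t / temperature, dim=-1) for t in teacher_logits]
    ).mean(dim=0)
    return F.kl_div(F.log_softmax(student_logits / temperature, dim=-1),
                    teacher_probs, reduction="batchmean")

# Toy usage with random data; two teachers assumed already trained on paired data.
n_cells, n_genes, n_peaks = 32, 2000, 5000
rna = torch.randn(n_cells, n_genes)    # unpaired scRNA-seq batch
atac = torch.randn(n_cells, n_peaks)   # unpaired scATAC-seq batch

teachers = [(ModalityEncoder(n_genes), ModalityEncoder(n_peaks)) for _ in range(2)]
student = (ModalityEncoder(n_genes), ModalityEncoder(n_peaks))

with torch.no_grad():  # teachers are frozen during distillation
    teacher_logits = [enc_r(rna) @ enc_a(atac).T for enc_r, enc_a in teachers]

student_logits = student[0](rna) @ student[1](atac).T
loss = ensemble_distillation_loss(student_logits, teacher_logits)
loss.backward()  # gradients flow only into the student's parameters
```

In this hedged reading, the contrastive loss supervises the teachers where ground-truth cell pairings exist, while the student, which never sees paired cells, learns cross-modality alignment purely from the ensemble's soft similarity targets.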