Multi-modal Diffusion Model with Dual-Cross-Attention for Multi-Omics Data Generation and Translation


Abstract

Single-cell multi-omics technologies offer unprecedented opportunities to decipher complex cellular mechanisms. To overcome experimental limitations in scale, cost, and coverage, powerful computational methods are essential for integrating diverse data modalities and generating high-fidelity in-silico data. In this paper, we present scDiffusion-X, a latent diffusion model specifically designed for this purpose. The core innovation is a Dual-Cross-Attention (DCA) module that adaptively learns intricate, hidden relationships between different molecular modalities, offering a more flexible and interpretable approach than traditional integration strategies. Extensive benchmarking experiments demonstrate that scDiffusion-X excels at generating realistic multi-omics data, preserving cellular heterogeneity and global data structures while scaling well. Unlike existing multi-omics simulators, scDiffusion-X supports high-fidelity modality translation, predicting one molecular modality from another, and provides robust uncertainty quantification. Beyond data generation, we designed a gradient-based interpretation framework that turns the DCA module into a discovery tool, enabling inference of comprehensive, cell-type-specific, heterogeneous gene regulatory networks (GRNs). By integrating state-of-the-art generative modeling with deep biological interpretability, scDiffusion-X serves as a powerful tool for dissecting regulatory relationships and predicting perturbation responses, and is poised to accelerate discovery in single-cell multi-omics research.
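
To make the idea of a dual cross-attention step concrete, the following is a minimal NumPy sketch of generic bidirectional cross-attention between two sets of latent tokens. It is an illustration under stated assumptions, not the scDiffusion-X implementation: the abstract does not specify the architecture, so the function names (cross_attention, dual_cross_attention), the single-head design, the residual update, the token shapes, and the random weights are all hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention(queries, context, w_q, w_k, w_v):
    # Single-head scaled dot-product attention in which tokens of one
    # modality (queries) attend to tokens of the other (context).
    q = queries @ w_q
    k = context @ w_k
    v = context @ w_v
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = softmax(scores, axis=-1)  # shape: (n_queries, n_context)
    return weights @ v, weights


def dual_cross_attention(tokens_a, tokens_b, params):
    # Hypothetical dual step: each modality is updated with information
    # attended from the other, using separate weights per direction.
    upd_a, attn_a_to_b = cross_attention(tokens_a, tokens_b, *params["a_from_b"])
    upd_b, attn_b_to_a = cross_attention(tokens_b, tokens_a, *params["b_from_a"])
    # Residual connections are an assumption made for this sketch.
    return tokens_a + upd_a, tokens_b + upd_b, attn_a_to_b, attn_b_to_a


# Toy usage: 5 latent tokens for modality A, 7 for modality B, width 8.
rng = np.random.default_rng(0)
d = 8
tokens_a = rng.normal(size=(5, d))
tokens_b = rng.normal(size=(7, d))
params = {
    name: tuple(rng.normal(size=(d, d)) * 0.1 for _ in range(3))
    for name in ("a_from_b", "b_from_a")
}
out_a, out_b, attn_a_to_b, attn_b_to_a = dual_cross_attention(
    tokens_a, tokens_b, params
)
print(out_a.shape, out_b.shape, attn_a_to_b.shape, attn_b_to_a.shape)
```

In a sketch like this, the two attention matrices are the natural place to look for cross-modality relationships, since each row records how strongly one modality's token draws on the other modality's tokens. How scDiffusion-X actually derives regulatory links from its DCA module is described only as "gradient-based" in the abstract and is not reproduced here.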
