A Multimodal Graph Learning Framework for Versatile Spatial Transcriptomics Analysis with SpatialModal
Abstract
The development of spatial transcriptomics (ST) technologies enables the exploration of tissue structure and cellular function within a spatial context. However, mainstream analytical methods focus primarily on the gene expression modality itself and struggle to fully leverage auxiliary information from histological images, which limits the precise dissection of complex biological structures. To address this, we propose SpatialModal, a multimodal graph learning framework that effectively integrates gene expression data and histological images from spatial transcriptomics to construct unified spatial representations. SpatialModal captures synergistic interactions between modalities by jointly modeling intra- and inter-modal complementary features while incorporating spatial adjacency information. Furthermore, it employs a dual contrastive learning strategy to enhance the discriminative power of the representations, thereby enabling efficient and robust analysis of tissue structures. We validate the effectiveness of this approach on multiple public datasets covering diverse tissue types, species, and resolutions. Experiments demonstrate that SpatialModal exhibits significant advantages in downstream tasks, including spatial domain identification, gene expression reconstruction, and pseudotime inference, and that it accurately elucidates the hierarchical structure of the mouse cortex as well as the metabolic-immune dynamic boundaries in the breast cancer microenvironment. Additionally, SpatialModal shows exceptional robustness in single-cell resolution and multi-slice integration tasks, providing an efficient and broadly applicable analytical tool for spatial transcriptomics research.
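The core ingredients named above — a spatial adjacency graph over spots, graph-based smoothing of each modality's features, and a cross-modal contrastive objective — can be sketched minimally in NumPy. This is an illustrative assumption of how such components are commonly built, not the paper's actual implementation; all function names (`knn_spatial_graph`, `gcn_layer`, `cross_modal_infonce`) and the plain InfoNCE loss form are hypothetical.

```python
import numpy as np

def knn_spatial_graph(coords, k=3):
    """Symmetric binary adjacency: each spot links to its k nearest spatial neighbours.
    (Illustrative construction; the paper's graph may differ.)"""
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)            # exclude self from the neighbour search
    nbrs = np.argsort(d, axis=1)[:, :k]
    A = np.zeros((len(coords), len(coords)))
    rows = np.repeat(np.arange(len(coords)), k)
    A[rows, nbrs.ravel()] = 1.0
    return np.maximum(A, A.T)              # symmetrise the directed k-NN edges

def gcn_layer(A, X):
    """One graph-convolution step: symmetric-normalised neighbourhood averaging,
    applied per modality to inject spatial adjacency into the features."""
    A_hat = A + np.eye(len(A))             # add self-loops
    d_inv_sqrt = np.diag(1.0 / np.sqrt(A_hat.sum(axis=1)))
    return d_inv_sqrt @ A_hat @ d_inv_sqrt @ X

def cross_modal_infonce(Z_expr, Z_img, tau=0.5):
    """InfoNCE-style contrastive loss: the two modality embeddings of the same
    spot are positives, all other spots are negatives."""
    Z_expr = Z_expr / np.linalg.norm(Z_expr, axis=1, keepdims=True)
    Z_img = Z_img / np.linalg.norm(Z_img, axis=1, keepdims=True)
    logits = Z_expr @ Z_img.T / tau
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))     # maximise same-spot agreement

# Toy usage on random spot coordinates and features.
rng = np.random.default_rng(0)
coords = rng.random((10, 2))               # spot positions on the slide
expr = rng.random((10, 5))                 # gene-expression embedding per spot
img = rng.random((10, 5))                  # histology-image embedding per spot
A = knn_spatial_graph(coords, k=3)
loss = cross_modal_infonce(gcn_layer(A, expr), gcn_layer(A, img))
```

In a full framework, the two `gcn_layer` branches would be learned encoders and this cross-modal term would be one half of the dual contrastive strategy (for example, paired with an intra-modal term over spatially adjacent spots).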