GatorST: A Versatile Contrastive Meta-Learning Framework for Spatial Transcriptomic Data Analysis
Abstract
Introduction
Recent advances in spatial transcriptomics (ST) technologies have revolutionized our understanding of cellular functions by providing gene expression profiles with rich spatial context. Effectively learning spatial representations is crucial for downstream analyses and requires robust integration of spatial information with transcriptomic data. While existing methods have shown promise, they often fail to adequately capture both local (neighbor-level) and global (tissue-wide) spatial contexts. Moreover, they tend to rely heavily on augmentation strategies, which can introduce noise and instability.
Objectives
This study introduces GatorST, a versatile framework that combines graph-based modeling with pseudo-label-guided contrastive learning and meta-learning-inspired episodic training to generate spatially informed representations of ST data. GatorST is designed to improve a range of downstream tasks, including spatial domain identification, gene expression imputation, batch effect removal, and trajectory inference.
Methods
GatorST constructs a spot-spot graph by connecting each node to its k nearest spatial neighbors and extracts two-hop neighborhood subgraphs to capture local context. At the global level, gene expression profiles are clustered using soft K-means to generate pseudo-labels, which serve as weak supervision signals within a contrastive learning framework. This process encourages the alignment of embeddings with shared pseudo-labels while separating those with different labels. GatorST further adopts an episodic training strategy inspired by meta-learning, wherein each episode consists of a support set for contrastive optimization and a disjoint query set for embedding classification, guided by the pseudo-labeled data. This design enables the model to classify unseen samples based on learned embeddings, thereby enhancing its generalization to new spatial contexts.
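The pipeline described above can be illustrated with a minimal NumPy sketch. It is not the GatorST implementation: the function names, the symmetrized adjacency, the stiffness parameter `beta`, the temperature `tau`, the supervised-contrastive-style loss, and the nearest-prototype query classifier are all assumptions chosen for illustration, since the abstract does not specify these details. The sketch also operates on plain arrays and omits the graph encoder that would produce the embeddings.

```python
import numpy as np

def knn_graph(coords, k):
    """Boolean adjacency linking each spot to its k nearest spatial neighbors."""
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    np.fill_diagonal(dist, np.inf)                # a spot is not its own neighbor
    nbrs = np.argsort(dist, axis=1)[:, :k]
    adj = np.zeros(dist.shape, dtype=bool)
    adj[np.repeat(np.arange(len(coords)), k), nbrs.ravel()] = True
    return adj | adj.T                            # undirected edges (assumption)

def two_hop_nodes(adj, i):
    """Indices of spot i and every spot reachable within two hops."""
    one_hop = adj[i]
    two_hop = adj[one_hop].any(axis=0)            # neighbors of the neighbors
    mask = one_hop | two_hop
    mask[i] = True
    return np.flatnonzero(mask)

def soft_kmeans(x, n_clusters, beta=1.0, n_iter=50, seed=0):
    """Soft K-means; returns hard pseudo-labels and soft responsibilities."""
    rng = np.random.default_rng(seed)
    centers = x[rng.choice(len(x), n_clusters, replace=False)]
    for _ in range(n_iter):
        d2 = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        logits = -beta * d2
        logits -= logits.max(axis=1, keepdims=True)   # numerical stability
        resp = np.exp(logits)
        resp /= resp.sum(axis=1, keepdims=True)
        centers = (resp.T @ x) / (resp.sum(axis=0)[:, None] + 1e-12)
    return resp.argmax(axis=1), resp

def pseudo_label_contrastive_loss(z, labels, tau=0.5):
    """Supervised-contrastive-style loss with pseudo-labels as weak supervision."""
    z = z / np.linalg.norm(z, axis=1, keepdims=True)
    not_self = ~np.eye(len(z), dtype=bool)
    sim = np.where(not_self, z @ z.T / tau, -np.inf)  # mask self-similarity
    m = sim.max(axis=1, keepdims=True)
    log_prob = sim - (m + np.log(np.exp(sim - m).sum(axis=1, keepdims=True)))
    pos = (labels[:, None] == labels[None, :]) & not_self
    n_pos = pos.sum(axis=1)
    has_pos = n_pos > 0
    if not has_pos.any():
        return 0.0                                # no positive pairs in this batch
    per_anchor = np.where(pos, log_prob, 0.0).sum(axis=1)[has_pos] / n_pos[has_pos]
    return float(-per_anchor.mean())

def sample_episode(n_spots, n_support, n_query, rng):
    """Disjoint support and query index sets for one training episode."""
    idx = rng.permutation(n_spots)
    return idx[:n_support], idx[n_support:n_support + n_query]

def classify_query(z_support, y_support, z_query):
    """Assign each query embedding to the nearest support-class prototype."""
    classes = np.unique(y_support)
    protos = np.stack([z_support[y_support == c].mean(axis=0) for c in classes])
    d2 = ((z_query[:, None, :] - protos[None, :, :]) ** 2).sum(axis=-1)
    return classes[d2.argmin(axis=1)]
```

In one episode under these assumptions, the contrastive loss would be evaluated on the embeddings of the support spots and `classify_query` would score the query spots against their pseudo-labels. In practice the clustering would more plausibly run on a reduced representation of the expression matrix than on raw counts, which is likewise an assumption here.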
Results
Comprehensive comparisons with fifteen state-of-the-art methods across fourteen spatial transcriptomics datasets demonstrate that GatorST consistently achieves superior performance in identifying spatial domains, imputing gene expression, and removing batch effects. The results showcase the versatility and strong generalization capabilities of GatorST across diverse tissue types and experimental settings.
Conclusion
GatorST effectively integrates spatial topology and global gene expression through graph-based modeling, pseudo-labeling, and contrastive meta-learning. This framework generates biologically meaningful representations and significantly improves key downstream tasks, including spatial domain identification, gene expression imputation, batch effect removal, and trajectory inference.