scGPT-spatial: Continual Pretraining of Single-Cell Foundation Model for Spatial Transcriptomics

Chloe Wang
Haotian Cui
Andrew Zhang
Ronald Xie
Hani Goodarzi
Bo Wang

Abstract

Spatial transcriptomics has emerged as a pivotal technology for profiling gene expression of cells within their spatial context. The rapid growth of publicly available spatial data presents an opportunity to further our understanding of the microenvironments that drive cell fate decisions and disease progression. However, existing foundation models, largely pretrained on single-cell RNA sequencing (scRNA-seq) data, fail to resolve the spatial relationships among samples or capture the distinct distributions produced by different sequencing protocols. We introduce scGPT-spatial, a specialized foundation model for spatial transcriptomics, continually pretrained on our previously published scGPT scRNA-seq foundation model. We also curate SpatialHuman30M, a comprehensive spatial transcriptomics dataset comprising 30 million spatial transcriptomic profiles and encompassing both imaging- and sequencing-based protocols. To facilitate integration, scGPT-spatial introduces a novel MoE (Mixture of Experts) decoder that adaptively routes samples for protocol-aware decoding of gene expression profiles. Moreover, scGPT-spatial employs a spatially-aware sampling strategy and a novel neighborhood-based training objective to better capture spatial co-localization patterns among cell states within tissue. Empirical evaluations demonstrate that scGPT-spatial robustly integrates spatial data in multi-slide and multi-modal settings, and effectively supports cell-type deconvolution and contextualized missing gene expression imputation, outperforming many existing methods. The scGPT-spatial codebase is publicly available at https://github.com/bowang-lab/scGPT-spatial.
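
The idea of a decoder that "adaptively routes samples" can be illustrated with a toy soft-routed mixture of experts. The sketch below is not the scGPT-spatial implementation: the class name `ToyMoEDecoder`, the single-linear-layer experts, the softmax gate, and all dimensions are assumptions chosen for illustration. It only shows the general pattern in which a gating network turns a cell embedding into per-expert weights and the decoded expression is the weighted blend of the expert outputs.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vector):
    """Multiply a (rows x cols) nested-list matrix by a length-cols vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class ToyMoEDecoder:
    """Hypothetical soft-routed mixture-of-experts decoder (illustration only)."""

    def __init__(self, embed_dim, n_genes, n_experts, seed=0):
        rng = random.Random(seed)

        def init(rows, cols):
            # Small random weights; a real model would learn these.
            return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]

        # Gating network: produces one logit per expert.
        self.gate = init(n_experts, embed_dim)
        # Each expert is a single linear map from embedding space to gene space.
        self.experts = [init(n_genes, embed_dim) for _ in range(n_experts)]

    def route(self, embedding):
        """Return routing weights, one per expert, summing to 1."""
        return softmax(matvec(self.gate, embedding))

    def decode(self, embedding):
        """Blend the expert predictions using the routing weights."""
        weights = self.route(embedding)
        outputs = [matvec(expert, embedding) for expert in self.experts]
        n_genes = len(outputs[0])
        return [
            sum(w * out[g] for w, out in zip(weights, outputs))
            for g in range(n_genes)
        ]


# Assumed toy sizes: a 4-dimensional cell embedding, 3 genes, 2 experts.
decoder = ToyMoEDecoder(embed_dim=4, n_genes=3, n_experts=2)
cell_embedding = [0.5, -1.0, 0.25, 2.0]
routing_weights = decoder.route(cell_embedding)
predicted_expression = decoder.decode(cell_embedding)
```

In a protocol-aware setting, one would expect profiles from different protocols to receive different routing weights after training, so that individual experts specialize. Whether the published model uses soft blending as above or hard top-k routing is not stated in the abstract.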

Version published to 10.1101/2025.02.05.636714 on bioRxiv
Feb 8, 2025

Related articles

Microenvironment-aware transcriptome reconstruction in spatial transcriptomics

This article has 7 authors:
1. Shi-Tong Yang
2. Pai Peng
3. Hui-Feng He
4. Meng-Guo Wang
5. Bo-Han Si
6. Xiao-Fei Zhang
7. Luonan Chen
This article has no evaluations. Latest version: Jan 13, 2026

Discovering cell types and states from reference atlases with heterogeneous single-cell ATAC-seq features

This article has 2 authors:
1. Xiuwei Zhang
2. Yuqi Cheng
This article has no evaluations. Latest version: Dec 10, 2025

An integrated single-cell transcriptomic dataset for Mouse cortex

This article has 8 authors:
1. Xuefeng Shi
2. Zhihui Qi
3. Hong Huang
4. Zhiming Ye
5. YuMin Wu
6. Kahei Chan
7. Maojin Yao
8. Zhongxing Wang
This article has no evaluations. Latest version: Dec 18, 2025
