Transcriptome Graph Transformer: A Graph Transformer-Based Unsupervised Model for Transcriptome Data Analysis

Abstract

Background: Rapidly growing transcriptomic datasets pose challenges for traditional analytical methods, which struggle with high dimensionality, heterogeneity, and nonlinear gene relationships. Existing deep learning models often require fixed-length inputs and fail to integrate biological network information.

Methods: We introduce Transcriptome Graph Transformer (TGT), an unsupervised graph Transformer framework that constructs a heterogeneous gene-pathway graph from expression data, STRING interactions, and GO/KEGG/Reactome pathway annotations. TGT is pretrained with a masked-node prediction task and fine-tuned for disease classification, biomarker discovery, and zero-shot clustering of single-cell and spatial transcriptomics.

Results: TGT outperforms state-of-the-art baselines on Alzheimer's disease, cancer, acute kidney injury, and COVID-19 datasets. The model generalizes well across platforms and yields biologically meaningful gene and pathway importance scores consistent with known disease mechanisms.

Conclusion: TGT provides an effective and generalizable approach for transcriptomic representation learning by integrating biological network knowledge with graph Transformer architectures. Its strong performance highlights its utility for broad transcriptomic applications and precision medicine.
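To make the pretraining idea concrete, the snippet below is a minimal, purely illustrative sketch of masked-node prediction on a small synthetic gene-pathway graph, written in plain PyTorch. It is not the authors' implementation: the node counts, embedding dimension, random adjacency, the MaskedNodeModel class, and the loss setup are all invented for illustration, and the graph attention is approximated by simply restricting Transformer attention to graph edges.

```python
# Illustrative sketch only: masked-node pretraining on a tiny synthetic
# gene-pathway graph. All names, sizes, and data here are placeholders,
# not the TGT implementation described in the paper.
import torch
import torch.nn as nn

torch.manual_seed(0)

n_genes, n_pathways, d_model = 32, 8, 64
n_nodes = n_genes + n_pathways

# Synthetic inputs: one expression value per gene node; pathway nodes get
# learned embeddings. Edges would normally come from STRING and pathway
# membership; here they are random placeholders.
expression = torch.randn(n_genes, 1)
adjacency = (torch.rand(n_nodes, n_nodes) < 0.1).float()
adjacency = ((adjacency + adjacency.T) > 0).float()  # symmetrize

class MaskedNodeModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.gene_proj = nn.Linear(1, d_model)             # embed expression values
        self.pathway_emb = nn.Embedding(n_pathways, d_model)
        self.mask_token = nn.Parameter(torch.zeros(d_model))
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, 1)                  # reconstruct expression

    def forward(self, expr, mask_idx):
        gene_x = self.gene_proj(expr)                      # (n_genes, d_model)
        gene_x[mask_idx] = self.mask_token                 # hide masked gene nodes
        path_x = self.pathway_emb.weight                   # (n_pathways, d_model)
        x = torch.cat([gene_x, path_x], dim=0).unsqueeze(0)
        # Inject graph topology by allowing attention only along edges
        # (plus self-loops); True entries are blocked from attending.
        attn_mask = (adjacency + torch.eye(n_nodes)) == 0
        h = self.encoder(x, mask=attn_mask).squeeze(0)
        return self.head(h[:n_genes])                      # predicted expression

model = MaskedNodeModel()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

for step in range(5):                                      # toy training loop
    mask_idx = torch.randperm(n_genes)[: n_genes // 5]     # mask ~20% of genes
    pred = model(expression, mask_idx)
    loss = nn.functional.mse_loss(pred[mask_idx], expression[mask_idx])
    opt.zero_grad()
    loss.backward()
    opt.step()
    print(f"step {step}: masked-node MSE = {loss.item():.4f}")
```

After pretraining in this fashion, the encoder's node representations could be reused for downstream tasks such as disease classification or clustering, which is the general pattern the abstract describes for fine-tuning and zero-shot use.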
