GeneJepa: A Predictive World Model of the Transcriptome

Elon Litman
Tyler Myers
Vinayak Agarwal
Ekansh Mittal
Orion Li
Ashwin Gopinath
Timothy Kassis

Abstract

We introduce GeneJepa, a self-supervised foundation model that learns a predictive world model of single-cell transcriptomes. Based on the Joint-Embedding Predictive Architecture, GeneJepa predicts latent representations of masked gene sets from visible context, a shift away from reconstructing noisy expression values and toward world-model style inference over cellular state. To realize this at scale, a Perceiver encoder handles variable gene sets at fixed cost, and a tokenizer jointly represents gene identity and continuous expression using Fourier features. Trained on the Tahoe-100M atlas, GeneJepa learns general representations that transfer across tissues and datasets. On downstream tasks, including drug response and perturbation prediction, it surpasses strong baselines and enables test-time scaling by progressively enlarging the cross-attention over the gene set, trading a small read cost for higher accuracy at inference. GeneJepa moves toward foundation models that reason over gene–gene relations, enabling applications in annotation, prediction, and in-silico discovery.
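
The tokenizer idea in the abstract (gene identity combined with Fourier features of a continuous expression value) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `fourier_features` and `tokenize`, the `log1p` transform, the power-of-two frequency bands, the embedding size, and the choice to concatenate rather than sum the two parts are all assumptions made for illustration.

```python
import numpy as np

def fourier_features(x, num_bands=4):
    """Map scalar values to sin/cos features at geometrically spaced frequencies.

    Assumption: frequencies are pi * 2**k for k = 0..num_bands-1.
    """
    x = np.asarray(x, dtype=np.float64)[..., None]   # shape (..., 1)
    freqs = np.pi * 2.0 ** np.arange(num_bands)      # shape (num_bands,)
    angles = x * freqs                               # shape (..., num_bands)
    return np.concatenate([np.sin(angles), np.cos(angles)], axis=-1)

def tokenize(gene_ids, expression, gene_embeddings, num_bands=4):
    """Build one token per gene: [identity embedding | Fourier(expression)].

    Assumption: expression is log1p-transformed before encoding, and the
    two parts are concatenated (a real model might sum or project them).
    """
    identity = gene_embeddings[np.asarray(gene_ids)]            # (n_genes, d)
    value = fourier_features(np.log1p(expression), num_bands)   # (n_genes, 2 * num_bands)
    return np.concatenate([identity, value], axis=-1)

# Hypothetical vocabulary of 10 genes with 8-dimensional identity embeddings.
rng = np.random.default_rng(0)
gene_embeddings = rng.normal(size=(10, 8))

# Three observed genes with their expression values; cells with different
# numbers of observed genes would simply yield different numbers of tokens.
tokens = tokenize([1, 3, 7], [0.0, 2.5, 10.0], gene_embeddings)
print(tokens.shape)
```

Because each gene becomes its own token, a cell is represented as a variable-length set, which is the kind of input a Perceiver-style encoder can read through cross-attention into a fixed number of latents.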

Version published to 10.1101/2025.10.14.682378 on bioRxiv
Oct 15, 2025

