Explainable Prototype Booster: Enhancing Latent Representations of Foundation Models for Gene Expression Prediction

Chaoyi Li
Quan Nguyen

Abstract

Spatial transcriptomics (ST) is a cutting-edge technology that measures gene expression while preserving spatial context and generating pathology-grade tissue images. Although ST has enabled numerous discoveries and demonstrated considerable potential for pathological diagnosis and prognosis, the technology remains time-consuming and costly. Predicting cancer gene markers from histological H&E-stained tissue images could overcome these technological barriers and open new horizons for precision and personalised pathology. Recently, foundation models have improved the generation of general-purpose embeddings of H&E images. However, these improved representations are not optimized for gene expression prediction and lack task-specific adaptability. To address this limitation, we propose the Explainable Prototype Booster (EP-Booster), which incorporates biological prior knowledge to guide the construction and training of learnable prototypes for embedding refinement, thereby improving gene expression prediction. Importantly, model predictions are inherently interpretable through pathway-level attributions associated with the prototypes. Extensive experiments across multiple datasets, cancer types, and spatial transcriptomics platforms demonstrate that EP-Booster consistently outperforms existing methods. Moreover, EP-Booster can be integrated with diverse foundation models to enhance task-specific representations, thereby improving predictive performance and biological interpretability in clinically relevant applications, including cancer biomarker prediction, survival analysis, and drug response prediction.

Version published to 10.64898/2026.04.27.720478 on bioRxiv
Apr 29, 2026

Related articles

Adaptive Integration of Heterogeneous Foundation Models to Find Histologically Predictable Genes in Breast Cancer

This article has 6 authors:
1. Hao Nguyen
2. Chaoyi Li
3. Can Peng
4. Peter Simpson
5. Nan Ye
6. Quan Nguyen
This article has no evaluations
Latest version Apr 8, 2026

EVEE: Interpretable variant effect prediction from genomic foundation model embeddings

This article has 22 authors:
1. Michael T. Pearce
2. Thomas Dooms
3. Ryo Yamamoto
4. Joshua Meehl
5. Carl Molnar
6. Mark Bissell
7. Dron Hazra
8. Ching Fang
9. Nam Nguyen
10. Michael Anderson
11. Collin Osborne
12. Patrick Duffy
13. Bridget Toomey
14. Eric Klee
15. Elena Myasoedova
16. Alexander J. Ryu
17. Shant Ayanian
18. Panos Korfiatis
19. Matt Redlon
20. Archa Jain
21. Daniel Balsam
22. Nicholas K. Wang
This article has no evaluations
Latest version Apr 11, 2026

RNABag: A Generalizable Transcriptome Foundation Model for Precision Oncology across Biopsy Modalities

This article has 7 authors:
1. Pengchao Luo
2. Dong Luo
3. Dan Li
4. Xiangyang Xue
5. Jianbo Yang
6. Xuejun Gong
7. Kun Tang
This article has no evaluations
Latest version Apr 22, 2026
