AGP-Net: A Universal Network for Gene Expression Prediction of Spatial Transcriptomics
Abstract
In the era of high-throughput biology, molecular phenotypes have proven effective in predicting disease states and future trajectories. Transcriptomics, in particular, has enabled the dissection of complex diseases with heterogeneous genetic and environmental aetiology, both aiding diagnosis and augmenting treatment. As improving technology has led to measurements of gene expression at increasing granularity, it has become progressively feasible to resolve disease traits that present locally or with spatial heterogeneity. Principal among these are cancers in which tumour gene expression, while itself heterogeneous, exhibits a distinct signature from that of the surrounding tissue. Identifying this signature through molecular phenotyping facilitates specific cancer diagnosis and treatment.
Here, we introduce AGP-Net, a multi-modal foundation framework capable of predicting gene expression from histopathology images. Rather than produce an aggregated estimate of expression for each gene, AGP-Net disaggregates images into spots and attempts to resolve the variation in gene expression across them, thereby providing coarse spatial transcriptomic predictions across the tumour slice and surrounding region. The challenge in doing so stems from data sparsity relative to the dimensionality of the problem: the number of genes and their contextual heterogeneity within and between tissue and cancer types make it difficult to train a model on the limited data available. The innovation of AGP-Net lies in borrowing strength across similar genes, with similarity defined by their textual descriptions. AGP-Net supports datasets with varying gene coverage and facilitates the prediction of gene expression for previously unseen genes based on their textual descriptions. Trained on millions of spots from diverse data sources, AGP-Net establishes state-of-the-art performance in zero-shot spatial gene expression prediction, demonstrating its ability to generalize to novel scenarios.
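The abstract does not specify the architecture of AGP-Net, so the following is only a minimal sketch of one generic way text-conditioned prediction can support varying gene panels and unseen genes. It assumes that each spot and each gene description is mapped into a shared embedding space and that expression is scored by an inner product. The function name `predict_spot_expression`, the gene names, and all embedding values are hypothetical placeholders, not outputs of the actual model.

```python
from typing import Dict, List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    """Inner product of two embeddings of equal dimension."""
    if len(a) != len(b):
        raise ValueError("embedding dimensions must match")
    return sum(x * y for x, y in zip(a, b))


def predict_spot_expression(
    spot_embedding: Vector, gene_embeddings: Dict[str, Vector]
) -> Dict[str, float]:
    """Score every gene in a panel for one spot.

    Assumption: genes are represented by embeddings of their text
    descriptions rather than by fixed output indices, so the panel
    can differ between datasets.
    """
    return {gene: dot(spot_embedding, emb) for gene, emb in gene_embeddings.items()}


# Hypothetical toy values; real embeddings would come from trained
# image and text encoders.
spot = [0.5, 1.0, -0.5]
panel_a = {"GENE_A": [1.0, 0.0, 0.0], "GENE_B": [0.0, 1.0, 1.0]}

# A gene absent from panel_a, represented only by the (hypothetical)
# embedding of its description.
unseen = {"GENE_C": [0.0, 2.0, 0.0]}

pred_a = predict_spot_expression(spot, panel_a)
pred_unseen = predict_spot_expression(spot, unseen)
```

Under this assumed design, scoring a previously unseen gene needs only an embedding of its description, and no output layer has to be resized or retrained when the gene panel changes.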