spEMO: Leveraging Multi-Modal Foundation Models for Analyzing Spatial Multi-Omic and Histopathology Data
Abstract
Recent advances in pathology foundation models (PFMs), pretrained on large-scale histopathological images, have significantly accelerated progress in disease-centered applications. In parallel, spatial multi-omic technologies measure gene and protein expression at high spatial resolution, offering a rich understanding of tissue context. However, current models fall short of effectively integrating these complementary data modalities. To fill this gap, we introduce spEMO, a novel computational system that unifies embeddings from pathology foundation models and large language models (LLMs) to analyze spatial multi-omic data. By incorporating multimodal representations, spEMO outperforms models trained on single-modality data across a broad range of downstream tasks, including spatial domain identification, spot-type classification, whole-slide disease-state prediction and interpretation, inference of multicellular interactions, and automated medical report generation. Its strong performance across these tasks demonstrates spEMO's value in both biological and clinical applications. Additionally, we propose a new evaluation task, multi-modal alignment, to assess the information-retrieval capabilities of pathology foundation models; this task provides a principled benchmark for evaluating and improving model architectures. Collectively, spEMO represents a step forward in building holistic, interpretable, and generalizable AI systems for spatial biology and pathology.