SpatialFinder: A Human-in-the-Loop Vision-Language Framework for Prioritizing High-Value Regions in Spatial Transcriptomics
Abstract
Sequencing an entire spatial transcriptomics slide can cost thousands of dollars per assay, making routine use impractical. Focusing on smaller regions of interest (ROIs) identified from adjacent routine H&E slides offers a practical alternative, but (i) there is no reliable way to identify the most informative areas from standard H&E images alone, and (ii) clinicians have few tools for prioritizing the microenvironments of their own interest. Here we introduce SpatialFinder, a framework that combines a biomedical vision-language model (VLM) with a human-in-the-loop optimization pipeline to predict gene expression heterogeneity and rank high-value ROIs across routine H&E tissue slides. Evaluated on four Visium HD tissue types, SpatialFinder consistently outperforms baseline VLMs at selecting regions with high cellular diversity and tumor presence, achieving up to 89% correlation with ground-truth rankings. These results demonstrate the potential of human-AI collaboration to make spatial transcriptomics more cost-effective and clinically actionable.
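The evaluation described above, comparing model-assigned ROI rankings against ground-truth rankings by rank correlation, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the scores are toy values standing in for SpatialFinder's heterogeneity predictions and the ground-truth measurements, and the Spearman formula shown here omits tie handling.

```python
def rank(values):
    """Return the rank of each value (0 = highest)."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    ranks = [0] * len(values)
    for r, i in enumerate(order):
        ranks[i] = r
    return ranks

def spearman(a, b):
    """Spearman rank correlation between two score lists (no tie handling)."""
    n = len(a)
    ra, rb = rank(a), rank(b)
    d2 = sum((x - y) ** 2 for x, y in zip(ra, rb))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

# Toy example: model scores vs. ground-truth heterogeneity for five ROIs.
model_scores = [0.9, 0.2, 0.7, 0.4, 0.1]
truth_scores = [0.8, 0.3, 0.9, 0.5, 0.2]
print(round(spearman(model_scores, truth_scores), 2))  # → 0.9
```

A correlation near 1 means the model's ROI prioritization closely matches the ranking that full sequencing would have produced, which is the sense in which "89% correlation with ground-truth rankings" quantifies ranking quality.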