Semantically-Guided State-Space Models for Data-Efficient and Robust Cross-Platform Virtual Staining

Abstract

Histologic staining is a cornerstone of clinical pathology, enabling the visualization of cellular structures and facilitating accurate diagnoses. However, traditional staining methods are labor-intensive, inconsistent, and sensitive to reagent concentrations and operator expertise. Virtual staining offers an efficient alternative but faces challenges, including limited cross-platform adaptability and heavy reliance on paired training data. Here, we present a virtual staining framework that integrates Mamba state-space models with cycle-consistent adversarial networks (CycleGAN), significantly reducing data requirements while maintaining or improving staining quality. Our approach incorporates three key innovations: (1) an efficient, high-quality virtual staining method based on Mamba state-space models, enhanced by modules such as Adaptive Frequency Filtering Upsampling (AFFU), with robust cross-platform generalization; (2) a highly efficient entropy-hue guided data selection strategy that drastically reduces data requirements and is potentially applicable to other data-scarce domains in biomedical imaging; and (3) a multi-level semantic guidance approach that uses vision-language models to inject domain knowledge, improving feature preservation and cross-modal adaptability. We validated our approach on two distinct microscopy platforms: a UV photoacoustic microscopy system with a 40× objective lens and a Zeiss AxioScan scanner with a 20× objective lens. For H&E virtual staining from label-free UV photoacoustic images, our method required only 12.5% of the data used by baseline CycleGAN models while achieving a 22% improvement in FID (from 51.13 to 41.03). For H&E to Masson's trichrome conversion on the Zeiss system, our approach used only 38.5% of the data while improving FID by 16.1% (from 17.06 to 13.26) and achieving a structural similarity index of 0.984. Our framework requires only 2-3 complete tissue sections to meet training needs on a new microscopy platform, with inference times under 3 minutes per whole-slide image on a standard workstation (NVIDIA RTX 3090). This approach reduces staining time from 24-72 hours to minutes while preserving essential morphological features, offering potential for rapid pathological screening and diagnosis in resource-constrained settings.
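
The entropy-hue guided data selection described above can be pictured as a simple filtering step over candidate training tiles: keep information-rich tiles while balancing how the tissue's stain colors are represented. The sketch below is a minimal illustration of that idea under stated assumptions, not the authors' exact algorithm; the function names, entropy threshold, hue binning, and per-bin cap are hypothetical choices introduced here for clarity.

```python
# Minimal sketch of an entropy-hue guided tile selection step (illustrative
# assumptions throughout; thresholds and binning are not taken from the paper).
import numpy as np
from skimage.color import rgb2gray, rgb2hsv
from skimage.measure import shannon_entropy

def tile_signature(tile_rgb, hue_bins=12):
    """Summarize a tile by its grayscale Shannon entropy and dominant hue bin."""
    entropy = shannon_entropy(rgb2gray(tile_rgb))
    hue = rgb2hsv(tile_rgb)[..., 0]  # hue channel in [0, 1)
    dominant_bin = int(np.argmax(np.histogram(hue, bins=hue_bins, range=(0, 1))[0]))
    return entropy, dominant_bin

def select_tiles(tiles, entropy_thresh=4.0, per_hue_bin=50, hue_bins=12):
    """Keep high-entropy tiles, capped per dominant-hue bin to preserve stain diversity."""
    kept, counts = [], np.zeros(hue_bins, dtype=int)
    for idx, tile in enumerate(tiles):
        entropy, hue_bin = tile_signature(tile, hue_bins)
        if entropy < entropy_thresh:        # skip low-information (mostly background) tiles
            continue
        if counts[hue_bin] >= per_hue_bin:  # avoid over-representing one stain appearance
            continue
        counts[hue_bin] += 1
        kept.append(idx)
    return kept
```

A greedy rule of this kind is one plausible way to shrink a whole-slide tile pool to the small, diverse subset that data-efficient training relies on; the paper's actual selection criteria may differ.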
