ADAPTIVE MULTI-SCALE GRAPH TRANSFORMER FRAMEWORK FOR HISTOPATHOLOGICAL IMAGES
Abstract
Whole slide images (WSIs) contain hierarchical information spanning cellular to tissue-level architecture, but their gigapixel scale poses major memory and computational challenges. Existing multi-scale graph and transformer models capture complex WSI features effectively but struggle with efficiency. We propose an Adaptive Multi-Scale Graph Transformer (AMGT) for WSI classification that addresses this limitation through two key modules: a Self-Guided Token Aggregation (SGTA) mechanism that fuses multi-resolution features to reduce redundancy, and a Prototypical Transformer (PT) that groups similar tokens into phenotype-representative prototypes with linear complexity. This design preserves essential spatial and semantic information, substantially lowering memory cost while improving interpretability through prototype-based learning. AMGT achieves superior performance and efficiency, outperforming state-of-the-art models by 1.8% and 5.3% AUC on a high-grade ovarian cancer dataset and Camelyon16, respectively. These results demonstrate AMGT's capacity for scalable, interpretable multi-scale representation learning.
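The abstract does not include pseudocode, but the Prototypical Transformer's linear-complexity idea, attending over a small set of prototypes rather than over all N tokens, can be sketched as follows. This is a minimal NumPy illustration under our own assumptions: the function name `prototype_attention`, the soft-assignment scheme, and the choice of 16 prototypes are all illustrative, not the authors' implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def prototype_attention(tokens, prototypes):
    """Illustrative prototype-based attention: soft-assign N tokens to
    K prototypes, run full attention only among the K prototype
    summaries, then broadcast back. Cost is O(N*K + K^2), linear in N,
    instead of the O(N^2) of token-to-token self-attention."""
    n, d = tokens.shape
    # Soft assignment of each token to each prototype (N x K)
    assign = softmax(tokens @ prototypes.T)
    # Aggregate tokens into K phenotype summaries (K x d),
    # normalizing by total assignment mass per prototype
    summaries = (assign / assign.sum(axis=0, keepdims=True)).T @ tokens
    # Full self-attention among the K summaries only (K x K)
    attn = softmax(summaries @ summaries.T / np.sqrt(d))
    updated = attn @ summaries
    # Redistribute updated prototype features back to tokens (N x d)
    return assign @ updated

rng = np.random.default_rng(0)
tokens = rng.normal(size=(1000, 64))    # e.g. 1000 patch embeddings
prototypes = rng.normal(size=(16, 64))  # 16 hypothetical prototypes
out = prototype_attention(tokens, prototypes)
print(out.shape)  # (1000, 64)
```

Because the quadratic attention runs only over K prototypes (K << N), memory grows linearly with the number of WSI patches, which is the efficiency property the abstract claims for the PT module.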