Foundation Model-Architected Sparse Dictionaries for Fully Quantitative Interpretability in Whole Slide Image Decoding
Abstract
Artificial intelligence (AI)-based pathological image analysis has achieved remarkable progress. However, a critical limitation persists: beyond attention-based visualization methods, which offer only partial interpretability, existing AI approaches fail to provide quantifiable diagnostic evidence that meets clinically actionable standards and aligns with pathologists' domain expertise. Consequently, the interpretability gap between AI diagnostics and pathologists impedes clinical adoption. To close this gap, we propose PATH-SparseD (Pathology-Aware Transformer–Hybridized Sparse Dictionary Learning), a framework that reformulates AI-driven diagnostic inference as a dictionary-query process. Central to PATH-SparseD is the integration of foundation models with sparse representation theory, which allows the model to decompose the highly complex and heterogeneous data of a whole slide image (WSI) into a sparse combination of fundamental, clinically meaningful visual "primitives," each acting as an atomic diagnostic unit. Specifically, for any WSI tile, PATH-SparseD does not merely extract features; it encodes the tile by identifying and activating a minimal set of these primitives from a multi-scale dictionary. This process recasts the entire WSI as a quantifiable histogram of primitive occurrences, creating a transparent, pattern-based transcript of the tissue that mirrors a pathologist's diagnostic reasoning. Extensive experiments on 10 datasets covering 8 tumor types demonstrate that PATH-SparseD not only provides an interpretable and quantitatively verifiable pathological analysis framework recognized by clinical pathologists, but also significantly outperforms state-of-the-art foundation-model approaches, yielding accuracy improvements of 5.6% in glioma grading, 4.4% in IDH1 genotyping, 14.5% in TNM grading, 17.0% in tumor differentiation, 11.8% in cellular origin, and 8.6% in organ origin.
Ultimately, PATH-SparseD establishes a novel paradigm for WSI decoding that harnesses the performance advantages of foundation models while providing an effective solution to the interpretability challenge.
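
To make the dictionary-query idea concrete, the following is a minimal illustrative sketch and not the PATH-SparseD implementation. It assumes tile features are plain vectors (random stand-ins here for foundation-model embeddings), uses a single fixed random dictionary rather than a learned multi-scale one, and swaps in greedy matching pursuit as the sparse coder. The names `sparse_encode` and `slide_histogram` are hypothetical. Each tile activates at most a few atoms ("primitives"), and the slide is summarized by counting how often each atom is activated.

```python
import numpy as np


def sparse_encode(x, dictionary, n_nonzero=3, tol=1e-8):
    """Greedy matching pursuit (an assumed stand-in for the paper's coder).

    dictionary has shape (K, d) with unit-norm rows (atoms). Runs at most
    n_nonzero pursuit steps and returns a length-K coefficient vector.
    """
    residual = np.asarray(x, dtype=float).copy()
    codes = np.zeros(dictionary.shape[0])
    for _ in range(n_nonzero):
        correlations = dictionary @ residual
        k = int(np.argmax(np.abs(correlations)))
        if abs(correlations[k]) < tol:
            break  # residual is already explained; stop early
        codes[k] += correlations[k]
        residual -= correlations[k] * dictionary[k]
    return codes


def slide_histogram(tile_features, dictionary, n_nonzero=3):
    """Count, for each atom, the number of tiles in which it is activated."""
    counts = np.zeros(dictionary.shape[0], dtype=int)
    for x in tile_features:
        codes = sparse_encode(x, dictionary, n_nonzero)
        counts += (codes != 0).astype(int)
    return counts


# Toy data: 16 hypothetical primitives in an 8-dimensional feature space,
# and 100 tiles with random features standing in for real embeddings.
rng = np.random.default_rng(0)
dictionary = rng.normal(size=(16, 8))
dictionary /= np.linalg.norm(dictionary, axis=1, keepdims=True)
tile_features = rng.normal(size=(100, 8))

hist = slide_histogram(tile_features, dictionary)
print(hist)  # one occurrence count per primitive
```

In this toy setting the histogram is the slide-level representation: a downstream classifier, or a pathologist, could inspect which primitives dominate a slide. How the actual dictionary is learned and how multiple scales are combined are not specified in the abstract and are not modeled here.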