Foundation Model-Architected Sparse Dictionaries for Fully Quantitative Interpretability in Whole Slide Image Decoding

Jinhua Yu
GuoQing Wu
Tianyi Pan
Hanning Xu
Ye Tang
Xuan Xie
Chengqian Zhao
Feiyu Yin
Pengfei Song
Ji Xiong
Zhifeng Shi
Ying Liu

Abstract

Artificial intelligence (AI)-based pathological image analysis has achieved remarkable progress. However, a critical limitation persists: beyond attention-based visualization methods, which offer only partial interpretability, existing AI approaches fail to provide quantifiable diagnostic evidence that meets clinically actionable standards and aligns with pathologists' domain expertise. Consequently, the interpretability gap between AI diagnostics and pathologists impedes clinical adoption. To resolve this impasse, we propose PATH-SparseD (Pathology-Aware Transformer–Hybridized Sparse Dictionary Learning), a framework that reformulates AI-driven diagnostic inference as a dictionary-query process. Central to PATH-SparseD is the novel integration of foundation models with sparse representation theory. This synergy empowers the model to deconstruct the highly complex and heterogeneous data of a whole slide image (WSI) into a sparse combination of fundamental, clinically meaningful visual "primitives", each acting as an atomic diagnostic unit. Specifically, for any WSI tile, PATH-SparseD does not merely extract features; it encodes the tile by identifying and activating a minimal set of these primitives from a multi-scale dictionary. This process effectively reformulates the entire WSI as a quantifiable histogram of primitive occurrences, creating a transparent and pattern-based transcript of the tissue that directly mirrors a pathologist's diagnostic reasoning. Extensive experiments on 10 datasets covering 8 tumor types demonstrated that PATH-SparseD not only provides an interpretable and quantitatively verifiable pathological analysis framework recognized by clinical pathologists, but also significantly outperforms state-of-the-art foundation-model approaches, yielding accuracy improvements of 5.6% in glioma grading, 4.4% in IDH1 genotyping, 14.5% in TNM grading, 17.0% in tumor differentiation, 11.8% in cellular origin, and 8.6% in organ origin. Ultimately, PATH-SparseD establishes a novel paradigm for WSI decoding that harnesses the performance advantages of foundation models while providing an effective solution to the interpretability challenge.
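
The dictionary-query idea in the abstract (encode each tile with a few primitives, then summarize the slide as a histogram of primitive occurrences) can be illustrated with a small sketch. The abstract does not specify the encoder, the dictionary construction, or the scales, so everything here is an assumption for illustration: tile features are plain vectors, the dictionary has unit-norm rows, greedy orthogonal matching pursuit stands in for whatever sparse solver the authors use, and the names `sparse_encode`, `primitive_histogram`, and `k` are hypothetical.

```python
import numpy as np


def sparse_encode(features, dictionary, k=3):
    """Encode each tile feature with at most k dictionary atoms.

    Assumption: greedy orthogonal matching pursuit is used as a stand-in
    sparse solver; the preprint's actual encoder is not described here.
    features:   (n_tiles, dim) array of tile feature vectors
    dictionary: (n_atoms, dim) array with unit-norm rows ("primitives")
    """
    n_tiles, n_atoms = features.shape[0], dictionary.shape[0]
    codes = np.zeros((n_tiles, n_atoms))
    for i, x in enumerate(features):
        residual = x.astype(float).copy()
        support, coef = [], None
        for _ in range(k):
            corr = dictionary @ residual
            if support:
                corr[support] = 0.0  # never reselect an atom
            j = int(np.argmax(np.abs(corr)))
            if abs(corr[j]) < 1e-12:
                break  # residual is already explained
            support.append(j)
            # Refit all selected atoms jointly by least squares.
            selected = dictionary[support].T  # (dim, len(support))
            coef, *_ = np.linalg.lstsq(selected, x, rcond=None)
            residual = x - selected @ coef
        if support:
            codes[i, support] = coef
    return codes


def primitive_histogram(codes, tol=1e-8):
    """Slide-level histogram: fraction of activations per primitive."""
    counts = (np.abs(codes) > tol).sum(axis=0).astype(float)
    total = counts.sum()
    return counts / total if total > 0 else counts


# Hypothetical usage with random stand-in data (not real pathology features).
rng = np.random.default_rng(0)
tile_features = rng.normal(size=(100, 16))
atoms = rng.normal(size=(32, 16))
atoms /= np.linalg.norm(atoms, axis=1, keepdims=True)  # unit-norm atoms
slide_codes = sparse_encode(tile_features, atoms, k=3)
slide_hist = primitive_histogram(slide_codes)
print(slide_hist.shape)
```

In this toy form, each tile activates at most `k` primitives, and the slide descriptor is a vector with one entry per primitive that sums to 1, which is the kind of countable evidence the abstract describes. A multi-scale dictionary would presumably repeat this per scale, though that detail is not given in the abstract.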

Version published to 10.21203/rs.3.rs-8542624/v1 on Research Square
Feb 11, 2026

Related articles

ML-ConvNet: A Lightweight and Interpretable Unified Architecture for Medical Image Classification Across Modalities

This article has 10 authors:
1. Williams Ayivi
2. Xiaoling Zhang
3. Yeongx Yeong Hyeon Gu
4. Amil Aligayev
5. Ali Alqahtani
6. Wisdom Xornam Ativi
7. Francis Sam
8. Muhammed Amin Abdullah
9. Emmanuel Sarpong Addai Gyarteng
10. Mugahed A. Al-antari
This article has no evaluations
Latest version Mar 17, 2026

ForVA and GCM-CLIP: A Million-Scale Multimodal Dataset and Representation Learning Framework for Virtual Autopsy

This article has 8 authors:
1. Jing Cai
2. Jikai Mao
3. Nanze Du
4. Tu Lyu
5. Hao Li
6. Yi Shen
7. Liang Shen
8. Junjun Guo
This article has no evaluations
Latest version Mar 30, 2026

CUVAE: Strengthening Latent Representations in Skip-Connection VAEs for High-Fidelity Medical Image Reconstruction

This article has 2 authors:
1. Kailash Kandpal
2. Prabhat Verma
This article has no evaluations
Latest version Mar 28, 2026
