Multiple instance learning using pathology foundation models effectively predicts kidney disease diagnosis and clinical classification

Yu Kurata
Imari Mimura
Satoshi Kodera
Hiroyuki Abe
Daisuke Yamada
Haruki Kume
Tetsuo Ushiku
Tetsuhiro Tanaka
Norihiko Takeda
Masaomi Nangaku

Abstract

Introduction

Histological analysis of kidney biopsies is crucial in diagnosing kidney diseases and predicting clinical outcomes. Recently developed pathology foundation models, pretrained on large-scale pathology datasets, have demonstrated excellent performance in various downstream applications. This study evaluated the utility of pathology foundation models combined with multiple instance learning (MIL) for kidney pathology analysis.

Methods

We used 242 hematoxylin and eosin (H&E)-stained whole slide images (WSIs) from the Kidney Precision Medicine Project (KPMP) and Japan-Pathology Artificial Intelligence Diagnostics Project (JP-AID) databases as the development cohort, comprising 47 healthy control, 35 acute interstitial nephritis, and 160 diabetic kidney disease (DKD) slides. External validation was performed using 83 WSIs from the University of Tokyo Hospital (UT dataset). Reference labels were adjudicated diagnoses (KPMP) or diagnoses made by expert pathologists (JP-AID and UT). Pretrained pathology foundation models were used as patch encoders and compared with ImageNet-pretrained ResNet50.
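
The abstract does not state which MIL aggregator was used. As an illustration only, the sketch below assumes attention-based pooling, one common choice, in which each patch embedding from the encoder receives a weight and the slide embedding is the weighted sum. The function name, the parameters V and w, and the toy dimensions are hypothetical and are not taken from the study.

```python
import math
import random


def attention_mil_pool(patch_embeddings, V, w):
    """Pool patch embeddings into one slide embedding with attention weights.

    patch_embeddings: list of n_patches vectors, each of length dim.
    V: dim x hidden matrix, w: hidden-length vector (hypothetical learned
    parameters; in practice they would be trained with the slide classifier).
    """
    scores = []
    for x in patch_embeddings:
        # Project the patch into the hidden space and squash with tanh.
        hidden = [
            math.tanh(sum(xi * V[i][j] for i, xi in enumerate(x)))
            for j in range(len(w))
        ]
        scores.append(sum(h * wj for h, wj in zip(hidden, w)))

    # Softmax over patches, shifted by the max score for numerical stability.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    attention = [e / total for e in exps]

    # Slide embedding is the attention-weighted sum of patch embeddings.
    dim = len(patch_embeddings[0])
    slide = [
        sum(a * x[d] for a, x in zip(attention, patch_embeddings))
        for d in range(dim)
    ]
    return slide, attention


# Toy stand-ins for encoder outputs: 20 patches, 8 features, 4 hidden units.
rng = random.Random(0)
patches = [[rng.gauss(0, 1) for _ in range(8)] for _ in range(20)]
V = [[rng.gauss(0, 0.1) for _ in range(4)] for _ in range(8)]
w = [rng.gauss(0, 0.1) for _ in range(4)]
slide_embedding, attention = attention_mil_pool(patches, V, w)
```

Under this assumption, the per-patch attention weights are also what an attention heatmap would display when mapped back to patch coordinates on the slide.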

Results

In internal validation, all foundation models outperformed ResNet50, each achieving an area under the receiver operating characteristic curve (AUROC) above 0.980. In external validation, the performance of ResNet50 dropped markedly (AUROC = 0.768), whereas all foundation models maintained higher performance (AUROC above 0.800). Attention heatmaps confirmed that the foundation models recognized diagnostically relevant structures. Additionally, foundation models outperformed ResNet50 in predicting severe proteinuria among DKD cases from the KPMP dataset.
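
For readers less familiar with the metric, AUROC for a binary task equals the probability that a randomly chosen positive case scores higher than a randomly chosen negative case, with ties counted as one half. The sketch below computes it that way on made-up scores. How the study aggregated AUROC across the three diagnostic classes is not stated in the abstract, so the example is deliberately binary, and the function name is hypothetical.

```python
def auroc(labels, scores):
    """Binary AUROC via pairwise comparison of positive and negative scores."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative case")
    # Count positive-negative pairs ranked correctly; a tie counts as half.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Made-up example: 3 of the 4 positive-negative pairs are ranked correctly.
print(auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # prints 0.75
```

The pairwise form is quadratic in the number of cases, which is fine for cohorts of a few hundred slides; a rank-based implementation would be preferable for much larger sets.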

Conclusion

We successfully integrated pathology foundation models with MIL to achieve robust diagnostic performance, even when trained on a relatively small dataset, highlighting their potential for real-world clinical applications.

Key words: artificial intelligence, renal pathology, foundation model, multiple instance learning

Version published to 10.1101/2025.06.03.25328845 on medRxiv
Jun 4, 2025
