Evaluating Vision and Pathology Foundation Models for Computational Pathology: A Comprehensive Benchmark Study

Rohan Bareja
Francisco Carrillo-Perez
Yuanning Zheng
Marija Pizurica
Tarak Nath Nandi
Jeanne Shen
Ravi Madduri
Olivier Gevaert

Abstract

To advance precision medicine in pathology, robust AI-driven foundation models are increasingly needed to uncover complex patterns in large-scale pathology datasets, enabling more accurate disease detection, classification, and prognostic insights. However, despite substantial progress in deep learning and computer vision, the comparative performance and generalizability of these pathology foundation models across diverse histopathological datasets and tasks remain largely unexamined. In this study, we conduct a comprehensive benchmarking of 31 AI foundation models for computational pathology, including general vision models (VM), general vision-language models (VLM), pathology-specific vision models (Path-VM), and pathology-specific vision-language models (Path-VLM), evaluated over 41 tasks sourced from TCGA, CPTAC, external benchmarking datasets, and out-of-domain datasets. Our study demonstrates that Virchow2, a pathology foundation model, delivered the highest performance across TCGA, CPTAC, and external tasks, highlighting its effectiveness in diverse histopathological evaluations. We also show that Path-VM outperformed both Path-VLM and VM, securing top rankings across tasks despite lacking a statistically significant edge over vision models. Our findings reveal that model size and data size did not consistently correlate with improved performance in pathology foundation models, challenging assumptions about scaling in histopathological applications. Lastly, our study demonstrates that a fusion model, integrating top-performing foundation models, achieved superior generalization across external tasks and diverse tissues in histopathological analysis. These findings emphasize the need for further research to understand the underlying factors influencing model performance and to develop strategies that enhance the generalizability and robustness of pathology-specific vision foundation models across different tissue types and datasets.

Version published to 10.1101/2025.05.08.25327250 on medRxiv
May 12, 2025

Related articles

OmiMRI: A Clinical-adaptive AI Framework for Format-Free Interpretation of Heterogeneous Brain MRIs

This article has 7 authors:
1. Lei Ma
2. Feng Su
3. Xiaoping Yi
4. Ye Cheng
5. Yongjie Ma
6. Zeming Tan
7. Gengdi Huang
This article has no evaluations. Latest version: Jan 21, 2026

A Survey of Contrastive Learning in Medical AI: Foundations, Biomedical Modalities, and Future Directions

This article has 6 authors:
1. George Obaido
2. Ibomoiye Domor Mienye
3. Kehinde Aruleba
4. Chidozie Williams Chukwu
5. Ebenezer Esenogho
6. Cameron Modisane
This article has no evaluations. Latest version: Dec 26, 2025

Harness Behavioural Analysis for Unpacking the Bio-Interpretability of Pathology Foundation Models

This article has 11 authors:
1. Yang Hu
2. George Batchkala
3. Kezia Gaitskell
4. Enric Domingo
5. Bin Li
6. Tianyang Zhang
7. Zexi Li
8. Matthias Friedrich
9. Dan Woodcock
10. Clare Verrill
11. Jens Rittscher
This article has no evaluations. Latest version: Jan 29, 2026
