ALPaCA: Adapting Llama for Pathology Context Analysis to enable slide-level question answering
Abstract
Large Vision Language Models (LVLMs) have recently revolutionized computational pathology. LVLMs transform pathology image embeddings into tokens recognizable by large language models, facilitating zero-shot image classification, description generation, question answering, and interactive diagnostics. In clinical practice, pathological assessment often requires analysis of entire tissue slides, integrating information from multiple sub-regions and magnification levels. However, existing LVLM frameworks have been restricted to small, predefined regions of interest and cannot analyze pyramidal, gigapixel-scale whole-slide images (WSIs). In this work, we introduce ALPaCA (Adapting Llama for Pathology Context Analysis) and train the first general-purpose slide-level LVLM, leveraging 35,913 WSIs with curated descriptions alongside 341,051 question-and-answer pairs spanning diverse diagnoses, procedures, and tissue types. We develop LongFormer, a vision-text interactive slide-level adaptor, integrate it with a Gaussian mixture model-based prototyping adaptor, and train the combined model with Llama3.1. ALPaCA achieves superior performance in slide-level question answering, reaching over 90% accuracy on closed-ended tests and high accuracy on open-ended questions as judged by expert pathologists, highlighting its potential for slide-level computer-aided diagnosis systems. Additionally, we show that ALPaCA can be readily fine-tuned on in-depth, organ-specific, or disease-specific datasets, underscoring its adaptability and utility for specialized pathology tasks.
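As a rough illustration of the kind of component the abstract calls a "Gaussian mixture model-based prototyping adaptor" (this is a hedged sketch, not the authors' implementation, and the helper name is hypothetical): one plausible reading is that the thousands of patch embeddings extracted from a WSI are compressed into a small set of prototype vectors by fitting a GMM and keeping the component means, which a downstream adaptor could then project into the LLM's token space.

```python
# Minimal sketch, assuming patch-level embeddings are already extracted from a WSI.
# Not the paper's code; gmm_prototypes is a hypothetical helper for illustration only.
import numpy as np
from sklearn.mixture import GaussianMixture


def gmm_prototypes(patch_embeddings: np.ndarray, n_prototypes: int = 16) -> np.ndarray:
    """Fit a GMM over patch embeddings and return the component means as prototypes.

    patch_embeddings: array of shape (num_patches, embed_dim) with tile-level features.
    Returns an (n_prototypes, embed_dim) array of prototype vectors.
    """
    gmm = GaussianMixture(
        n_components=n_prototypes,
        covariance_type="diag",  # diagonal covariances keep fitting cheap at slide scale
        random_state=0,
    )
    gmm.fit(patch_embeddings)
    return gmm.means_


# Example: 5,000 patch embeddings of dimension 1024 reduced to 16 prototype vectors.
prototypes = gmm_prototypes(np.random.randn(5000, 1024).astype(np.float32))
print(prototypes.shape)  # (16, 1024)
```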