Agentic Chart Review from Longitudinal Clinical Notes: a Lung Cancer Guideline Concordance Use Case

Yuhang Jiang
Xing He
Xuguang Ai
Ramya Keerthi Majji
Rohan Maniar
Shadia Jalal
David A. Fedele
Jessica Hollenbach
Jinsong Liu
Yan Zhuang
Yiye Zhang
Jiang Bian

Abstract

Clinical chart abstraction extracts structured patient variables from longitudinal clinical notes but is labor-intensive and difficult to scale. We evaluated LLM agents for question-guided chart review using lung cancer molecular testing guideline concordance as a use case. Two configurations were compared: (1) sequential note review using metadata and chronology, and (2) the same framework augmented with keyword-based note search. Gold-standard labels were established by human annotators. The search-enabled agent achieved higher accuracy (92.4% vs. 83.5%) and reduced errors by more than half (41 vs. 89) by retrieving evidence from long, heterogeneous note histories. In guideline concordance evaluation, most determinate patient–rule assessments were concordant (80.7%), while most apparent non-concordance reflected missing molecular testing documentation rather than documented care deviations. These results suggest tool-augmented LLM agents can approximate key aspects of human chart review and support scalable information extraction from longitudinal clinical documentation.
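The abstract does not describe how the keyword-based note search in configuration (2) is implemented. As a minimal sketch of what such a tool could look like, the example below scans a patient's notes in chronological order and returns a short snippet around the first keyword match in each note. The `Note` fields, the `search_notes` function, the `window` parameter, and the sample notes are all hypothetical and chosen for illustration; they are not taken from the preprint.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Note:
    """Hypothetical note record: metadata plus free text."""
    note_id: str
    written_on: date
    note_type: str
    text: str


def search_notes(notes, keywords, window=60):
    """Return (note_id, date, snippet) for notes matching any keyword, oldest first.

    Matching is case-insensitive substring search; at most one snippet is
    returned per note. This is an illustrative assumption, not the authors' tool.
    """
    hits = []
    for note in sorted(notes, key=lambda n: n.written_on):
        lowered = note.text.lower()
        for kw in keywords:
            pos = lowered.find(kw.lower())
            if pos == -1:
                continue
            # Keep `window` characters of context on each side of the match.
            start = max(0, pos - window)
            end = min(len(note.text), pos + len(kw) + window)
            hits.append((note.note_id, note.written_on, note.text[start:end].strip()))
            break  # one snippet per note
    return hits


# Synthetic example notes (not real patient data).
notes = [
    Note("n2", date(2021, 3, 9), "pathology",
         "EGFR mutation testing: exon 19 deletion detected."),
    Note("n1", date(2021, 1, 4), "progress",
         "Patient reports cough. CT chest ordered."),
    Note("n3", date(2021, 4, 1), "oncology",
         "Discussed treatment options; PD-L1 result pending."),
]
hits = search_notes(notes, ["EGFR", "PD-L1"])
```

An agent given a tool like this could jump directly to the notes that mention a biomarker instead of reading every note in sequence, which is one plausible reason a search-enabled configuration would miss less evidence in long note histories.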

Version published to 10.64898/2026.06.02.26354727 on medRxiv
Jun 3, 2026

Related articles

A Retrospective Evaluation of the Microsoft Healthcare Agent Orchestrator for Tumor Board Patient Summaries

This article has 10 authors:
1. Joy Roy
2. Jack Korleski
3. Ryan C. Augustin
4. Leeor S. Yefet
5. Zach Jensen
6. Eric C. Ehman
7. Gelareh Zadeh
8. Amy Lynn Conners
9. Amye J. Tevaarwerk
10. Panagiotis Korfiatis
This article has no evaluations. Latest version: Jun 1, 2026

Augmenting Structured Diagnoses through Effective Use of Pre-trained Large Language Models on Clinical Notes

This article has 6 authors:
1. Hanieh Razzaghi
2. Nhat Nguyen
3. Mohan Pargi
4. Kaleigh Wieand
5. H. Timothy Bunnell
6. L. Charles Bailey
This article has no evaluations. Latest version: Jun 2, 2026

General-purpose large language models can achieve physician-level accuracy in complex medical data extraction

This article has 2 authors:
1. Manu Rajeev
2. Ananthu Narayan
This article has no evaluations. Latest version: Jun 10, 2026