Orchestrated multi agents sustain accuracy under clinical-scale workloads compared to a single agent

Eyal Klang
Mahmud Omar
Ganesh Raut
Reem Agbareia
Prem Timsina
Robert Freeman
Nicholas Gavin
Lisa Stump
Alexander W Charney
Benjamin S Glicksberg
Girish N Nadkarni

Abstract

We tested state-of-the-art large language models (LLMs) in two configurations for clinical-scale workloads: a single agent handling heterogeneous tasks versus an orchestrated multi-agent system assigning each task to a dedicated worker. Across retrieval, extraction, and dosing calculations, we varied batch sizes from 5 to 80 to simulate clinical traffic. Multi-agent runs maintained high accuracy under load (pooled accuracy 90.6% at 5 tasks, 65.3% at 80) while single-agent accuracy fell sharply (73.1% to 16.6%), with significant differences beyond 10 tasks (FDR-adjusted p < 0.01). Multi-agent execution reduced token usage up to 65-fold and limited latency growth compared with single-agent runs. The design’s isolation of tasks prevented context interference and preserved performance across four diverse LLM checkpoints. This is the first evaluation of LLM agent architectures under sustained, mixed-task clinical workloads, showing that lightweight orchestration can deliver accuracy, efficiency, and auditability at operational scale.
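The abstract contrasts a single agent that handles a mixed batch with an orchestrator that hands each task to a dedicated worker. The abstract does not describe the implementation, so the following is only a minimal sketch of that contrast under stated assumptions: `Task`, `Worker`, `run_single_agent`, and `run_orchestrated` are hypothetical names, and a worker is modeled as any function from a context string to a reply, standing in for an LLM call.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Task:
    task_id: int
    kind: str    # e.g. "retrieval", "extraction", or "dosing"
    prompt: str


# Hypothetical stand-in for an LLM call: context string in, reply out.
Worker = Callable[[str], str]


def run_single_agent(tasks: List[Task], agent: Worker) -> str:
    """Single-agent configuration: one call sees the whole mixed batch.

    The shared context grows with batch size, so every task competes
    with every other task for the model's attention.
    """
    shared_context = "\n".join(f"[{t.task_id}] {t.prompt}" for t in tasks)
    return agent(shared_context)


def run_orchestrated(tasks: List[Task], workers: Dict[str, Worker]) -> Dict[int, str]:
    """Orchestrated configuration: route each task to a worker by type.

    Each worker call receives only its own task prompt, so the context
    per call stays constant no matter how large the batch is.
    """
    results: Dict[int, str] = {}
    for task in tasks:
        worker = workers[task.kind]              # dispatch on task type
        results[task.task_id] = worker(task.prompt)  # isolated context
    return results
```

In this sketch the per-call context in `run_orchestrated` is a single prompt, whereas `run_single_agent` passes the concatenation of all prompts. That is one plausible reading of the task isolation the abstract credits for preserved accuracy and lower token usage, not a reconstruction of the authors' system.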

Version published to 10.1101/2025.08.22.25334049 on medRxiv, Aug 24, 2025

Related articles

Multi-Agent AI Systems for Biological and Clinical Data Analysis

This article has 5 authors:
1. Jackson Spieser
2. Ali Balapour
3. Jarek Meller
4. Krushna Patra
5. Behrouz Shamsaei
This article has no evaluations. Latest version: Dec 30, 2025

A Survey on LLM-based Multi-Agent AI Hospital

This article has 2 authors:
1. Zonghai Yao
2. Hong Yu
This article has no evaluations. Latest version: Dec 26, 2025

Towards a Science of Scaling Agent Systems

This article has 20 authors:
1. Yubin Kim
2. Ken Gu
3. Chanwoo Park
4. Chunjong Park
5. Samuel Schmidgall
6. A. Ali Heydari
7. Yao Yan
8. Zhihan Zhang
9. Yuchen Zhuang
10. Yun Liu
11. Mark Malhotra
12. Paul Liang
13. Hae Won Park
14. Yuzhe Yang
15. Xuhai Xu
16. Yilun Du
17. Shwetak Patel
18. Tim Althoff
19. Daniel McDuff
20. Xin Liu
This article has no evaluations. Latest version: Jan 23, 2026
