Clinically grounded multi-agent artificial intelligence for preventive health management
Abstract
Routine health examinations generate dense, heterogeneous data, yet their preventive value depends on consistent interpretation, calibrated risk stratification, and actionable follow-up. In practice, these tasks are distributed across clinicians and time, leading to variability in the detection of subtle abnormalities and in decisions about when and how to intervene. Such variability reflects the difficulty of maintaining consistent, high-quality preventive decision-making at scale. Here we present G-Health, a clinically grounded multi-agent artificial intelligence framework that translates examination reports into structured preventive action. The system combines three-stage clinical alignment of large language models with specialist quantitative risk models and guideline-informed retrieval to stabilize reasoning under uncertainty. Trained on large-scale medical dialogue data and further specialized on multi-center real-world examination reports, the framework integrates 20 quantitative risk models that provide calibrated multi-disease estimates with feature-level interpretability. Across 13 medical and general benchmarks, the aligned models achieve the best overall average rank among strong baselines. In a fully blinded evaluation involving 79 medically trained assessors, G-Health reports were consistently preferred over outputs from three other large language models and 12 senior practicing physicians across five clinical dimensions. Together, these findings establish a deployable paradigm that transforms routine examinations into structured and scalable preventive decision-making.