Enabling Doctor-Centric Medical AI with LLMs through Workflow-Aligned Tasks and Benchmarks

Wenya Xie
Qingying Xiao
Yu Zheng
Xidong Wang
Junying Chen
Ke Ji
Anningzhe Gao
Prayag Tiwari
Xiang Wan
Feng Jiang
Benyou Wang

Abstract

The rise of large language models (LLMs) has profoundly influenced healthcare by offering medical advice, diagnostic suggestions, and more. However, their deployment directly toward patients poses substantial risks, as limited domain knowledge may result in misleading or erroneous outputs. To address this challenge, we propose repositioning LLMs as clinical assistants that collaborate with experienced physicians rather than interacting with patients directly. We begin with a two-stage inspiration–feedback survey to identify real-world needs in clinical workflows. Guided by this, we construct DoctorFLAN, a large-scale Chinese medical dataset comprising 92,000 Q&A instances across 22 clinical tasks and 27 specialties. To evaluate model performance in doctor-facing applications, we introduce DoctorFLAN-test (550 single-turn Q&A items) and DotaBench (74 multi-turn conversations mimicking realistic scenarios). Experimental results with over ten popular LLMs demonstrate that DoctorFLAN notably improves the performance of open-source LLMs in medical contexts, facilitating their alignment with physician workflows and complementing existing patient-oriented models. This work contributes a valuable resource and framework for advancing doctor-centered medical LLM development.

Version published to 10.21203/rs.3.rs-6763537/v1 on Research Square
Jun 2, 2025

CLEVER: Clinical Large Language Model Evaluation by Expert Review

This article has 4 authors:
1. Veysel Kocaman
2. Mustafa Kaya
3. Andrei Ferrer
4. David Talby
This article has no evaluations
Latest version Jul 23, 2025

Leveraging Large Language Models on Automating Outpatients’ Message Classifications of Electronic Medical Records

This article has 3 authors:
1. Amima Shifa
2. G G Md Nawaz Ali
3. Roopa Foulger
This article has no evaluations
Latest version Jul 3, 2025

Semantic Encoding in Medical LLMs for Vocabulary Standardisation

This article has 3 authors:
1. Samuel Mainwood
2. Aashish Bhandari
3. Sonika Tyagi
This article has no evaluations
Latest version Jun 17, 2025
