Can Large Language Models Generate Role-Consistent Clinical Dialogue for Education? A Multi-agent Approach

Abstract

Background: The development of high-quality healthcare simulation scenarios and educational clinical dialogue is resource-intensive, limiting scalability in healthcare education. Large language models (LLMs) offer new opportunities for generating dynamic conversational simulations but raise concerns regarding role fidelity, realism, and evaluation.

Innovation: We describe a role-locked, multi-agent LLM system designed to generate realistic emergency department conversations involving a doctor, nurse, and patient. Separate LLM agents were assigned fixed clinical roles and interacted within a shared conversational environment, supported by explicit role constraints and rule-based role-guarding mechanisms. An independent LLM was used as an automated evaluator (“AI-as-judge”) to assess role fidelity, turn coherence, communication realism, and educational usability.

Evaluation: Twenty-five simulated conversations were generated and evaluated using the automated judge. A subset of ten conversations underwent independent human evaluation by two clinically experienced raters using aligned assessment domains. Automated evaluation demonstrated consistently high ratings across all domains, with all simulations judged educationally usable. Human evaluation showed substantial agreement for role fidelity and moderate agreement across other domains, providing expert plausibility benchmarking for the automated approach.

Implications: This work demonstrates the feasibility of a role-locked, multi-agent LLM architecture for generating educationally plausible conversational simulations. The combination of automated and limited human evaluation provides early validity evidence supporting feasibility and usability. This approach may support rapid prototyping and scalable development of simulation-based educational content.
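
To make the idea of role locking with a rule-based role guard more concrete, the following is a minimal sketch in Python. It is not the authors' implementation: the abstract does not specify the rules, the turn order, or the retry policy, so the forbidden-phrase patterns, the round-robin order, the `max_retries` parameter, and the names `violates_role` and `run_dialogue` are all illustrative assumptions. Each agent is modelled as a plain callable so the sketch runs without any LLM API.

```python
import re
from typing import Callable, Dict, List, Tuple

Turn = Tuple[str, str]                      # (role, utterance)
Agent = Callable[[List[Turn]], str]         # hypothetical agent interface

ROLES = ("doctor", "nurse", "patient")      # assumed fixed turn order

# Hypothetical rules: phrases that would break each role's boundaries.
FORBIDDEN: Dict[str, List[str]] = {
    "patient": [r"\bI(?: will|'ll) prescribe\b", r"\bmy diagnosis is\b"],
    "nurse": [r"\bI(?: will|'ll) prescribe\b"],
    "doctor": [r"\bmy pain\b"],
}

# Detects a line that starts with a speaker label such as "Doctor:".
SPEAKER_LABEL = re.compile(
    r"^\s*(doctor|nurse|patient)\s*:", re.IGNORECASE | re.MULTILINE
)


def violates_role(role: str, text: str) -> bool:
    """Return True if the utterance speaks for another role or uses a forbidden phrase."""
    for match in SPEAKER_LABEL.finditer(text):
        if match.group(1).lower() != role:
            return True
    return any(re.search(p, text, re.IGNORECASE) for p in FORBIDDEN[role])


def run_dialogue(agents: Dict[str, Agent], turns: int, max_retries: int = 2) -> List[Turn]:
    """Round-robin the role-locked agents over a shared transcript, guarding each reply."""
    transcript: List[Turn] = []
    for i in range(turns):
        role = ROLES[i % len(ROLES)]
        for _ in range(max_retries + 1):
            reply = agents[role](transcript)
            if not violates_role(role, reply):
                transcript.append((role, reply))
                break
        else:
            # Every attempt failed the guard; record a placeholder instead.
            transcript.append((role, "[turn withheld by role guard]"))
    return transcript
```

In a real system each callable would wrap an LLM call whose system prompt carries the explicit role constraints; the guard then acts as a second, deterministic check on the generated text before it enters the shared conversation.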
