AI-Simulated Clinical Consultations: Assessing the Potential of ChatGPT to Support Medical Training

Abstract

Background

Simulated medical scenarios are useful for evaluating and developing clinical competencies, but organising them is expensive and time-consuming. Large language models (LLMs) show promise in role-playing tasks. We investigated the fidelity with which ChatGPT can mimic patients, clinicians and examiners in educational settings.

Objective

To determine the realism with which ChatGPT can portray patient, doctor and examiner roles, and the utility of these agents in clinical education.

Method

We selected four paediatric scenarios from mock OSCEs and set up separate patient, doctor and examiner ChatGPT agents for each. The patient and doctor agents conversed with each other in written format. The examiner agent marked the doctor agent based on this conversation. Patients and clinicians familiar with the OSCE assessed the dialogues.
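The paper does not describe the implementation details of this setup. A minimal sketch of one way such role-specific agents could be wired together is shown below, assuming the OpenAI Python SDK; the scenario text, system prompts, model name and turn limit are illustrative placeholders, not the authors' configuration.

```python
# A minimal sketch (not the authors' code) of separate patient and doctor
# ChatGPT agents, each driven by its own system prompt, conversing in turns.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

SCENARIO = "Six-year-old with a two-day history of wheeze ..."  # placeholder text

PATIENT_PROMPT = (
    "You are role-playing the parent of the patient in this paediatric OSCE "
    f"station. Stay in character and only reveal details when asked.\n{SCENARIO}"
)
DOCTOR_PROMPT = (
    "You are a junior doctor in a paediatric OSCE station. Take a focused "
    "history, avoid jargon, and respond to the parent's concerns."
)

def reply(system_prompt: str, transcript: list[str], speaks_next: str) -> str:
    """Ask one agent for its next turn, given the dialogue so far."""
    messages = [{"role": "system", "content": system_prompt}]
    for i, turn in enumerate(transcript):
        # Turns alternate, starting with the doctor (index 0). The agent about
        # to speak sees its own earlier turns as "assistant" and the other
        # agent's turns as "user".
        role = "assistant" if (i % 2 == 0) == (speaks_next == "doctor") else "user"
        messages.append({"role": role, "content": turn})
    response = client.chat.completions.create(model="gpt-4", messages=messages)
    return response.choices[0].message.content

transcript = ["Hello, I'm the doctor seeing you today. What brings you in?"]
for _ in range(5):  # a short, fixed-length exchange for illustration
    transcript.append(reply(PATIENT_PROMPT, transcript, speaks_next="patient"))
    transcript.append(reply(DOCTOR_PROMPT, transcript, speaks_next="doctor"))
```

An examiner agent could then be given the finished transcript and a marking scheme in a single further request and asked to score the doctor agent's performance.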

Results

The patient agent was judged to be true to character most of the time and good at expressing emotion. The doctor agent was reported to be an effective communicator but occasionally used jargon. Both agents tended to produce repetitive responses, which undermined realism. The examiner agent's marks correlated well with those of human clinicians. There was moderate support for using the simulated interactions for educational purposes.

Conclusion

Although the realism of the agents can be improved, ChatGPT can generate plausible proxies of participants in medical scenarios and could be useful for complementing standardised patient (SP)-based training.

KEY MESSAGES

What is already known on this topic

  • LLM-based agents show promise for portraying clinical roles and supporting simulation-based learning. Doctor agents provide correct diagnoses most of the time, while patient agents can accurately relay role information such as medical history or symptoms.

What this study adds

  • There is scope for improvement in the realism and authenticity of the conversations produced by GPT patient and doctor agents. Notable issues included a tendency to produce repetitive and verbose responses, and an inability to accurately convey the hesitation shown by real patients.

  • Disparities observed between (human) patient and clinician assessments of the GPT agents suggest that diverse viewpoints are needed to fully capture the experiential learning associated with clinical communication.

How this study might affect research, practice or policy

  • The low fidelity of GPT simulations in difficult or challenging medical scenarios means that AI deployed in educational settings requires human oversight and correction.

  • The impact of AI on medical education is likely to grow, which makes promoting AI literacy among educators and students essential.
