Evaluation of Large Language Models in Medical Examinations: A Scoping Review Protocol
Abstract
Introduction
Large language models (LLMs) demonstrate human-level performance in three key domains: linguistic understanding, knowledge-based reasoning, and complex problem-solving. These characteristics make LLMs valuable tools for medical education. Standardized medical examinations evaluate clinical competencies in trainees, and they also allow rigorous verification of LLMs’ accuracy and reliability in medical contexts. Current methods use these examinations to test LLMs’ clinical reasoning abilities, and significant performance variations emerge across different clinical scenarios. However, no comprehensive review has compared different LLM versions in medical examinations; most studies focus on individual models and lack comparative analyses across multiple LLM versions. Current approaches therefore struggle to keep pace with evolving research needs. This study synthesizes extant research on LLMs in medical examinations and, by analyzing the current challenges and limitations, offers guidance for future investigations.
Methods and analysis
The protocol was designed following the JBI Manual for Evidence Synthesis guidelines. We established explicit inclusion/exclusion criteria and search strategies. Systematic searches were performed in PubMed and Web of Science Core Collection databases. The methodology details literature screening, data extraction, analysis frameworks, and process mapping. This approach ensures methodological rigor throughout the research process.
Ethics and dissemination
This protocol outlines a scoping review methodology. The study involves systematic synthesis and analysis of published literature. It does not include human/animal experimentation or sensitive data collection. Ethical approval is not required for this literature-based study.
Strengths and limitations of this study
This scoping review protocol strictly adheres to the standardized guidelines for conducting scoping reviews, namely the JBI Manual for Evidence Synthesis and the Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews (PRISMA-ScR) guideline.
The search strategy includes two databases: PubMed and Web of Science Core Collection.
This scoping review will address the knowledge gap on LLM performance across medical examinations that has arisen from recent rapid technological advances.
Owing to the nature of a scoping review, the identified sources of evidence will not be critically appraised.
The results of the scoping review will serve as a basis for identifying directions for further research on LLMs in the field of medical examinations.