A Systematic Review of Large Language Models in Medical Specialties: Applications, Challenges and Future Directions

Abstract

Large Language Models (LLMs) have shown potential to transform medical specialties, but their performance remains uncertain. This systematic review evaluates the utilization, impacts, and challenges of LLMs across 19 medical specialties. We systematically searched Scopus and Web of Science for journal articles that employ LLMs in medical specialties. A total of 5,790 peer-reviewed studies were identified, of which 84 were included in this review. The results show that accuracy is the most commonly used metric for assessing LLM performance, followed by the F1-score. LLM scores on these evaluation metrics varied with the task for which the LLM was used, and five main applications were identified. The main positive impact reported was improved efficiency of medical processes, while the most frequently reported negative impact was inconsistent reliability. More rigorous validation standards that account for the unique challenges of LLM research could strengthen future studies. Continued research and collaboration between the computer science and medical communities will be crucial for advancing LLMs in healthcare and ensuring their reliable and trustworthy application.
