Medical Abbreviation Disambiguation with Large Language Models: Zero- and Few-Shot Evaluation on the MeDAL Dataset
Abstract
Abbreviation disambiguation is a critical challenge in processing clinical and biomedical text, where ambiguous short forms frequently obscure meaning. In this study, we assess the zero-shot performance of large language models (LLMs) on medical abbreviation disambiguation using the MeDAL dataset, a large-scale resource constructed from PubMed abstracts. Specifically, we evaluate GPT-4 and LLaMA, prompting each model with the surrounding context and asking it to infer the correct long-form expansion of an ambiguous abbreviation without any task-specific fine-tuning. Our results demonstrate that GPT-4 substantially outperforms LLaMA across a range of ambiguous terms, indicating a significant advantage of proprietary models in zero-shot medical language understanding. These findings suggest that LLMs, even without domain-specific training, can serve as effective tools for improving the readability and interpretability of biomedical text in NLP applications.
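To make the evaluation setup concrete, the sketch below illustrates what one such zero-shot query might look like. It is a minimal sketch, not the study's exact protocol: the OpenAI Python client, the model identifier, the MeDAL-style example context, and the candidate expansions are all illustrative assumptions.

```python
# Minimal sketch of a zero-shot abbreviation-disambiguation query.
# The example context, candidate expansions, and prompt wording are
# hypothetical; they do not reproduce the study's exact prompts.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# A MeDAL-style context containing an ambiguous abbreviation (assumed example).
context = (
    "The patient was started on insulin on admission; DM had been "
    "poorly controlled for several years prior to presentation."
)
abbreviation = "DM"
candidates = ["diabetes mellitus", "dermatomyositis", "myotonic dystrophy"]

# Build a zero-shot prompt: context plus candidate long forms, no examples.
prompt = (
    f"Context: {context}\n"
    f"Abbreviation: {abbreviation}\n"
    f"Candidate expansions: {', '.join(candidates)}\n"
    "Answer with the single expansion that best fits the context."
)

response = client.chat.completions.create(
    model="gpt-4",  # assumed model identifier
    messages=[
        {"role": "system", "content": "You disambiguate medical abbreviations."},
        {"role": "user", "content": prompt},
    ],
    temperature=0,  # deterministic output for evaluation
)
print(response.choices[0].message.content)
```

Restricting the answer to a fixed candidate list, as above, makes the model's output directly comparable against the gold expansion; a few-shot variant would simply prepend a handful of solved context-expansion pairs to the same prompt.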