Medication information extraction using local large language models
Abstract
Medication information is crucial for clinical routine and research. However, a vast amount is stored in unstructured text, such as doctors' letters, requiring manual extraction -- a resource-intensive, error-prone task. Automating this process faces significant constraints in a clinical setup, including the need for clinical expertise, limited time resources, restricted IT infrastructure, and the demand for transparent predictions. Recent advances in generative large language models (LLMs) and parameter-efficient fine-tuning methods show potential to address these challenges.
We evaluated local LLMs for end-to-end extraction of medication information, combining named entity recognition and relation extraction. We used format-restricting instructions and developed a feedback pipeline to facilitate automated evaluation. We applied token-level Shapley values to visualize and quantify token contributions, improving the transparency of model predictions.
Two open-source LLMs -- one general (Llama) and one domain-specific (OpenBioLLM) -- were evaluated on the English n2c2 2018 corpus and the German CARDIO:DE corpus. OpenBioLLM frequently struggled with structured outputs and hallucinations. Fine-tuned Llama models achieved new state-of-the-art results, improving F1-score by 10% for adverse drug events and 6% for medication reasons on English data. On the German dataset, Llama established a new benchmark, outperforming traditional machine learning methods by 16% micro average F1-score.
Our findings show that fine-tuned local open-source generative LLMs outperform state-of-the-art methods for medication information extraction, delivering high performance with limited time and IT resources in a clinical setup, and demonstrate their effectiveness on both English and German data. Applying Shapley values improved prediction transparency, supporting informed clinical decision-making.
Highlights
- Robust end-to-end medication information extraction with automated evaluation: We present an end-to-end joint named entity recognition and relation extraction pipeline using generative LLMs, enhanced by an automatic feedback mechanism that uses a feedback LLM to simplify the often complex assessment of clinically critical predictions.
- State-of-the-art performance across languages: Fine-tuned general LLMs surpassed existing benchmarks by up to 10% in complex medication classes on English data and established a new benchmark for German clinical datasets.
- Resource-efficient fine-tuning in clinical setup: We demonstrated that parameter-efficient fine-tuning of local open-source LLMs yields consistent structured outputs and superior extraction performance, addressing clinical constraints like limited expertise, restricted IT infrastructure, and stringent transparency requirements.
- Enhanced transparency with Shapley values: We utilized token-level Shapley values tailored for generative LLMs to systematically quantify and visualize individual token contributions, enabling clinicians to better understand and trust model predictions in medication information extraction tasks.
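Token-level Shapley values can be illustrated with the exact (exponential-time) formula on a toy input. Real pipelines use sampling approximations (e.g. the SHAP library); the `ade_score` function below is a made-up scoring function, not the paper's model.

```python
from itertools import combinations
from math import factorial

# Exact Shapley value of each token: its average marginal contribution to a
# scalar prediction score, weighted over all subsets of the other tokens.
def shapley_values(tokens, score):
    n = len(tokens)
    values = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        phi = 0.0
        for k in range(n):
            for subset in combinations(others, k):
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                with_i = score([tokens[j] for j in sorted(subset + (i,))])
                without_i = score([tokens[j] for j in subset])
                phi += weight * (with_i - without_i)
        values.append(phi)
    return values

# Hypothetical score: how strongly the tokens trigger an "adverse drug
# event" prediction (additive here, so Shapley recovers the coefficients).
def ade_score(tokens):
    return 1.0 * ("rash" in tokens) + 0.5 * ("penicillin" in tokens)

tokens = ["patient", "developed", "rash", "after", "penicillin"]
print(dict(zip(tokens, shapley_values(tokens, ade_score))))
```

For this additive score, "rash" receives approximately 1.0, "penicillin" 0.5, and all other tokens 0, and the values sum to the full-input score (the efficiency property) -- which is what makes per-token attributions interpretable for clinicians.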