Medication information extraction using local large language models
Abstract
Medication information is crucial for clinical routine and research. However, a vast amount is stored in unstructured text, such as doctors' letters, requiring manual extraction -- a resource-intensive, error-prone task. Automating this process faces significant constraints in a clinical setup, including the need for clinical expertise, limited time resources, restricted IT infrastructure, and the demand for transparent predictions. Recent advances in generative large language models (LLMs) and parameter-efficient fine-tuning methods show potential to address these challenges.
We evaluated local LLMs for end-to-end extraction of medication information, combining named entity recognition and relation extraction. We used format-restricting instructions and developed a feedback pipeline to facilitate automated evaluation. We applied token-level Shapley values to visualize and quantify token contributions, improving the transparency of model predictions.
Two open-source LLMs -- one general (Llama) and one domain-specific (OpenBioLLM) -- were evaluated on the English n2c2 2018 corpus and the German CARDIO:DE corpus. OpenBioLLM frequently struggled with structured outputs and hallucinations. Fine-tuned Llama models achieved new state-of-the-art results, improving F1-score by 10% for adverse drug events and 6% for medication reasons on English data. On the German dataset, Llama established a new benchmark, outperforming traditional machine learning methods by 16% micro average F1-score.
Our findings show that fine-tuned local open-source generative LLMs outperform state-of-the-art methods for medication information extraction, delivering high performance with limited time and IT resources in a clinical setup, and demonstrate their effectiveness on both English and German data. Applying Shapley values improved prediction transparency, supporting informed clinical decision-making.
Highlights
- Robust end-to-end medication information extraction with automated evaluation: We present an end-to-end joint named entity recognition and relation extraction pipeline using generative LLMs, enhanced by an automatic feedback mechanism that uses a feedback LLM to simplify the often complex assessment of clinically critical predictions.
- State-of-the-art performance across languages: Fine-tuned general LLMs surpassed existing benchmarks by up to 10% in complex medication classes on English data and established a new benchmark for German clinical datasets.
- Resource-efficient fine-tuning in clinical setup: We demonstrated that parameter-efficient fine-tuning of local open-source LLMs yields consistent structured outputs and superior extraction performance, addressing clinical constraints like limited expertise, restricted IT infrastructure, and stringent transparency requirements.
- Enhanced transparency with Shapley values: We utilized token-level Shapley values tailored for generative LLMs to systematically quantify and visualize individual token contributions, enabling clinicians to better understand and trust model predictions in medication information extraction tasks.
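Token-level Shapley values can be illustrated with the exact (exponential-time) formula on a toy input. Real pipelines use sampling approximations (e.g. the SHAP library); the `ade_score` function below is a made-up scoring function, not the paper's model.

```python
from itertools import combinations
from math import factorial

# Exact Shapley value of each token: its average marginal contribution to a
# scalar prediction score, weighted over all subsets of the other tokens.
def shapley_values(tokens, score):
    n = len(tokens)
    values = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        phi = 0.0
        for k in range(n):
            for subset in combinations(others, k):
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                with_i = score([tokens[j] for j in sorted(subset + (i,))])
                without_i = score([tokens[j] for j in subset])
                phi += weight * (with_i - without_i)
        values.append(phi)
    return values

# Hypothetical score: how strongly the tokens trigger an "adverse drug
# event" prediction (additive here, so Shapley recovers the coefficients).
def ade_score(tokens):
    return 1.0 * ("rash" in tokens) + 0.5 * ("penicillin" in tokens)

tokens = ["patient", "developed", "rash", "after", "penicillin"]
print(dict(zip(tokens, shapley_values(tokens, ade_score))))
```

For this additive score, "rash" receives approximately 1.0, "penicillin" 0.5, and all other tokens 0, and the values sum to the full-input score (the efficiency property) -- which is what makes per-token attributions interpretable for clinicians.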