Biomedical Text Normalization through Generative Modeling
Abstract
Objective
Around 80% of electronic health record (EHR) data consists of unstructured free-text medical language. By its nature, this text is flexible and inconsistent, which makes it difficult to use for clinical trial matching, decision support, and predictive modeling. In this study, we develop and assess text normalization pipelines built with large language models (LLMs).
Materials and Methods
We evaluated four LLM-based normalization strategies: Zero-Shot Recall, Prompt Recall, Semantic Search, and Retrieval-Augmented Generation (RAG). We also included one baseline, TF-IDF-based string matching. We compared normalization performance on two datasets of condition terms mapped to SNOMED: one tailored to oncology and one covering a wide range of medical conditions. Additionally, we benchmarked our models on the TAC 2017 drug label annotations, which normalize terms to Medical Dictionary for Regulatory Activities (MedDRA) Preferred Terms.
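To make the RAG strategy concrete, the sketch below shows one plausible shape for such a pipeline: an embedding-based Semantic Search step retrieves candidate vocabulary terms, and an LLM then selects among them. It assumes a sentence-transformers encoder and the OpenAI chat completions API; the model names, toy vocabulary, prompt wording, and top_k value are illustrative assumptions rather than the study's actual configuration.

```python
# A minimal sketch of a RAG normalization step, assuming a
# sentence-transformers encoder and the OpenAI chat completions API
# (requires OPENAI_API_KEY). Model names, the toy vocabulary, the prompt
# wording, and top_k are illustrative assumptions, not the study's
# actual configuration.
from openai import OpenAI
from sentence_transformers import SentenceTransformer, util

encoder = SentenceTransformer("all-MiniLM-L6-v2")
client = OpenAI()

# Toy terminology; in practice this would be the full SNOMED (or MedDRA)
# vocabulary.
vocabulary = [
    "Malignant neoplasm of breast",
    "Non-small cell lung carcinoma",
    "Diabetes mellitus type 2",
]
vocab_embeddings = encoder.encode(vocabulary, convert_to_tensor=True)

def normalize(raw_term: str, top_k: int = 3) -> str:
    """Map a free-text clinical term to a controlled-vocabulary term."""
    # Semantic Search step: retrieve the top_k closest vocabulary entries.
    query_embedding = encoder.encode(raw_term, convert_to_tensor=True)
    hits = util.semantic_search(query_embedding, vocab_embeddings, top_k=top_k)[0]
    candidates = [vocabulary[hit["corpus_id"]] for hit in hits]

    # Generation step: the LLM chooses among the retrieved candidates,
    # which constrains its output to terms that actually exist.
    prompt = (
        f"Normalize the clinical term '{raw_term}' to exactly one of the "
        "following candidate terms. Reply with the chosen term only:\n"
        + "\n".join(f"- {c}" for c in candidates)
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content.strip()

print(normalize("breast ca"))  # expected: "Malignant neoplasm of breast"
```

Constraining the generation step to retrieved candidates is what lets RAG combine the strengths of prompting and semantic retrieval, as reflected in the Results below.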
Results
RAG, which combines Prompt Recall with Semantic Search, was the most effective strategy, identifying the correct term 88.31% of the time on the oncology-specific dataset and 79.97% of the time on the broader dataset. Our model achieved a micro F1 score of 88.01 on task 4 of TAC 2017, surpassing all other models without relying on the provided training data.
Discussion
These findings demonstrate the potential of LLMs for medical text normalization. We find that retrieval-focused approaches overcome the limitations that LLMs have traditionally faced on this task.
Conclusion
Large language models combined with retrieval-augmented generation (RAG) should be explored further for normalizing biomedical free text.