Biomedical Text Normalization through Generative Modeling
Abstract
Objective
Around 80% of electronic health record (EHR) data consists of unstructured free-text medical language. By its nature, this text is flexible and inconsistent, which makes it difficult to use for clinical trial matching, decision support, and predictive modeling. In this study, we develop and assess text normalization pipelines built with large language models (LLMs).
Materials and Methods
We evaluated four LLM-based normalization strategies: Zero-Shot Recall, Prompt Recall, Semantic Search, and Retrieval-Augmented Generation (RAG). We also included one baseline, TF-IDF-based string matching. We compared normalization performance on two datasets of condition terms mapped to SNOMED: one tailored to oncology and one covering a wide range of medical conditions. Additionally, we benchmarked our models on the TAC 2017 drug label annotations, which normalize terms to Medical Dictionary for Regulatory Activities (MedDRA) Preferred Terms.
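To make the RAG strategy concrete, the sketch below shows one plausible shape for such a pipeline: an embedding-based Semantic Search step retrieves candidate vocabulary terms, and an LLM then selects among them. It assumes a sentence-transformers encoder and the OpenAI chat completions API; the model names, toy vocabulary, prompt wording, and top_k value are illustrative assumptions rather than the study's actual configuration.

```python
# A minimal sketch of a RAG normalization step, assuming a
# sentence-transformers encoder and the OpenAI chat completions API
# (requires OPENAI_API_KEY). Model names, the toy vocabulary, the prompt
# wording, and top_k are illustrative assumptions, not the study's
# actual configuration.
from openai import OpenAI
from sentence_transformers import SentenceTransformer, util

encoder = SentenceTransformer("all-MiniLM-L6-v2")
client = OpenAI()

# Toy terminology; in practice this would be the full SNOMED (or MedDRA)
# vocabulary.
vocabulary = [
    "Malignant neoplasm of breast",
    "Non-small cell lung carcinoma",
    "Diabetes mellitus type 2",
]
vocab_embeddings = encoder.encode(vocabulary, convert_to_tensor=True)

def normalize(raw_term: str, top_k: int = 3) -> str:
    """Map a free-text clinical term to a controlled-vocabulary term."""
    # Semantic Search step: retrieve the top_k closest vocabulary entries.
    query_embedding = encoder.encode(raw_term, convert_to_tensor=True)
    hits = util.semantic_search(query_embedding, vocab_embeddings, top_k=top_k)[0]
    candidates = [vocabulary[hit["corpus_id"]] for hit in hits]

    # Generation step: the LLM chooses among the retrieved candidates,
    # which constrains its output to terms that actually exist.
    prompt = (
        f"Normalize the clinical term '{raw_term}' to exactly one of the "
        "following candidate terms. Reply with the chosen term only:\n"
        + "\n".join(f"- {c}" for c in candidates)
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content.strip()

print(normalize("breast ca"))  # expected: "Malignant neoplasm of breast"
```

Constraining the generation step to retrieved candidates is what lets RAG combine the strengths of prompting and semantic retrieval, as reflected in the Results below.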
Results
RAG, which combines Prompt Recall with Semantic Search, was the most effective strategy, identifying the correct term 88.31% of the time on the oncology-specific dataset and 79.97% of the time on the broader dataset. Our model achieved a micro F1 score of 88.01 on task 4 of TAC 2017, surpassing all other models without relying on the provided training data.
Discussion
These findings demonstrate the potential of LLMs for medical text normalization. We find that retrieval-focused approaches overcome the limitations that LLMs have traditionally faced on this task.
Conclusion
Large language models combined with retrieval-augmented generation (RAG) should be explored further for normalizing biomedical free text.