A Framework for Extraction of Clinical Information from Radiological Mammography Reports Using Large Language Models and Retrieval Augmented Generation.

Abstract

Background: Text mining of radiological reports supports many tasks, chiefly analyzing and projecting trends to improve diagnostic accuracy. This is especially important for mammography reports, where early detection is crucial. Large language models (LLMs) offer a viable alternative to traditional state-of-the-art models because they can produce accurate results from a limited set of examples. Methods: This work presents a framework that combines a general retrieval-augmented generation (RAG) approach with an LLM to create a replicable model capable of structuring mammography radiology reports and extracting the relevant concepts associated with the findings. The resulting model is applied to a real-world scenario using a dataset of mammography radiology reports provided by a hospital. These reports are written by radiologists as free text in Spanish. The framework is evaluated across several LLMs, and its results are compared with those of a conventional, specialized NER model based on BERT, using a dataset labeled by radiologists. Results: Several models were implemented and evaluated with the proposed LLM framework. In Named Entity Recognition (NER) tasks using GPT-4, the zero-shot scenario achieved an F1-score of 0.80, while the five-shot scenario reached an F1-score of 0.96. This is comparable to the context-specific NER-BERT model, which achieved an F1-score of 0.97. In Relation Extraction, a task for which no specialized model was available, the framework achieved an F1-score of 0.93. Conclusion: The results demonstrate that LLMs benefit from additional in-prompt examples and achieve results comparable to those of specialized models such as NER-BERT. Additionally, this study shows that a well-defined framework makes it possible to leverage the capabilities of LLMs effectively for specific purposes such as NER and Relation Extraction over clinical text and mammography reports.
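
As a rough illustration of the few-shot idea described in the abstract, the sketch below picks the labeled examples most similar to a new report, places them in a prompt, and scores predicted entities with an entity-level F1. It is not the authors' implementation: the function names, the entity labels (`FINDING`, `LOCATION`), the prompt wording, and the toy English reports are all hypothetical, and token-overlap (Jaccard) similarity is a simple stand-in for whatever retrieval method the framework actually uses. The call to an LLM is omitted.

```python
import re


def tokens(text):
    """Lowercase word tokens as a set (a crude stand-in for an embedding)."""
    return set(re.findall(r"\w+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity between two token sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def select_examples(report, labeled, k=5):
    """Return the k labeled examples most similar to the report."""
    query = tokens(report)
    ranked = sorted(
        labeled,
        key=lambda ex: jaccard(query, tokens(ex["text"])),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(report, examples):
    """Assemble a few-shot prompt; with no examples it is a zero-shot prompt."""
    lines = ["Extract the entities from the report as (label, text) pairs.", ""]
    for ex in examples:
        lines.append(f"Report: {ex['text']}")
        lines.append(f"Entities: {ex['entities']}")
        lines.append("")
    lines.append(f"Report: {report}")
    lines.append("Entities:")
    return "\n".join(lines)


def entity_f1(predicted, gold):
    """Entity-level F1 using exact match on (label, text) pairs."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


if __name__ == "__main__":
    # Hypothetical labeled pool; real reports in the study are in Spanish.
    labeled = [
        {"text": "Nodule in the left breast, upper outer quadrant.",
         "entities": [("FINDING", "Nodule"), ("LOCATION", "left breast")]},
        {"text": "Oval nodule in the right breast.",
         "entities": [("FINDING", "nodule"), ("LOCATION", "right breast")]},
    ]
    report = "Nodule in the right breast, lower inner quadrant."
    print(build_prompt(report, select_examples(report, labeled, k=2)))
```

Passing an empty example list to `build_prompt` gives the zero-shot variant, while `k=5` corresponds to the five-shot setting reported above. Exact-match scoring is one common convention for NER evaluation; the abstract does not say which matching criterion was used.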
