RAG-KED Summarization: A Framework for Knowledge-Augmented Article Summarization with Large Language Models
Abstract
This study introduces the RAG-KED framework, a novel approach designed to enhance the accuracy and relevance of scientific article summaries by integrating Retrieval-Augmented Generation (RAG) with Knowledge Example-Driven (KED) summarization. The primary objective is to address the limitations of traditional large language models (LLMs) in knowledge-intensive tasks, particularly in dynamic fields like biomedical research, where static training data quickly becomes outdated. The methodology leverages retrieval mechanisms to access current, domain-specific information and employs curated example analyses to guide the structure and content of summaries. Evaluation was conducted on the eLife dataset using advanced models, including Llama-3.2-90b, Llama-3.1-70b, and GPT-4o Mini. Key results demonstrate that models guided by sample summaries significantly outperform those without, as evidenced by higher ROUGE and BLEU scores. Specifically, Llama-3.2-90b achieves the highest performance among the tested models when guided by samples, while GPT-4o Mini excels across multiple metrics. The study concludes that the RAG-KED framework markedly improves summary quality, thereby enhancing the accessibility of complex scientific knowledge. These findings underscore the framework's potential to bridge critical gaps in domain-specific summarization, although its effectiveness hinges on the robustness of the retrieval mechanisms and the quality of the example summaries.