RAG-KED Summarization: A Framework for Knowledge-Augmented Article Summarization with Large Language Models

Abstract

This study introduces the RAG-KED framework, a novel approach designed to enhance the accuracy and relevance of scientific article summaries by integrating Retrieval-Augmented Generation (RAG) and Knowledge Example-Driven (KED) summarization. The primary objective is to address the limitations of traditional large language models (LLMs) in knowledge-intensive tasks, particularly in dynamic fields like biomedical research, where static training data quickly becomes outdated. The methodology leverages retrieval mechanisms to access current, domain-specific information and employs curated example analyses to guide the structure and content of summaries. Evaluation was conducted on the eLife dataset using advanced models including Llama-3.2-90b, Llama-3.1-70b, and GPT-4o Mini. Key results demonstrate that models incorporating sample summaries significantly outperform those without, as evidenced by higher ROUGE and BLEU scores. Specifically, Llama-3.2-90b achieves the highest performance among the tested models when guided by samples, while GPT-4o Mini excels across multiple metrics. The study concludes that the RAG-KED framework markedly improves summary quality, thereby enhancing the accessibility of complex scientific knowledge. These findings underscore the framework's potential to bridge critical gaps in domain-specific summarization, although its effectiveness hinges on the robustness of the retrieval mechanisms and the quality of the example summaries.
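The abstract describes combining two inputs at summarization time: passages retrieved from a current corpus (RAG) and a curated example summary that guides structure (KED). The sketch below illustrates that prompt-assembly idea only; all names are hypothetical, the paper's actual retrieval and prompting details are not specified here, and word-overlap scoring stands in for a real vector index and LLM call.

```python
# Hypothetical sketch of RAG-KED-style prompt assembly (not the paper's
# actual implementation). Retrieval is approximated with bag-of-words
# overlap; a production system would use a dense retriever and an LLM.

def retrieve(query, corpus, k=2):
    """Rank corpus passages by word overlap with the query, keep top k."""
    query_words = set(query.lower().split())
    scored = sorted(
        corpus,
        key=lambda passage: len(query_words & set(passage.lower().split())),
        reverse=True,
    )
    return scored[:k]

def build_prompt(article, corpus, example_summary):
    """Combine retrieved context (RAG) with a curated example (KED)."""
    context = "\n".join(retrieve(article, corpus))
    return (
        "Background knowledge:\n" + context + "\n\n"
        "Example summary to imitate:\n" + example_summary + "\n\n"
        "Article:\n" + article + "\n\nSummary:"
    )

corpus = [
    "CRISPR gene editing enables precise genome modification.",
    "Transformer models dominate natural language processing.",
    "Protein folding prediction advanced with deep learning.",
]
prompt = build_prompt(
    "A study of CRISPR gene editing in human cells.",
    corpus,
    "Researchers showed that ... (plain-language summary).",
)
```

The resulting prompt places current background knowledge and a style-guiding example ahead of the article, which is the mechanism the abstract credits for the improved ROUGE and BLEU scores.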