Retrieval-Augmented Generation

Abstract

Retrieval-augmented generation (RAG) is a hybrid architecture that combines the generative power of large language models (LLMs) with the factual reliability of information retrieval systems. Although LLMs have significantly improved performance on natural language understanding and generation tasks, they often suffer from information distortion, outdated knowledge, and a lack of transparency. RAG addresses these limitations by introducing an external retrieval mechanism into the generation process. RAG systems follow a retrieve-then-generate paradigm: relevant documents are retrieved from knowledge sources and supplied as input to the language model, enabling it to produce more accurate, well-grounded, and up-to-date responses. RAG has become a foundational technique for knowledge-intensive natural language processing (NLP) and LLM applications. In this review, we describe the basic architecture of RAG systems, analyze key components such as retrievers and generators, compare mainstream implementations, and evaluate their performance across a range of tasks. We also discuss challenges in the RAG pipeline, including latency, hallucination, context filtering, and knowledge freshness. Finally, we highlight future research directions in scalability, personalization, and integration with structured knowledge sources.
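
To make the retrieve-then-generate paradigm concrete, the following minimal Python sketch ranks documents with a simple bag-of-words cosine similarity and assembles a context-augmented prompt. The corpus, query, and generate() stub are illustrative assumptions rather than the implementation of any system surveyed here; a real pipeline would use a learned dense retriever and an actual LLM in place of these stand-ins.

    # Minimal sketch of retrieve-then-generate (illustrative, not from the paper).
    from collections import Counter
    import math

    # Hypothetical toy knowledge source; real systems index large document stores.
    corpus = [
        "RAG combines a retriever with a generative language model.",
        "Dense retrievers embed queries and documents in a shared vector space.",
        "Hallucination occurs when a model generates unsupported claims.",
    ]

    def bow(text):
        # Bag-of-words term frequencies: a crude stand-in for a learned embedding.
        return Counter(text.lower().split())

    def cosine(a, b):
        # Cosine similarity between two sparse term-frequency vectors.
        dot = sum(a[t] * b.get(t, 0) for t in a)
        norm = math.sqrt(sum(v * v for v in a.values())) * \
               math.sqrt(sum(v * v for v in b.values()))
        return dot / norm if norm else 0.0

    def retrieve(query, k=2):
        # Step 1 (retrieve): rank documents by similarity to the query, keep top k.
        q = bow(query)
        return sorted(corpus, key=lambda d: cosine(q, bow(d)), reverse=True)[:k]

    def generate(query, context):
        # Step 2 (generate): condition generation on the retrieved context.
        # The LLM call is stubbed out; a real system would send this prompt to a model.
        return "Context:\n" + "\n".join(context) + f"\n\nQuestion: {query}\nAnswer:"

    query = "What does RAG combine?"
    print(generate(query, retrieve(query)))

The key design point the sketch illustrates is the division of labor: the retriever filters the knowledge source down to a small, query-relevant context, and the generator conditions on that context rather than relying solely on parametric memory, which is what mitigates outdated or distorted information.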