Domain-Partitioned Retrieval as a Hallucination Mitigation Strategy in Conversational RAG: The PC-RAG Architecture

Ruben Alejandro Jaime

Abstract

Hallucination in Retrieval-Augmented Generation (RAG) systems is commonly attributed to generative model limitations. We argue that the primary cause in enterprise document retrieval is context window contamination: single-pass retrieval aggregates chunks from semantically heterogeneous domains, forcing the synthesis model to reason over a noisy cross-domain context and producing coherent but factually unstable responses. We present PC-RAG (Pipeline-Conversational RAG), an architecture built on domain-partitioned retrieval: complex queries are decomposed into domain-isolated sub-tasks, each serviced by an independent retrieval pass. Within each partition, intra-domain multi-query expansion generates k=6 semantic variants, maximizing coverage without cross-domain interference. Evaluation on a 1,200-document production deployment (HR, legal, compliance) shows that domain partitioning alone reduces false-positive chunk inclusion by 51% and raises human-rated synthesis quality from 3.1 to 4.1 on a 5-point scale (two annotators, κ=0.79, Wilcoxon p<0.001). The full PC-RAG system achieves a synthesis score of 4.2, outperforming HyDE [Gao et al., 2023] (3.6), standard multi-query retrieval (3.7), and RAPTOR [Sarthi et al., 2024] (3.9). RAGAS automated evaluation corroborates human judgements (r=0.83, p<0.001). By contrast, upgrading the synthesis model alone yields only marginal gains (3.1 to 3.4, p=0.031), confirming the bottleneck is retrieval architecture, not model capacity. The pipeline further integrates conversational coreference resolution (97.4% accuracy) and a semantic reranker; median end-to-end latency is 90 s, with primary contributions accounting for only 12% of this cost.
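
The retrieval flow described in the abstract (decompose the query into domain-isolated sub-tasks, expand each into k=6 variants, retrieve only within that domain) can be illustrated with a toy sketch. This is not the authors' implementation. The corpus, the keyword router, the template-based expansion, and the token-overlap scorer are all hypothetical stand-ins for what would presumably be an LLM-based decomposer, LLM-generated variants, and a vector search in a real system. A single-pass baseline is included only to show how a cross-domain chunk can slip into the context.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    domain: str
    text: str


# Hypothetical toy corpus; the paper's deployment has 1,200 documents.
CORPUS = [
    Chunk("hr", "Employees accrue 20 days of paid vacation per year."),
    Chunk("hr", "Parental leave lasts 16 weeks."),
    Chunk("legal", "Contracts require a 30 day termination notice."),
    Chunk("compliance", "Vacation records are retained for audit for 5 years."),
]

# Hypothetical keyword router standing in for a learned or LLM-based one.
DOMAIN_KEYWORDS = {
    "hr": {"vacation", "leave", "salary"},
    "legal": {"contract", "termination", "notice"},
    "compliance": {"audit", "retention", "records"},
}

# Hypothetical templates standing in for LLM-generated semantic variants.
TEMPLATES = [
    "{q}",
    "policy on {q}",
    "rules on {q}",
    "guidelines about {q}",
    "official document: {q}",
    "what applies to {q}",
]


def tokens(text: str) -> set[str]:
    return set(re.findall(r"[a-z]+", text.lower()))


def decompose(query: str) -> list[tuple[str, str]]:
    """Split on 'and' and route each clause to its best-matching domain."""
    subtasks = []
    for clause in re.split(r"\band\b", query, flags=re.IGNORECASE):
        words = tokens(clause)
        overlap = {d: len(words & kws) for d, kws in DOMAIN_KEYWORDS.items()}
        best = max(overlap, key=overlap.get)
        if overlap[best] > 0:  # drop clauses that match no domain
            subtasks.append((best, clause.strip()))
    return subtasks


def expand(sub_query: str, k: int = 6) -> list[str]:
    """Produce k variants of one sub-query (k=6 as in the abstract)."""
    return [t.format(q=sub_query) for t in TEMPLATES[:k]]


def rank(chunks: list[Chunk], queries: list[str], top_n: int) -> list[Chunk]:
    """Score each chunk by its best token overlap with any query variant."""
    scored = []
    for chunk in chunks:
        score = max(len(tokens(q) & tokens(chunk.text)) for q in queries)
        if score > 0:
            scored.append((score, chunk))
    scored.sort(key=lambda pair: -pair[0])
    return [chunk for _, chunk in scored[:top_n]]


def single_pass(query: str, top_n: int = 3) -> list[Chunk]:
    """Baseline: one retrieval pass over the whole mixed-domain corpus."""
    return rank(CORPUS, [query], top_n)


def partitioned(query: str, top_n: int = 2) -> dict[str, list[Chunk]]:
    """One independent retrieval pass per domain-isolated sub-task."""
    contexts: dict[str, list[Chunk]] = {}
    for domain, sub_query in decompose(query):
        pool = [c for c in CORPUS if c.domain == domain]
        bucket = contexts.setdefault(domain, [])
        for chunk in rank(pool, expand(sub_query), top_n):
            if chunk not in bucket:
                bucket.append(chunk)
    return contexts


if __name__ == "__main__":
    q = ("How many vacation days do I get and what notice "
         "does contract termination require?")
    print("single pass:", [c.domain for c in single_pass(q)])
    print("partitioned:", {d: [c.text for c in cs]
                           for d, cs in partitioned(q).items()})
```

With this sample query, the single-pass baseline admits the compliance chunk because it happens to mention "vacation", whereas the partitioned path never searches the compliance pool since no sub-task was routed there. The sketch keeps each domain's context in its own bucket; how PC-RAG passes those contexts to the synthesis model is not specified in the abstract.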

Version published to 10.21203/rs.3.rs-9334334/v1 on Research Square
Apr 16, 2026

Related articles

Retrieval-Augmented Large Language Model Agents for Automated Scientific Literature Review Generation

This article has 6 authors:
1. Ruotong Wang
2. Nyutian Long
3. Shunqi Liu
4. Yuxi Wang
5. Zhen Qi
6. Huajun Zhang
This article has no evaluations. Latest version Apr 6, 2026

How Can Hallucinatory Biases Be Effectively Audited and Mitigated in Vision-Language Models? A Unified Theoretical and Empirical Framework Across GPT-4o, Grok 3, and Claude Sonnet 4.5

This article has 1 author:
1. Amirali Ghajari
This article has no evaluations. Latest version Apr 8, 2026

reEtym: A Natively Feature-Disentangled Transformer for Interpretability

This article has 1 author:
1. Hongyu Shi
This article has no evaluations. Latest version Apr 15, 2026
