From Inference-Time Routing to Ingestion-Time Graphs: Referential Discovery, Actor-Agent Parallelism, and Formal Completeness Guarantees for Deterministic Multi-Hop RAG
Abstract
The document dependency graph is a property of the corpus, not of the query. This observation, simple in retrospect yet absent from the literature, is the foundation of this paper. Because the dependency structure between documents exists independently of any query, it can be pre-computed at ingestion time. Once pre-computed, multi-hop retrieval becomes a deterministic graph traversal with formal completeness guarantees, eliminating the need for autonomous LLM agents to make routing decisions at inference time. This paper introduces Referential Discovery, an incremental ingestion pattern that pre-computes cross-document chunk references via a two-pass procedure and a pending-queue mechanism for forward references, reducing multi-hop resolution from O(b^D · log N_idx) to O(|V_c|) at inference time. Referential Discovery enables Controlled Multi-Hop RAG, a deterministic, auditable four-stage pipeline implementing a parallel multi-agent architecture in which each Ray actor operates as a formally defined Actor-Agent [29] with plan-based coordination, achieving full dependency-chain traversal without delegating routing to an autonomous LLM loop. Infrastructure homogeneity is preserved throughout: the dependency graph is stored in PostgreSQL, the same persistence layer already required by the base system, introducing zero additional infrastructure. A multi-tenant production evaluation across six independent HR technology organizations, comprising 5,169 real-world curriculum vitae documents and 500 concurrent queries at three complexity levels, demonstrates full dependency-chain traversal in under 62 minutes wall-clock time (mean t̄ < 7.44 s/query), with deterministic behavior and full step-level auditability.
Domain practitioner evaluation by professional recruiters at each tenant organization confirmed zero hallucinations across all 500 responses, a result explained not by measurement luck but by architectural construction: if the dependency graph is complete, retrieval is complete; if retrieval is complete, hallucination is architecturally impossible. Consistent results across six independent corpora demonstrate cross-tenant generalizability. Prompt injection, unbounded tool execution, and EU AI Act non-compliance, all structural risks of autonomous agentic systems, are eliminated by design.
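The ingestion-time idea in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the class name `ReferenceGraph`, its methods, and the in-memory dictionaries are hypothetical stand-ins (the paper stores the graph in PostgreSQL), and the way the two passes are split here is an assumption, since the abstract does not spell them out. The sketch shows only how a pending queue could resolve forward references at ingestion, and how inference-time retrieval then reduces to a deterministic traversal that touches each reachable chunk once.

```python
from collections import defaultdict, deque


class ReferenceGraph:
    """Hypothetical in-memory stand-in for the pre-computed dependency graph."""

    def __init__(self):
        self.chunks = {}                  # chunk_id -> chunk text
        self.edges = defaultdict(set)     # chunk_id -> ids of chunks it references
        self.pending = defaultdict(set)   # missing target id -> ids of waiting sources

    def ingest(self, chunk_id, text, refs):
        # Assumed pass 1: store the chunk and link references whose targets
        # are already in the corpus.
        self.chunks[chunk_id] = text
        for target in refs:
            if target in self.chunks:
                self.edges[chunk_id].add(target)
            else:
                # Forward reference: park it until the target is ingested.
                self.pending[target].add(chunk_id)
        # Assumed pass 2: resolve earlier forward references that pointed
        # at the chunk that has just arrived.
        for source in self.pending.pop(chunk_id, set()):
            self.edges[source].add(chunk_id)

    def traverse(self, seeds):
        # Deterministic breadth-first walk over pre-computed edges.
        # Sorting makes the visit order reproducible across runs.
        seen = set(seeds)
        queue = deque(sorted(seeds))
        order = []
        while queue:
            node = queue.popleft()
            order.append(node)
            for nxt in sorted(self.edges[node]):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        return order


graph = ReferenceGraph()
graph.ingest("cv-1", "summary chunk", ["cv-2"])   # cv-2 not yet ingested
graph.ingest("cv-2", "experience chunk", ["cv-3"])
graph.ingest("cv-3", "certification chunk", [])
chain = graph.traverse(["cv-1"])
```

In this sketch the traversal makes no routing decision at query time: the visit order depends only on the stored edges and the seed chunks, which is the property the abstract attributes to pre-computing the graph at ingestion.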