RCT-MARS: When Per-Query Retrieval Routing Fails, and What It Takes to Succeed

Ankit Srivastava

Abstract

An oracle that selects the best retrieval paradigm per query substantially outperforms any fixed method, yet learned routers consistently fail to beat strong baselines: the routing paradox. We study this paradox across five retrieval paradigms (BM25, dense embeddings, knowledge-graph retrieval, agentic multi-step search, and cross-encoder reranking) and six benchmarks. Hard routing (discrete paradigm selection) fails severely: the best hard router (XGBoost-Direct, 0.387) falls 7.8 pp below the best fixed paradigm (Reranker, 0.465) despite an oracle ceiling of 0.599, closing \(-57.9\%\) of the oracle gap; that is, it moves away from the oracle rather than toward it. We then show what it takes to succeed: Corpus-Aware Soft Routing (CASR) replaces discrete selection with learned per-query fusion weights via XGBoost multi-output regression and temperature-scaled softmax (\(\tau = 0.05\)). CASR achieves 0.487 nDCG@10, significantly outperforming Dense (\(+4.9\) pp, \(p = 0.002\)), numerically exceeding the best fixed paradigm (Reranker, \(+2.2\) pp, \(p = 0.112\)), and matching unsupervised fusion (RRF, 0.482; \(p = 0.755\)). It closes \(+16.4\%\) of the oracle gap versus \(-57.9\%\) for hard routing, a 74 percentage-point swing (\(+16.4 - (-57.9)\)). To diagnose why hard routing fails and soft routing succeeds, RCT-MARS (Retrieval Complexity Taxonomy through Multi-paradigm Algorithm Routing and Selection) constructs performance-signature vectors, clusters them to discover three stable complexity classes (ARI \(= 0.773\), 95% CI \([0.529, 0.991]\)), and uses the taxonomy as a diagnostic lens. The key insight is that discrete paradigm selection requires precise information that query features alone cannot provide, whereas soft routing hedges across paradigms, tolerating prediction uncertainty while exploiting paradigm complementarity.
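
The two quantities the abstract relies on, the share of the oracle gap closed and the temperature-scaled softmax fusion weights, can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names (`gap_closed`, `softmax_weights`, `fuse`), the example predicted scores, and the choice to fuse by a weighted sum of normalized per-paradigm document scores are all assumptions made for illustration; the abstract does not say how CASR combines paradigm outputs once the weights are known.

```python
import math


def gap_closed(score, best_fixed, oracle):
    """Percentage of the gap between the best fixed paradigm and the oracle
    that a method closes. Negative values mean the method scores below the
    best fixed paradigm."""
    return 100.0 * (score - best_fixed) / (oracle - best_fixed)


def softmax_weights(predicted_scores, tau=0.05):
    """Temperature-scaled softmax over per-paradigm predicted scores.
    A small tau concentrates weight on the top paradigm; a large tau
    approaches uniform fusion."""
    top = max(predicted_scores)  # subtract the max for numerical stability
    exps = [math.exp((s - top) / tau) for s in predicted_scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(doc_scores_by_paradigm, weights):
    """ASSUMPTION: fuse by a weighted sum of per-paradigm document scores
    (taken to be normalized already). Returns doc ids, best first."""
    fused = {}
    for paradigm_scores, weight in zip(doc_scores_by_paradigm, weights):
        for doc_id, score in paradigm_scores.items():
            fused[doc_id] = fused.get(doc_id, 0.0) + weight * score
    return sorted(fused, key=fused.get, reverse=True)


# Scores reported in the abstract (rounded to three decimals).
casr_gap = round(gap_closed(0.487, 0.465, 0.599), 1)  # 16.4
hard_gap = round(gap_closed(0.387, 0.465, 0.599), 1)  # -58.2

# Hypothetical per-paradigm predictions for a single query.
weights = softmax_weights([0.40, 0.45, 0.30], tau=0.05)
```

With the rounded scores, the CASR figure reproduces the reported \(+16.4\%\). The hard-routing figure comes out at about \(-58.2\%\) rather than the reported \(-57.9\%\), presumably because the abstract's percentages were computed from unrounded scores. Because `softmax_weights` never assigns exactly zero weight to a paradigm, a mistaken prediction shifts the fused ranking gradually instead of committing the whole query to one paradigm, which is the hedging behavior the abstract contrasts with hard routing.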

Version published to 10.21203/rs.3.rs-9132181/v1 on Research Square
Apr 7, 2026

Related articles

Dimension-Direct Routing: Achieving 25% Depth Improvement in Multi-Model LLM Systems via Explicit Capability Factorization

This article has 1 author:
1. Tao Rui
This article has no evaluations
Latest version Apr 7, 2026

ConsultChain: Progressive Context Distillation Across Heterogeneous LLM Fleets for Token-Optimal Inference

This article has 1 author:
1. Samuel Edusa
This article has no evaluations
Latest version Apr 13, 2026

From Inference-Time Routing to Ingestion-Time Graphs: Referential Discovery, Actor-Agent Parallelism, and Formal Completeness Guarantees for Deterministic Multi-Hop RAG

This article has 1 author:
1. Ruben Jaime
This article has no evaluations
Latest version Apr 16, 2026
